Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Pandas validate date format

Is there any nice way to validate that all items in a dataframe's column have a valid date format?

My date format is 11-Aug-2010.

I saw this generic answer, where:

import datetime

date_text = '2010-08-11'  # example input
try:
    datetime.datetime.strptime(date_text, '%Y-%m-%d')
except ValueError:
    raise ValueError("Incorrect data format, should be YYYY-MM-DD")

source: https://stackoverflow.com/a/16870699/1374488

But I assume that checking one string at a time like this is not efficient for a whole column.

I assume I have to convert the strings to pandas datetimes first, as mentioned here: Convert string date time to pandas datetime

I am new to the Python world, any ideas appreciated.

1 Answer

(format borrowed from piRSquared's answer)

import pandas as pd

if pd.to_datetime(df['date'], format='%d-%b-%Y', errors='coerce').notnull().all():
    pass  # do something: every value parsed as a date

This is the LBYL ("Look Before You Leap") approach. The condition is True only if every date string is valid, meaning each one was converted into an actual pd.Timestamp object. Invalid date strings are coerced to NaT, which is the datetime equivalent of NaN. Note that values that were already missing also come out as NaT, so they fail this check too.
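If you also want to know which values failed, the same coerced result can be reused as a mask. This is only a sketch: the sample data below is made up for illustration, and the names parsed, invalid_mask, invalid_rows and all_valid are not from the answer above.

```python
import pandas as pd

# Hypothetical sample data; the column name 'date' follows the answer above.
df = pd.DataFrame(
    {"date": ["11-Aug-2010", "31-Feb-2010", "2010-08-11", "01-Jan-2011"]}
)

# Strings that do not match the format (or are not real dates) become NaT.
parsed = pd.to_datetime(df["date"], format="%d-%b-%Y", errors="coerce")

# True where parsing failed.
invalid_mask = parsed.isna()

# The original strings that could not be parsed.
invalid_rows = df.loc[invalid_mask, "date"].tolist()

# Same test as the one-liner above, expressed through the mask.
all_valid = not invalid_mask.any()
```

Here invalid_rows holds the impossible date and the string in the wrong format, which is usually more helpful to report than a bare True/False.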

Alternatively,

try:
    pd.to_datetime(df['date'], format='%d-%b-%Y', errors='raise')
    # do something: every value parsed
except ValueError:
    pass  # at least one value did not match the format

This is the EAFP ("Easier to Ask Forgiveness than Permission") approach: a ValueError is raised when an invalid date string is encountered.
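The EAFP version can be wrapped in a small helper so that the caller gets the parsed column back on success. A minimal sketch, assuming you want a clearer error message; validate_dates is a hypothetical name, not a pandas function.

```python
import pandas as pd


def validate_dates(series, fmt="%d-%b-%Y"):
    """Return the parsed Series, or raise ValueError if any value is invalid.

    Hypothetical helper for illustration; not part of pandas.
    """
    try:
        return pd.to_datetime(series, format=fmt, errors="raise")
    except ValueError as exc:
        # Re-raise with a message that names the expected format.
        raise ValueError(
            f"Column contains values that do not match {fmt!r}"
        ) from exc
```

Usage would look like df['date'] = validate_dates(df['date']), which validates and converts in one step instead of parsing the column twice.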

