I have a DataFrame like this:
UNIT EXITSn_hourly Interval
1867 R081 104 00:00:00-04:00:00
1868 R081 0 04:00:00-04:00:00
1869 R081 129 04:00:00-08:00:00
1870 R081 521 08:00:00-12:00:00
1871 R081 1048 12:00:00-16:00:00
2838 R032 38 00:00:00-04:00:00
2839 R032 0 04:00:00-04:00:00
2840 R032 89 04:00:00-08:00:00
2841 R032 470 08:00:00-12:00:00
I need to delete the entire row when Interval has this particular format, where the start and end times are the same:
1868 R081 0 04:00:00-04:00:00
I want to remove not only 04:00:00-04:00:00
but also similar values such as
01:00:00-01:00:00
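To make the requirement concrete, here is a minimal sketch of one way this could be done, assuming Interval is always a string of the form start-end with a single "-" separator. The sample rows are adapted from the frame above (the 01:00:00-01:00:00 row is made up for illustration), and the names bounds, same_time and filtered are hypothetical.

```python
import pandas as pd

# Small sample adapted from the frame above; the last row is made up.
df = pd.DataFrame(
    {
        "UNIT": ["R081", "R081", "R081", "R032", "R032"],
        "EXITSn_hourly": [104, 0, 129, 38, 0],
        "Interval": [
            "00:00:00-04:00:00",
            "04:00:00-04:00:00",
            "04:00:00-08:00:00",
            "00:00:00-04:00:00",
            "01:00:00-01:00:00",
        ],
    },
    index=[1867, 1868, 1869, 2838, 2839],
)

# Split "start-end" into two columns (0 = start, 1 = end).
bounds = df["Interval"].str.split("-", expand=True)

# True where the interval starts and ends at the same time.
same_time = bounds[0] == bounds[1]

# Keep only the rows whose interval actually spans some time.
filtered = df[~same_time]
print(filtered)
```

Because the comparison is between the two halves of the string, it should catch any repeated time, not just 04:00:00.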
Actually, this is my original df, from which I created the Interval column:
C/A UNIT SCP DATEn TIMEn DESCn ENTRIESn EXITSn
0 A002 R051 02-00-00 06-29-13 00:00:00 REGULAR 4174592 1433672
1 A002 R051 02-00-00 06-29-13 04:00:00 REGULAR 4174628 1433675
2 A002 R051 02-00-00 06-29-13 08:00:00 REGULAR 4174641 1433706
3 A002 R051 02-00-00 06-29-13 12:00:00 REGULAR 4174741 1433775
4 A002 R051 02-00-00 06-29-13 16:00:00 REGULAR 4174936 1433826
5 A002 R051 02-00-00 06-29-13 20:00:00 REGULAR 4175270 1433877
6 A002 R051 02-00-00 06-30-13 00:00:00 REGULAR 4175403 1433908
7 A002 R051 02-00-00 06-30-13 04:00:00 REGULAR 4175441 1433914
8 A002 R051 02-00-00 06-30-13 08:00:00 REGULAR 4175457 1433928
9 A002 R051 02-00-00 06-30-13 12:00:00 REGULAR 4175520 1433981
I created Interval using this code:
import copy
df = copy.deepcopy(turnstile_data)
# pdf holds the previous row's values, aligned to the current row
pdf = df.shift(periods=1)
df['ENTRIESn_hourly'] = df['ENTRIESn'] - pdf['ENTRIESn'].fillna(0)
df['EXITSn_hourly'] = df['EXITSn'] - pdf['EXITSn'].fillna(0)
# Interval is "previous reading time-current reading time"
df['Interval'] = pdf['TIMEn']+'-'+ df['TIMEn'].fillna(0)
df.loc[(df['ENTRIESn'] == 0), 'ENTRIESn_hourly'] = 0
df.loc[(df['EXITSn'] == 0), 'EXITSn_hourly'] = 0
# reset the first row of each turnstile (C/A, UNIT, SCP changed), then drop it
df.loc[(df['C/A'] != pdf['C/A']) | (df['UNIT'] != pdf['UNIT']) | (df['SCP'] != pdf['SCP']), ['ENTRIESn_hourly', 'EXITSn_hourly','Interval']] = 0
df = df[df.Interval != 0]
print(df.head(20))
head7=copy.deepcopy(df)
required_df=head7[['UNIT','EXITSn_hourly','Interval']].groupby(head7.UNIT)
print(required_df.head(5))
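Since Interval is built from the previous and current TIMEn, the repeated-time rows could also be caught at that step, before the string is assembled. The following is only a sketch under that assumption, using a tiny made-up stand-in for turnstile_data; prev_time, keep and result are hypothetical names.

```python
import pandas as pd

# Made-up stand-in for turnstile_data with one repeated reading time.
turnstile_data = pd.DataFrame(
    {
        "UNIT": ["R051", "R051", "R051", "R051"],
        "TIMEn": ["00:00:00", "04:00:00", "04:00:00", "08:00:00"],
        "EXITSn": [1433672, 1433675, 1433675, 1433706],
    }
)

df = turnstile_data.copy()

# Time of the previous reading, aligned to the current row.
prev_time = df["TIMEn"].shift(1)
df["Interval"] = prev_time + "-" + df["TIMEn"]

# Keep rows that have a previous reading and whose time actually advanced.
keep = prev_time.notna() & (prev_time != df["TIMEn"])
result = df[keep]
print(result)
```

This sketch ignores the C/A, UNIT and SCP boundary handling from the code above; it only shows where the start/end comparison could sit.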