I have 2 dataframes df1 and df2 of different size.
df1 = pd.DataFrame({'A':[np.nan, np.nan, np.nan, 'AAA','SSS','DDD'], 'B':[np.nan,np.nan,'ciao',np.nan,np.nan,np.nan]})
df2 = pd.DataFrame({'C':[np.nan, np.nan, np.nan, 'SSS','FFF','KKK','AAA'], 'D':[np.nan,np.nan,np.nan,1,np.nan,np.nan,np.nan]})
My goal is to identify the elements of df1 which do not appear in df2.
I was able to achieve my goal using the following lines of code.
df = pd.DataFrame({})
for i, row1 in df1.iterrows():
found = False
for j, row2, in df2.iterrows():
if row1['A']==row2['C']:
found = True
print(row1.to_frame().T)
if found==False and pd.isnull(row1['A'])==False:
df = pd.concat([df, row1.to_frame().T], axis=0)
df.reset_index(drop=True)
Is there a more elegant and efficient way to achieve my goal?
Note: the solution is
A B
0 DDD NaN
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…