Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
905 views
in Technique[技术] by (71.8m points)

python - How to cross checking 2 pandas dataframes file and use 1 dataframe's value as a variable?

I have 2 pandas dataframes:

modal2:

   Mode month1  month2  month3  month4  month5  month6  month7  month8  month9  month10 month11 month12
0   100 0   0   0   0   0   0   0   0   0   0   0   0
2   602 0   2   1   0   2   1   0   2   1   0   2   1
3   603 1   0   2   1   0   2   1   0   2   1   0   2
11  802 0   11  10  9   8   7   6   5   4   3   2   1

df_ia:

      RevalMonth_plus  Mode
0     1                602
35    1                100
52    4                100
79    1                802 
94    4                603
95    4                603
96    4                603 
98    1                100 

the logic is compare each row of df_ia to each row of modal2,if row of df_ia has the same Mode number as row of modal2 , then add the column month1 of modal2 to df_ia.

I can successfully complete this part by the following code:

 for index, row in modal2.iterrows():
        conditions.append(df_ia['Mode'] == row['Mode'])


    for col in ['month1']: #for loop list is because sometime need more than 1 column  
        col_v = modal2[col]
        df_ia[col] = np.select(conditions, col_v, default=None)

the output is:

      RevalMonth_plus  Mode  month1
0     1                602   0
35    1                100   0
52    4                100   0
79    1                802   0 
94    4                603   1
95    4                603   1
96    4                603   1 
98    1                100   0  

But the challenge part is:

how can I replace 'month1' to the corresponding 'RevalMonth_plus' from df_ia, just like this:

for col in [f'month{df_ia['RevalMonth_plus']}']: #for loop list is because sometime need more than 1 column  
    col_v = modal2[col]
    df_ia[col] = np.select(conditions, col_v, default=None)

The ideal output should like:

      RevalMonth_plus  Mode  dynamic_moth_value
0     1                602   0                  #month1
35    1                100   0                  #month1
52    4                100   0                  #month4
79    1                802   0                  #month1
94    4                603   1                  #month4
95    4                603   1                  #month4
96    4                603   1                  #month4
98    1                100   0                  #month1

I have spent a whole day on it.But cannot make it.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Another solution:

x = pd.merge(df_ia, modal2, on="Mode", how="left")
x["dynamic_month_value"] = x.apply(
    lambda x: x["month" + str(x["RevalMonth_plus"])], axis=1
)
print(x[["RevalMonth_plus", "Mode", "dynamic_month_value"]])

Prints:

   RevalMonth_plus  Mode  dynamic_month_value
0                1   602                    0
1                1   100                    0
2                4   100                    0
3                1   802                    0
4                4   603                    1
5                4   603                    1
6                4   603                    1
7                1   100                    0

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...