Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.0k views
in Technique[技术] by (71.8m points)

python - Pandas find values from another data frame

I have two data frames that i would like to combine. The first one is ~3000 rows long has 3 columns: source and a target and a volume containing multiple source and target combinations:

Source Target volume
source 1 target 1 3
source 2 target 2 1

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)
df
    Source  Target  volume
0   A       B       3
1   B       C       1
2   C       A       3
3   B       A       1

As all the calculations has to be done independent of Source, Target. So, for calculations, it is better to take it as a same column. So, concatenating vertically Source-Volume table and Target-Volume table

temp_df = pd.concat([df[['Source', 'volume']].rename(columns={'Source': 'Source_Target'}),
                     df[['Target', 'volume']].rename(columns={'Target': 'Source_Target'})], ignore_index=True)
temp_df
  Source_Target volume
0      A            3
1      B            1
2      C            3
3      B            1
4      B            3
5      C            1
6      A            3
7      A            1

Now, you can get all the calulations using groupby and agg

temp_df = temp_df.groupby('Source_Target')['volume'].agg(
    count = 'count',
    min = 'min',
    mean = 'mean',
    max = 'max'
)
temp_df

enter image description here

Finally, merge df with the temp_df based on Source and then based on Target.

(df
 .drop('volume', axis=1)
 .merge(temp_df, left_on='Source', right_on='Source_Target', how='left') # 'left' to retain order
 .reindex(['Source', 'count', 'min', 'mean', 'max', 'Target'], axis=1)
 .rename(columns={'count': 'Source count', 'min': 'S .min', 'max': 'S .max', 'mean': 'S .mean'})
 .merge(temp_df,  left_on='Target', right_on='Source_Target', how='left')
 .rename(columns={'count': 'Target count', 'min': 'T .min', 'max': 'T .max', 'mean': 'T .mean'}))

enter image description here


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...