df
Source Target volume
0 A B 3
1 B C 1
2 C A 3
3 B A 1
As all the calculations has to be done independent of Source
, Target
. So, for calculations, it is better to take it as a same column. So, concatenating vertically Source-Volume
table and Target-Volume
table
temp_df = pd.concat([df[['Source', 'volume']].rename(columns={'Source': 'Source_Target'}),
df[['Target', 'volume']].rename(columns={'Target': 'Source_Target'})], ignore_index=True)
temp_df
Source_Target volume
0 A 3
1 B 1
2 C 3
3 B 1
4 B 3
5 C 1
6 A 3
7 A 1
Now, you can get all the calulations using groupby
and agg
temp_df = temp_df.groupby('Source_Target')['volume'].agg(
count = 'count',
min = 'min',
mean = 'mean',
max = 'max'
)
temp_df
Finally, merge df
with the temp_df
based on Source
and then based on Target
.
(df
.drop('volume', axis=1)
.merge(temp_df, left_on='Source', right_on='Source_Target', how='left') # 'left' to retain order
.reindex(['Source', 'count', 'min', 'mean', 'max', 'Target'], axis=1)
.rename(columns={'count': 'Source count', 'min': 'S .min', 'max': 'S .max', 'mean': 'S .mean'})
.merge(temp_df, left_on='Target', right_on='Source_Target', how='left')
.rename(columns={'count': 'Target count', 'min': 'T .min', 'max': 'T .max', 'mean': 'T .mean'}))
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…