Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
467 views
in Technique[技术] by (71.8m points)

python - Pandas extract numbers from column into new columns

I currently have this df where the rect column is all strings. I need to extract the x, y, w and h from it into separate columns. The dataset is very large so I need an efficient approach

df['rect'].head()
0    <Rect (120,168),260 by 120>
1    <Rect (120,168),260 by 120>
2    <Rect (120,168),260 by 120>
3    <Rect (120,168),260 by 120>
4    <Rect (120,168),260 by 120>

So far this solution works however it's very messy as you can see

df[['x', 'y', 'w', 'h']] = df['rect'].str.replace('<Rect (', '').str.replace('),', ',').str.replace(' by ', ',').str.replace('>', '').str.split(',', n=3, expand=True)

Is there a better way? Possibly a regex approach

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Using extractall

df[['x', 'y', 'w', 'h']] = df['rect'].str.extractall('(d+)').unstack().loc[:,0]
Out[267]: 
match    0    1    2    3
0      120  168  260  120
1      120  168  260  120
2      120  168  260  120
3      120  168  260  120
4      120  168  260  120

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...