python - How to sequence row based on another row?

asked Jan 31, 2022 in Technique by 深蓝 (71.8m points)

I am trying to convert a formula from excel to pandas.

The DataFrame looks like this:

Column A    Column B 
H  
H  
H  
J  
J  
J  
J  
K  
K

I want to fill Column B with a counter that increments while the value in Column A stays the same and restarts at 1 when it changes. In the example above, this would be:

Column A     Column B
H            1
H            2
H            3
J            1
J            2
J            3
J            4
K            1
K            2

In Excel, the formula would be =IF(A2<>A1,1,B1+1)

How can I apply this formula in pandas?

1 Answer

深蓝 · Answer 1 · 2022-01-31T07:26:52+0000

This can be done using the following vectorised method:

Code:

>>> import pandas as pd
>>> df = pd.DataFrame({'A':['H', 'H', 'H', 'J', 'J', 'J', 'J', 'K', 'K']})
>>> df['B'] = df.groupby((df['A'].shift(1) != df['A']).cumsum()).cumcount() + 1

Output:

>>> df
   A  B
0  H  1
1  H  2
2  H  3
3  J  1
4  J  2
5  J  3
6  J  4
7  K  1
8  K  2
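As a cross-check, the Excel formula =IF(A2<>A1,1,B1+1) can also be translated literally into a plain Python loop. This is only a minimal sketch and not part of the original answer; the helper name excel_style_counter is hypothetical. It will likely be slower than the vectorised version on large frames, but it mirrors the spreadsheet logic row by row.

```python
import pandas as pd

def excel_style_counter(values):
    """Row-by-row translation of =IF(A2<>A1,1,B1+1) (hypothetical helper)."""
    counts = []
    previous = None
    for value in values:
        if not counts or value != previous:
            counts.append(1)               # first row or value changed: restart at 1
        else:
            counts.append(counts[-1] + 1)  # same value as the row above: add 1
        previous = value
    return counts

df = pd.DataFrame({'A': ['H', 'H', 'H', 'J', 'J', 'J', 'J', 'K', 'K']})

# Loop-based result, following the spreadsheet formula.
loop_result = excel_style_counter(df['A'])

# Vectorised result from the answer, converted to a list for comparison.
run_id = (df['A'] != df['A'].shift(1)).cumsum()
vectorised = (df.groupby(run_id).cumcount() + 1).tolist()

print(loop_result == vectorised)
```

If both approaches agree on your own data, the vectorised expression is usually the better choice to keep.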

Explanation:

First, we use df['A'] != df['A'].shift(1) to compare each value in column A with the value in the row above it. The result is True wherever a new run of identical values starts:

>>> df['A'] != df['A'].shift(1)
0     True
1    False
2    False
3     True
4    False
5    False
6    False
7     True
8    False
Name: A, dtype: bool

Next, we use cumsum() to take the cumulative sum of that boolean Series. Since True counts as 1 and False as 0, every row in the same run receives the same integer label:

>>> (df['A'] != df['A'].shift(1)).cumsum()
0    1
1    1
2    1
3    2
4    2
5    2
6    2
7    3
8    3
Name: A, dtype: int32

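Now, we can group by these run labels and use GroupBy.cumcount() to number the rows within each run, adding 1 so that the count starts at 1 instead of 0. Note that we can't simply use

df.groupby('A').cumcount()

because that groups all rows with the same value together, even when they are not adjacent. For example, if we had: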

>>> df
   A
0  H
1  H
2  H
3  J
4  J
5  J
6  J
7  K
8  K
9  H

This would give us:

>>> df.groupby('A').cumcount() + 1
0    1
1    2
2    3
3    1
4    2
5    3
6    4
7    1
8    2
9    4
dtype: int64

Note that the final row is 4 and not the expected 1, because groupby('A') puts the last H in the same group as the first three.
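For comparison, here is a short sketch of the run-based grouping applied to the same ten-row frame. The variable name run_id is illustrative only. Because the trailing H starts a new run, its counter restarts at 1.

```python
import pandas as pd

df = pd.DataFrame({'A': ['H', 'H', 'H', 'J', 'J', 'J', 'J', 'K', 'K', 'H']})

# A new run starts wherever the value differs from the row above.
# The cumulative sum turns those start markers into one label per run.
run_id = (df['A'] != df['A'].shift(1)).cumsum()

# Number the rows within each run, starting from 1.
df['B'] = df.groupby(run_id).cumcount() + 1

print(df['B'].tolist())  # [1, 2, 3, 1, 2, 3, 4, 1, 2, 1]
```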
