python - create a variable iterating over a column in a large dataset in pandas

asked Oct 6, 2021 in Technique by 深蓝 (71.8m points)

I need to add a column named transition to a DataFrame. Within each keyInd, it should increase by 1 every time the value of V2010 changes.

Here is a sample of the dataframe:

keyInd	V1016	V2010
110000016107-1	1	4
110000016107-1	2	4
110000016107-1	3	4
110000016107-1	4	4
110000016107-1	5	2
110000016107-2	1	1
110000016107-2	2	4
110000016107-2	3	3
110000016107-2	4	3
110000016107-2	5	2
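
For anyone who wants to experiment with this, the sample above can be rebuilt as a DataFrame roughly as follows. This is only a sketch: the variable name df is an assumption (chosen to match the name used in the answer), and the real dataset is of course much larger.

```python
import pandas as pd

# Sample rows copied from the table above; `df` is an assumed name.
df = pd.DataFrame({
    "keyInd": ["110000016107-1"] * 5 + ["110000016107-2"] * 5,
    "V1016": [1, 2, 3, 4, 5] * 2,
    "V2010": [4, 4, 4, 4, 2, 1, 4, 3, 3, 2],
})

print(df)
```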

question from:https://stackoverflow.com/questions/66068143/create-a-variable-iterating-over-a-column-in-a-large-dataset-in-pandas

1 Answer

深蓝 · Answer 1 · 2021-10-06T03:01:25+0000

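Try using shift(-1) to compare each row's V2010 with the next row's, take the cumulative sum of the changes, and then set the last row of each group (the tail(1) row, which has no following row to compare with) to np.nan. Group by keyInd and run this on each group. This avoids looping over rows.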

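import numpy as np

def transition(x):
    x = x.copy()  # work on a copy of the group rather than a view
    # 1 where the next row's V2010 differs from this row's, else 0
    t = np.where(x['V2010'] == x['V2010'].shift(-1), 0, 1)
    x['transition'] = np.cumsum(t).astype('float')
    # the last row of each group has no next row, so mark it as missing
    x.iloc[-1, x.columns.get_loc('transition')] = np.nan
    return x

# group_keys=False keeps the original row index instead of prepending keyInd
dft = df.groupby('keyInd', group_keys=False).apply(transition)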

Output:

In [105]: dft
Out[105]:
           keyInd  V1016  V2010  transition
0  110000016107-1      1      4       0.000
1  110000016107-1      2      4       0.000
2  110000016107-1      3      4       0.000
3  110000016107-1      4      4       1.000
4  110000016107-1      5      2         NaN
5  110000016107-2      1      1       1.000
6  110000016107-2      2      4       2.000
7  110000016107-2      3      3       2.000
8  110000016107-2      4      3       3.000
9  110000016107-2      5      2         NaN
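
On a large dataset, calling a Python function once per group can be slow. A possible alternative, sketched here under the assumption that V2010 has no missing values, is to do the same comparison with grouped shift and cumsum and no apply at all. The names nxt and changed are illustrative only, and the sample frame is rebuilt so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

# Sample data from the question, rebuilt so this snippet is self-contained.
df = pd.DataFrame({
    "keyInd": ["110000016107-1"] * 5 + ["110000016107-2"] * 5,
    "V1016": [1, 2, 3, 4, 5] * 2,
    "V2010": [4, 4, 4, 4, 2, 1, 4, 3, 3, 2],
})

# Next row's V2010 within the same keyInd; NaN on the last row of each group.
nxt = df.groupby("keyInd")["V2010"].shift(-1)

# 1 where the next value differs from the current one. NaN never compares
# equal, so the last row of each group is also flagged here.
changed = (df["V2010"] != nxt).astype(int)

# Running count of changes, restarted for every keyInd.
df["transition"] = changed.groupby(df["keyInd"]).cumsum().astype(float)

# The last row of each group has nothing to compare with, so blank it out.
# (Assumption: V2010 itself contains no NaN, otherwise this mask is too wide.)
df.loc[nxt.isna(), "transition"] = np.nan

print(df)
```

On the sample this gives the same transition column as the output above. Whether it is actually faster depends on the number and size of the groups, so it is worth timing on the real data.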
