Try using shift(-1) to compare each row with the one after it, then change the tail(1) of each group to np.nan, since the last row has no following row to compare against. Group by your keyInd, and then do the analysis on each grouping. This should avoid row-wise looping.
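import numpy as np

def transition(x):
    x = x.copy()  # work on a copy so the group handed in by apply is not mutated
    # 1 where the value differs from the next row, 0 where it stays the same
    t = np.where(x['V2010'] == x['V2010'].shift(-1), 0, 1)
    # running count of changes within the group, as float so it can hold NaN
    x['transition'] = np.cumsum(t).astype('float')
    # the last row has no next row, so its count is undefined
    x.iloc[-1, x.columns.get_loc('transition')] = np.nan
    return x

# group_keys=False keeps the original row index in the result
dft = df.groupby('keyInd', group_keys=False).apply(transition)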
Output:
In [105]: dft
Out[105]:
           keyInd  V1016  V2010  transition
0  110000016107-1      1      4       0.000
1  110000016107-1      2      4       0.000
2  110000016107-1      3      4       0.000
3  110000016107-1      4      4       1.000
4  110000016107-1      5      2         NaN
5  110000016107-2      1      1       1.000
6  110000016107-2      2      4       2.000
7  110000016107-2      3      3       2.000
8  110000016107-2      4      3       3.000
9  110000016107-2      5      2         NaN
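
If the apply call turns out to be slow with many groups, the same idea can probably be written without a per-group Python function by using the grouped shift and cumsum directly. This is only a sketch: add_transition and its key/col parameters are names made up for illustration, and the sample frame is rebuilt by hand from the output above.

```python
import numpy as np
import pandas as pd

def add_transition(df, key="keyInd", col="V2010"):
    """Hypothetical helper: per-group transition counter without groupby.apply."""
    out = df.copy()
    # Value of the next row within the same group (NaN on each group's last row).
    nxt = out.groupby(key)[col].shift(-1)
    # 1 where the value differs from the next row, 0 where it is unchanged.
    changed = out[col].ne(nxt).astype(int)
    # Running count of changes, restarted for every group.
    counts = changed.groupby(out[key]).cumsum().astype(float)
    # Keep the count everywhere except the last row of each group, which becomes NaN.
    out["transition"] = counts.where(out[key].duplicated(keep="last"))
    return out

# Sample data reconstructed from the output shown above.
df = pd.DataFrame({
    "keyInd": ["110000016107-1"] * 5 + ["110000016107-2"] * 5,
    "V1016": [1, 2, 3, 4, 5] * 2,
    "V2010": [4, 4, 4, 4, 2, 1, 4, 3, 3, 2],
})
dft = add_transition(df)
print(dft)
```

On this sample it should give the same transition column as the apply version. It assumes the rows are already in the order you want within each keyInd.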