You can use pandas concatenation with index matching: pd.concat with axis=1 aligns the frames on their index, and it is fast even on millions of ids.
import numpy as np
import pandas as pd

prod = "aa|b|c|d|e".split('|')
preprod = "c|d|e".split('|')
test = "b|d|e|f".split('|')
dev = "aa|e|g".split('|')

# One single-column frame per environment, indexed by its unique ids.
# axis=1 lines the frames up on that index; ids missing from a frame become NaN.
df = pd.concat([
    pd.DataFrame({'prod': 1}, index=np.unique(prod)),
    pd.DataFrame({'preprod': 1}, index=np.unique(preprod)),
    pd.DataFrame({'test': 1}, index=np.unique(test)),
    pd.DataFrame({'dev': 1}, index=np.unique(dev))
], axis=1, sort=False).fillna(0).reset_index().rename(columns={'index': 'id'})
print(df)
>>>
   id  prod  preprod  test  dev
0  aa   1.0      0.0   0.0  1.0
1   b   1.0      0.0   1.0  0.0
2   c   1.0      1.0   0.0  0.0
3   d   1.0      1.0   1.0  0.0
4   e   1.0      1.0   1.0  1.0
5   f   0.0      0.0   1.0  0.0
6   g   0.0      0.0   0.0  1.0
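The 1.0/0.0 values above are floats because the alignment step fills the gaps with NaN before fillna runs. If you would rather have integer flags, a minimal sketch (my own variation, using only two of the lists and the made-up names frames, aligned and flags) is to cast after filling:

```python
import numpy as np
import pandas as pd

prod = "aa|b|c|d|e".split('|')
test = "b|d|e|f".split('|')

# One constant column per environment, indexed by that environment's unique ids.
frames = [
    pd.DataFrame({'prod': 1}, index=np.unique(prod)),
    pd.DataFrame({'test': 1}, index=np.unique(test)),
]

# axis=1 aligns rows on the index (outer join); an id absent from a frame gets NaN there.
aligned = pd.concat(frames, axis=1, sort=False)

# NaN forces float columns, so fill the gaps and cast back to integers.
flags = aligned.fillna(0).astype(int)
```

The cast is an extra pass over the data, so on the large inputs it will add a little time; I have not measured how much.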
And for speed, with up to 10 million random ids per list:
prod = np.random.randint(10000000, size=10000000).astype(str)
preprod = np.random.randint(10000000, size=1000000).astype(str)
test = np.random.randint(10000000, size=1000000).astype(str)
dev = np.random.randint(10000000, size=100000).astype(str)
%%time
df = pd.concat([
    pd.DataFrame({'prod': 1}, index=np.unique(prod)),
    pd.DataFrame({'preprod': 1}, index=np.unique(preprod)),
    pd.DataFrame({'test': 1}, index=np.unique(test)),
    pd.DataFrame({'dev': 1}, index=np.unique(dev))
], axis=1, sort=False).fillna(0).reset_index().rename(columns={'index': 'id'})
>>> Wall time: 32.3 s
on my humble laptop.