I'm putting around 4 million different keys into a Python dictionary.
Creating this dictionary takes about 15 minutes and consumes about 4GB of memory on my machine. After the dictionary is fully created, querying the dictionary is fast.
I suspect that dictionary creation is so resource consuming because the dictionary is very often rehashed (as it grows enormously).
Is is possible to create a dictionary in Python with some initial size or bucket number?
My dictionary points from a number to an object.
class MyObject:
def __init__(self):
# some fields...
d = {}
d[i] = MyObject() # 4M times on different key...
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…