Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Why does list ask about __len__?

class Foo:
    def __getitem__(self, item):
        print('getitem', item)
        if item == 6:
            raise IndexError
        return item**2
    def __len__(self):
        print('len')
        return 3

class Bar:
    def __iter__(self):
        print('iter')
        return iter([3, 5, 42, 69])
    def __len__(self):
        print('len')
        return 3

Demo:

>>> list(Foo())
len
getitem 0
getitem 1
getitem 2
getitem 3
getitem 4
getitem 5
getitem 6
[0, 1, 4, 9, 16, 25]
>>> list(Bar())
iter
len
[3, 5, 42, 69]

Why does list call __len__? It doesn't seem to use the result for anything obvious. A for loop doesn't do it. This isn't mentioned anywhere in the iterator protocol, which just talks about __iter__ and __next__.

Is this Python reserving space for the list in advance, or something clever like that?

(CPython 3.6.0 on Linux)



1 Answer


See the Rationale section of PEP 424, which introduced __length_hint__ and explains the motivation:

Being able to pre-allocate lists based on the expected size, as estimated by __length_hint__, can be a significant optimization. CPython has been observed to run some code faster than PyPy, purely because of this optimization being present.

In addition, the documentation for object.__length_hint__ confirms that this is purely an optimization feature:

Called to implement operator.length_hint(). Should return an estimated length for the object (which may be greater or less than the actual length). The length must be an integer >= 0. This method is purely an optimization and is never required for correctness.

So __length_hint__ is here because it can result in some nice optimizations.
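To illustrate the "never required for correctness" part, here is a minimal sketch. The class name Misreporting and its claimed parameter are made up for this example; the point is only that a wrong __len__ changes how much space may be reserved, not what ends up in the list:

```python
class Misreporting:
    """Hypothetical iterable whose __len__ deliberately disagrees with what it yields."""

    def __init__(self, claimed):
        self.claimed = claimed  # the (possibly wrong) length we report

    def __iter__(self):
        return iter([3, 5, 42, 69])

    def __len__(self):
        return self.claimed


# Too small, slightly off, and far too large: the hint is only a sizing guess,
# so the resulting list is the same in every case.
results = [list(Misreporting(claimed)) for claimed in (0, 3, 1000)]
for result in results:
    assert result == [3, 5, 42, 69]
```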

PyObject_LengthHint first tries to get a value from object.__len__ (if it is defined) and only then checks whether object.__length_hint__ is available. If neither is there, it returns the default value passed in by the caller, which is 8 in the case of lists.
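The same lookup order can be observed from Python through operator.length_hint(), which is the function the documentation quoted earlier refers to. The three classes below are hypothetical and exist only for this sketch; the default of 8 is passed explicitly to mirror what is described for lists:

```python
import operator


class HasLen:
    """Defines both methods; __len__ is expected to win."""

    def __len__(self):
        return 3

    def __length_hint__(self):
        return 99


class HintOnly:
    """No __len__, so __length_hint__ is consulted."""

    def __length_hint__(self):
        return 10


class Neither:
    """Defines neither method, so the caller's default is returned."""


hint_from_len = operator.length_hint(HasLen(), 8)
hint_from_hint = operator.length_hint(HintOnly(), 8)
hint_default = operator.length_hint(Neither(), 8)

print(hint_from_len, hint_from_hint, hint_default)
```

This is only the pure-Python view of the function; the sizing of the list itself happens in C.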

listextend, which is called from list_init as Eli stated in his answer, was modified according to this PEP to offer this optimization for anything that defines either a __len__ or a __length_hint__.

list isn't the only type that benefits from this, of course; bytes objects do too:

>>> bytes(Foo())
len
getitem 0
...
b'\x00\x01\x04\t\x10\x19'

so do bytearray objects, but only when you extend them:

>>> bytearray().extend(Foo())
len
getitem 0
...

and so do tuple objects, which create an intermediary sequence to populate themselves:

>>> tuple(Foo())
len
getitem 0
...
(0, 1, 4, 9, 16, 25)

If anybody is wondering why 'iter' is printed before 'len' for class Bar, while 'len' comes first for class Foo:

This is because, if the object defines an __iter__, Python first calls it to get the iterator, which runs the print('iter') before the length hint is requested. The same doesn't happen when Python falls back to __getitem__: the iterator is then created internally without running any of the class's code, so 'len' is the first thing printed.

