Python: Do Python Lists keep a count for len() or does it count for each call?

Question:

If I keep calling len() on a very long list, am I wasting time, or does it keep an int count in the background?

Asked By: PKKid

||

Answers:

It has to store the length somewhere, so you aren’t counting the number of items every time.

Answered By: Georg Schölly

Don’t worry: Of course it saves the count and thus len() on lists is a pretty cheap operation. Same is true for strings, dictionaries and sets, by the way!

Answered By: Ferdinand Beyer

Write your program so that it’s optimised for clarity and easily maintainable. Is your program clearer with a call to len(foo)? Then do that.

Are you worried about the time taken? Use the timeit module in the standard library to measure the time taken, and see whether it is significant in your code.

You will, like most people, very likely be wrong in your guesses about which parts of your program are slowest. Avoid the temptation to guess, and instead measure it to find out.

Remember that premature optimisation is the root of all evil, in the words of Donald Knuth. Only focus on the speed of code that you have measured the speed of, to know whether the benefit would be worth the cost of changing how it works.

Answered By: bignose

A Python “list” is really a resizeable array, not a linked list, so it stores the size somewhere.

Answered By: Ken

The question has been answered (len is O(1)), but here’s how you can check for yourself:

$ python -m timeit -s "l = range(10)" "len(l)"
10000000 loops, best of 3: 0.119 usec per loop
$ python -m timeit -s "l = range(1000000)" "len(l)"
10000000 loops, best of 3: 0.131 usec per loop

Yep, not really slower.

Answered By: dF.

And one more way to find out how it’s done is to look it up on Google Code Search look at the source on GitHub, if you don’t want to download the source yourself.

static Py_ssize_t list_length(PyListObject *a)
{
    return a->ob_size;
}
Answered By: AKX

I’m quite late to the party, but I just did a test and got some interesting results:

import timeit, statistics

test_arr = list(range(1000000))

# Just call len() every time
def test1(arr):
    for i in range(len(arr)):
        j = arr[len(arr)-1]

# Store the result of len()
def test2(arr):
    l = len(arr)
    for i in range(len(arr)):
        j = arr[l-1]

print("Running test 1...")
t1 = timeit.repeat(lambda: test1(test_arr), number=int(2e2), repeat=4)
print("Test 1 results:")
print(t1)

print("Running test 2...")
t2 = timeit.repeat(lambda: test2(test_arr), number=int(2e2), repeat=4)
print("Test 2 results:")
print(t2)

m1 = statistics.mean(t1)
m2 = statistics.mean(t2)
avg = round(100 * abs(m2-m1)/m1, 2)
sign = "quicker" if m1-m2 > 0 else "slower"
print(f"On average, test 2 was {avg} {sign} than test 1")

Results:

Running test 1...
Test 1 results:
[15.189714099979028, 15.160469999769703, 15.17295220005326, 15.072236399864778]
Running test 2...
Test 2 results:
[8.17264029989019, 8.191439799964428, 8.171442999970168, 8.208112000022084]
On average, test 2 was 45.96 quicker than test 1

It’s a very crude test and I’m sure somebody can improve upon it (or point out a mistake), but it appears that it is indeed significantly quicker to call len only once and save it to a variable, rather than calling it repeatedly (which I believe is what the poster was asking).

Answered By: Trevor Gross
Categories: questions Tags:
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.