Preallocating arrays when using the `array` library in Python

Question:

In the array library in Python, what’s the most efficient way to preallocate with zeros (for example for an array size that barely fits into memory)?

Creating a huge list first would partially defeat the purpose of choosing the array library over lists for efficiency.

Is it possible to preallocate even faster, without spending time on overwriting whatever there was in memory?

(Preallocating is important, because then the code is faster than growing an array iteratively, as discussed here.)

Asked By: root


Answers:

Use sequence multiplication, just like you would to create a giant list, but with an array:

import array

arr = array.array(typecode, [0]) * size_needed

Note the position of the parentheses. We directly multiply a 1-element array by an element count, rather than multiplying a 1-element list by an element count and then converting to an array.
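As a concrete illustration of the multiplication approach, here is a minimal sketch. The typecode "d" and the element count are assumptions chosen for the example, not values from the question.

```python
from array import array

typecode = "d"           # assumed for illustration: C double
size_needed = 1_000_000  # assumed element count

# Multiply a 1-element array by the count. No intermediate
# million-element Python list is created along the way.
arr = array(typecode, [0]) * size_needed

# Approximate size of the data buffer in bytes.
buffer_bytes = arr.itemsize * len(arr)
```

Writing array(typecode, [0] * size_needed) instead would produce the same array, but only after building the full list first, which is the cost the question wants to avoid.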

Answered By: user2357112

Using the * operator is probably best. A much less efficient method:

from array import array
from itertools import repeat

arr = array(typecode, [])
arr.extend(repeat(0, size_needed))

You can shorten this with the iterable version of the constructor:

arr = array(typecode, repeat(0, size_needed))

Or, to avoid the extra import:

arr = array(typecode, (0 for _ in range(size_needed)))

The reason this is much less efficient is that array.extend (which is what the iterable constructor delegates to) does not use the length or __length_hint__ of the iterable that is passed in. Consequently, the array grows through multiple reallocations until it reaches the target size. This may end up being almost as inefficient as allocating the list. repeat does define a proper __length_hint__ while generator expressions do not, but that distinction is moot for the current CPython implementation, since the hint goes unused either way.
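A small sketch, with an assumed typecode "l" and an assumed small size, can confirm that all of these approaches produce the same array and show the length-hint difference between repeat and a generator expression. It uses operator.length_hint from the standard library, which returns 0 by default when an object offers no hint.

```python
from array import array
from itertools import repeat
from operator import length_hint

typecode = "l"      # assumed for illustration: signed long
size_needed = 1000  # assumed small size so the sketch runs quickly

by_multiply = array(typecode, [0]) * size_needed
by_repeat = array(typecode, repeat(0, size_needed))
by_genexpr = array(typecode, (0 for _ in range(size_needed)))

# All three construct equal arrays; only the construction cost differs.
same_result = by_multiply == by_repeat == by_genexpr

# repeat reports how many items remain; a generator expression cannot.
repeat_hint = length_hint(repeat(0, size_needed))
genexpr_hint = length_hint(0 for _ in range(size_needed))
```

To compare actual speed on a given machine, each construction can be wrapped in timeit; the results will depend on the interpreter version and the size chosen.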

Answered By: Mad Physicist