Zero pad numpy array

Question:

What’s the most Pythonic way to pad an array with zeros at the end?

def pad(A, length):
    ...

A = np.array([1,2,3,4,5])
pad(A, 8)    # expected : [1,2,3,4,5,0,0,0]

In my real use case, I actually want to pad an array up to the next multiple of 1024, e.g. 1342 => 2048, 3000 => 3072.

Asked By: Basj

Answers:

This should work:

def pad(A, length):
    A = np.asarray(A)
    # np.zeros defaults to float64, so keep A's dtype explicitly
    arr = np.zeros(length, dtype=A.dtype)
    arr[:len(A)] = A
    return arr

You might be able to get slightly better performance if you initialize an empty array (np.empty(length)) and then fill in A and the zeros separately, but I doubt that the speedups would be worth additional code complexity in most cases.
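
A minimal sketch of that np.empty variant, for comparison. The name pad_empty is made up for illustration, and whether it is measurably faster would need to be timed on real data:

```python
import numpy as np

def pad_empty(A, length):
    # Hypothetical variant of pad(): allocate without initialising,
    # then write the data and the trailing zeros separately.
    A = np.asarray(A)
    arr = np.empty(length, dtype=A.dtype)
    arr[:len(A)] = A   # copy the original values
    arr[len(A):] = 0   # zero only the tail
    return arr

print(pad_empty(np.array([1, 2, 3, 4, 5]), 8))  # [1 2 3 4 5 0 0 0]
```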

To get the value to pad up to, I think you’d probably just use something like divmod:

n, remainder = divmod(len(A), 1024)
n += bool(remainder)

Basically, this just figures out how many times 1024 divides the length of your array (and what the remainder of that division is). If there is no remainder, then you just want n * 1024 elements. If there is a remainder, then you want (n + 1) * 1024.
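
As a quick check of that arithmetic, here is the same divmod logic applied to the sizes from the question. The helper name padded_length and its block parameter are illustrative only:

```python
def padded_length(size, block=1024):
    # How many full blocks fit, and what is left over
    n, remainder = divmod(size, block)
    # A non-zero remainder needs one extra block
    n += bool(remainder)
    return n * block

print(padded_length(1342))  # 2048
print(padded_length(3000))  # 3072
print(padded_length(2048))  # 2048 (already a multiple, nothing added)
```

The same result can be written as a ceiling division, -(-size // block) * block, if you prefer a one-liner.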

All together:

def pad1024(A):
    n, remainder = divmod(len(A), 1024)
    n += bool(remainder)
    arr = np.zeros(n * 1024, dtype=A.dtype)  # keep A's dtype instead of float64
    arr[:len(A)] = A
    return arr
Answered By: mgilson

You could also use numpy.pad:

>>> A = np.array([1,2,3,4,5])
>>> npad = 8 - len(A)
>>> np.pad(A, pad_width=npad, mode='constant', constant_values=0)[npad:]
array([1, 2, 3, 4, 5, 0, 0, 0])

And in a function:

def pad(A, npads):
    _npads = npads - len(A)
    return np.pad(A, pad_width=_npads, mode='constant', constant_values=0)[_npads:]
Answered By: Moses Koledoye

There’s np.pad:

A = np.array([1, 2, 3, 4, 5])
A = np.pad(A, (0, length), mode='constant')

Here length is the number of zeros to append. Regarding your use case, the required number of zeros can be calculated as length = -len(A) % 1024, which is 0 when len(A) is already a multiple of 1024.
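
A small sketch of that calculation, assuming the goal is the next multiple of 1024 and writing the number of zeros to append as -len(A) % 1024 (the array contents here are made up):

```python
import numpy as np

A = np.arange(1342)
length = -len(A) % 1024   # number of zeros to append
B = np.pad(A, (0, length), mode='constant')

print(length, len(B))     # 706 2048
```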

Answered By: lballes

numpy.pad with constant mode does what you need. The second argument is a tuple giving how many zeros to pad on each side; (2, 3), for instance, pads 2 zeros on the left and 3 zeros on the right:

Given A as:

A = np.array([1,2,3,4,5])

np.pad(A, (2, 3), 'constant')
# array([0, 0, 1, 2, 3, 4, 5, 0, 0, 0])

It’s also possible to pad 2D NumPy arrays by passing a tuple of tuples as the padding width, in the format ((top, bottom), (left, right)):

A = np.array([[1,2],[3,4]])

np.pad(A, ((1,2),(2,1)), 'constant')

#array([[0, 0, 0, 0, 0],           # 1 zero padded to the top
#       [0, 0, 1, 2, 0],           # 2 zeros padded to the bottom
#       [0, 0, 3, 4, 0],           # 2 zeros padded to the left
#       [0, 0, 0, 0, 0],           # 1 zero padded to the right
#       [0, 0, 0, 0, 0]])

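For your case, set the left pad to zero and compute the right pad with a modulo operation (using the 1D A from the first example):

A = np.array([1,2,3,4,5])
# -len(A) % 1024 is 0 when len(A) is already a multiple of 1024
B = np.pad(A, (0, -len(A) % 1024), 'constant')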
B
# array([1, 2, 3, ..., 0, 0, 0])
len(B)
# 1024

For a larger A:

A = np.ones(3000)
B = np.pad(A, (0, -len(A) % 1024), 'constant')
B
# array([ 1.,  1.,  1., ...,  0.,  0.,  0.])

len(B)
# 3072
Answered By: Psidom

For future reference:

def padarray(A, size):
    t = size - len(A)
    return np.pad(A, pad_width=(0, t), mode='constant')

padarray([1,2,3], 8)     # [1 2 3 0 0 0 0 0]
Answered By: Basj

For your use case you can use the resize() method:

A = np.array([1,2,3,4,5])
A.resize(8)

This resizes A in place. If there are other references to A, NumPy raises a ValueError, because the referenced value would be updated too. To allow this, pass the refcheck=False option.

The documentation states that missing values will be 0:

Enlarging an array: as above, but missing entries are filled with zeros
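
A short sketch of the in-place resize with a second reference present. The name B is only there to create that extra reference, and refcheck=False is passed so the call does not depend on the reference count:

```python
import numpy as np

A = np.array([1, 2, 3, 4, 5])
B = A                        # a second name bound to the same array object

# Skip the reference check so the in-place resize is allowed
A.resize(8, refcheck=False)

print(A)        # [1 2 3 4 5 0 0 0]
print(B is A)   # True
```

Note that this is the ndarray.resize method. The np.resize function behaves differently: it returns a new array and fills the extra space with repeated copies of the input rather than zeros.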

Answered By: spinkus