Numpy Zero Padding to match a certain shape
Question:
I have a file with arrays of different shapes. I want to zero-pad all the arrays to match the largest shape. The largest shape is (93,13).
To test this I have the following code:
testarray = np.ones((41,13))
How can I zero-pad this array to match the shape of (93,13)? And ultimately, how can I do it for thousands of arrays?
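Edit: The solution was found in the comments:
for index, array in enumerate(mfcc):
    testarray = np.zeros((93, 13))
    # separate name for the row index, so the outer index is not overwritten
    for row_index, row in enumerate(array):
        # range(len(row)) rather than len(row) - 1, so the last column is copied too
        for i in range(len(row)):
            testarray[row_index][i] = row[i]
    mfcc[index] = testarray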
Answers:
If you want to pad to the right and to the bottom of your original array in 2D, here’s what you want:
import numpy as np
a = np.ones((41,11))
desired_rows = 91
desired_cols = 13
b = np.pad(a, ((0, desired_rows-a.shape[0]), (0, desired_cols-a.shape[1])), 'constant', constant_values=0)
print(b)
"""
prints
[[1. 1. 1. ... 1. 0. 0.]
[1. 1. 1. ... 1. 0. 0.]
[1. 1. 1. ... 1. 0. 0.]
...
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]]
"""
Of course, this is not an error-proof solution: if the desired number of rows or columns is smaller than the corresponding size of the original array, you’ll get ValueError: index can't contain negative values.
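One way to guard against that case is to check the shapes before calling np.pad. The following is only a minimal sketch: the helper name pad_bottom_right and its error message are made up for illustration and are not part of NumPy.

```python
import numpy as np

def pad_bottom_right(a, desired_rows, desired_cols):
    """Zero-pad a 2D array at the bottom and right (hypothetical helper)."""
    rows, cols = a.shape
    # Fail early with a descriptive message instead of np.pad's generic one.
    if desired_rows < rows or desired_cols < cols:
        raise ValueError(
            f"target shape {(desired_rows, desired_cols)} is smaller "
            f"than input shape {a.shape}"
        )
    return np.pad(
        a,
        ((0, desired_rows - rows), (0, desired_cols - cols)),
        mode="constant",
        constant_values=0,
    )

b = pad_bottom_right(np.ones((41, 13)), 93, 13)
print(b.shape)  # (93, 13)
```

Whether to raise, crop, or silently return the input when the target is too small depends on the data; raising is just the most conservative choice here.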
You could do it like this. Here array is a stand-in for your original data, used only as a test case; substitute your own.
import numpy as np
# numeric placeholder data (the original test case used None)
array = [[1] * 10] * 10
testarray = np.zeros((93, 13))
for index, row in enumerate(array):
    # range(len(row)) rather than len(row) - 1, so the last column is copied too
    for i in range(len(row)):
        testarray[index][i] = row[i]
Here’s an approach using np.pad that can generalize to an arbitrary target shape:
import numpy as np

def to_shape(a, shape):
    y_, x_ = shape
    y, x = a.shape
    y_pad = y_ - y
    x_pad = x_ - x
    # split the padding between both sides; any odd remainder goes to the bottom/right
    return np.pad(a, ((y_pad//2, y_pad//2 + y_pad%2),
                      (x_pad//2, x_pad//2 + x_pad%2)),
                  mode='constant')
For the proposed example:
a = np.ones((41,13))
shape = [93, 13]
to_shape(a, shape).shape
# (93, 13)
Let’s check with another example:
shape = [100, 121]
to_shape(a, shape).shape
# (100, 121)
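Note that this approach splits the padding between both sides of each axis, so the original data ends up in the middle of the result rather than in the top-left corner. A small check of where the rows land for the (41, 13) example, with variable names chosen only for illustration:

```python
import numpy as np

a = np.ones((41, 13))
pad = 93 - 41                               # 52 rows of padding in total
top, bottom = pad // 2, pad // 2 + pad % 2  # 26 rows above, 26 rows below
b = np.pad(a, ((top, bottom), (0, 0)), mode="constant")

print(b.shape)                # (93, 13)
print(b[:top].any())          # False: the first 26 rows are all zeros
print(b[top:top + 41].all())  # True: the original rows sit in the middle
```

If the data should stay at the top-left instead, put the whole padding amount on the second entry of each pair, i.e. (0, pad).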
Timings
def florian(array, shape):
    # the loop-based answer, kept exactly as timed;
    # note that range(0, len(row)-1) leaves the last column unfilled
    testarray = np.zeros(shape)
    for index, row in enumerate(array):
        for i in range(0, len(row)-1):
            testarray[index][i] = row[i]

def to_shape(a, shape):
    y_, x_ = shape
    y, x = a.shape
    y_pad = y_ - y
    x_pad = x_ - x
    return np.pad(a, ((y_pad//2, y_pad//2 + y_pad%2),
                      (x_pad//2, x_pad//2 + x_pad%2)),
                  mode='constant')

a = np.ones((500, 500))
shape = [1000, 1103]
%timeit florian(a, shape)
# 101 ms ± 5.12 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit to_shape(a, shape)
# 19.8 ms ± 318 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
To make the answer from @yatu a bit more generic, I wrote the following code:
def to_shape(x, target_shape):
    padding_list = []
    for x_dim, target_dim in zip(x.shape, target_shape):
        pad_value = target_dim - x_dim
        pad_tuple = (pad_value//2, pad_value//2 + pad_value%2)
        padding_list.append(pad_tuple)
    return np.pad(x, tuple(padding_list), mode='constant')
I did not run any timings, but I did a sanity check that it does what it needs to do. Note that this assumes the target shape is at least as large as the input array x in EVERY dimension.
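To make that assumption explicit, the same idea can be written with a check per dimension. This is only a sketch: the name to_shape_nd and the error messages are invented for illustration.

```python
import numpy as np

def to_shape_nd(x, target_shape):
    """Centre x inside a zero array of target_shape (hypothetical helper)."""
    if len(target_shape) != x.ndim:
        raise ValueError("target_shape must have one entry per dimension of x")
    padding = []
    for x_dim, target_dim in zip(x.shape, target_shape):
        pad_value = target_dim - x_dim
        if pad_value < 0:
            raise ValueError(f"target dimension {target_dim} is smaller than {x_dim}")
        # Split the padding evenly; an odd remainder goes to the trailing side.
        padding.append((pad_value // 2, pad_value // 2 + pad_value % 2))
    return np.pad(x, padding, mode="constant")

x = np.ones((2, 3, 4))
y = to_shape_nd(x, (5, 3, 7))
print(y.shape)  # (5, 3, 7)
print(y.sum())  # 24.0, the original 2*3*4 ones are preserved
```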
A simpler solution is to create an array of zeros and then fill it in using slice notation:
def to_shape(a, shape):
    z = np.zeros(shape)
    # the original data goes in the top-left corner; the rest stays zero
    z[:a.shape[0], :a.shape[1]] = a
    return z
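The same slicing idea extends to the question's case of many arrays: allocate one zero-filled batch sized to the largest shape and copy each array into its corner. A minimal sketch, assuming the data is available as a Python list of 2D arrays; the list arrays below is stand-in data, not the asker's actual mfcc values.

```python
import numpy as np

# Stand-in data: 2D arrays whose shapes differ.
arrays = [np.ones((41, 13)), np.ones((93, 13)), np.ones((7, 13))]

# Find the largest extent along each axis.
max_rows = max(a.shape[0] for a in arrays)
max_cols = max(a.shape[1] for a in arrays)

# One zero-filled block for the whole batch (float64 by default).
batch = np.zeros((len(arrays), max_rows, max_cols))
for i, a in enumerate(arrays):
    # Copy each array into the top-left corner of its slot.
    batch[i, :a.shape[0], :a.shape[1]] = a

print(batch.shape)  # (3, 93, 13)
```

This still loops in Python, but only once per array rather than once per element, which is usually the part that matters for thousands of arrays.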