How can I normalize a 2-dimensional NumPy array in Python less verbosely?
Question:
Given a 3×3 NumPy array
a = numpy.arange(0,27,3).reshape(3,3)
# array([[ 0, 3, 6],
# [ 9, 12, 15],
# [18, 21, 24]])
To normalize the rows of the 2-dimensional array I thought of
row_sums = a.sum(axis=1) # array([ 9, 36, 63])
new_matrix = numpy.zeros((3,3))
for i, (row, row_sum) in enumerate(zip(a, row_sums)):
    new_matrix[i,:] = row / row_sum
There must be a better way, mustn't there?
To clarify: by normalizing I mean that the entries in each row must sum to one. But I think that will be clear to most people.
Answers:
Broadcasting is really good for this:
row_sums = a.sum(axis=1)
new_matrix = a / row_sums[:, numpy.newaxis]
row_sums[:, numpy.newaxis] reshapes row_sums from shape (3,) to shape (3, 1). When you do a / b, a and b are broadcast against each other.
You can learn more about broadcasting here or even better here.
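To make the shape change concrete, here is a minimal sketch using the array from the question. The name col_vector is only illustrative and does not appear in the answer above.

```python
import numpy as np

a = np.arange(0, 27, 3).reshape(3, 3)
row_sums = a.sum(axis=1)                # shape (3,)
col_vector = row_sums[:, np.newaxis]    # shape (3, 1)

# (3, 3) / (3, 1): the single column of col_vector is stretched across
# all three columns of a, so every row is divided by its own sum.
new_matrix = a / col_vector

print(row_sums.shape, col_vector.shape)
print(new_matrix.sum(axis=1))
```

Printing the row sums of new_matrix is a quick way to confirm that the division followed the rows.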
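I think this should work. Note the float literal 27., which makes the array floating-point; in-place division of an integer array raises an error: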
a = numpy.arange(0,27.,3).reshape(3,3)
a /= a.sum(axis=1)[:,numpy.newaxis]
Scikit-learn offers a function normalize() that lets you apply various normalizations. Making each row sum to 1 corresponds to the L1 norm (for non-negative entries, since the L1 norm sums absolute values). Therefore:
from sklearn.preprocessing import normalize
matrix = numpy.arange(0,27,3).reshape(3,3).astype(numpy.float64)
# array([[ 0., 3., 6.],
# [ 9., 12., 15.],
# [ 18., 21., 24.]])
normed_matrix = normalize(matrix, axis=1, norm='l1')
# [[ 0. 0.33333333 0.66666667]
# [ 0.25 0.33333333 0.41666667]
# [ 0.28571429 0.33333333 0.38095238]]
Now your rows will sum to 1.
If you instead want each row to have unit magnitude (that is, each row has Euclidean length one, so the squares of its elements sum to one):
import numpy as np
a = np.arange(0,27,3).reshape(3,3)
result = a / np.linalg.norm(a, axis=-1)[:, np.newaxis]
# array([[ 0. , 0.4472136 , 0.89442719],
# [ 0.42426407, 0.56568542, 0.70710678],
# [ 0.49153915, 0.57346234, 0.65538554]])
Verifying:
np.sum( result**2, axis=-1 )
# array([ 1., 1., 1.])
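This also works, provided the row sums keep their second axis so that they broadcast down the rows:
def normalizeRows(M):
    # keepdims=True gives shape (n, 1); a plain (n,) vector would divide column j by the sum of row j
    row_sums = M.sum(axis=1, keepdims=True)
    return M / row_sums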
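Because a one-dimensional divisor broadcasts along the last axis, it may be worth checking which axis the division actually follows. The sketch below compares a (3,) divisor with a (3, 1) divisor; the names flat_sums, kept_sums, by_flat and by_kept are illustrative only.

```python
import numpy as np

M = np.arange(0, 27, 3).reshape(3, 3).astype(float)

flat_sums = M.sum(axis=1)                  # shape (3,)
kept_sums = M.sum(axis=1, keepdims=True)   # shape (3, 1)

by_flat = M / flat_sums   # divides column j by the sum of row j
by_kept = M / kept_sums   # divides row i by the sum of row i

print(by_flat.sum(axis=1))   # rows do not sum to one
print(by_kept.sum(axis=1))   # rows sum to one
```

Only the (3, 1) divisor produces rows that sum to one, which is why the shape of the row sums matters here.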
Or use a lambda function (in Python 3, map returns an iterator, so collect the result into an array):
>>> import numpy as np
>>> vec = np.arange(0,27,3).reshape(3,3)
>>> norm_vec = np.array(list(map(lambda row: row/np.linalg.norm(row), vec)))
Each row of norm_vec will have unit norm.
You could also use matrix transposition:
(a.T / row_sums).T
You can normalize the rows so that they sum to 1 like this:
new_matrix = a / a.sum(axis=1, keepdims=True)
Column normalization can be done with new_matrix = a / a.sum(axis=0, keepdims=True). Hope this helps.
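You could use the built-in NumPy function np.linalg.norm to get each row's L2 norm and divide by it (this gives rows of unit length rather than rows that sum to one):
a / np.linalg.norm(a, axis=1, keepdims=True)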
Here is one more possible way, using reshape:
a_norm = (a/a.sum(axis=1).reshape(-1,1)).round(3)
print(a_norm)
Or using None, which works too:
a_norm = (a/a.sum(axis=1)[:,None]).round(3)
print(a_norm)
Output:
array([[0. , 0.333, 0.667],
[0.25 , 0.333, 0.417],
[0.286, 0.333, 0.381]])
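We can achieve the same effect by premultiplying with the diagonal matrix whose main diagonal holds the reciprocals of the row sums. Writing the reciprocal as 1.0 / ... also works for integer arrays, where raising to the power -1 is not allowed:
A = np.diag(1.0 / A.sum(axis=1)) @ A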
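As a sanity check, the diagonal-matrix product can be compared against plain broadcasting. This is a minimal sketch on the array from the question; D, via_diag and via_broadcast are illustrative names, and the reciprocal is written as a float division so that it also works for integer input.

```python
import numpy as np

A = np.arange(0, 27, 3).reshape(3, 3)

# Float reciprocals of the row sums on the main diagonal.
D = np.diag(1.0 / A.sum(axis=1))

via_diag = D @ A                                  # row i scaled by D[i, i]
via_broadcast = A / A.sum(axis=1, keepdims=True)  # same scaling via broadcasting

print(np.allclose(via_diag, via_broadcast))
```

The two results agree, but the diagonal approach builds a full n×n matrix, so broadcasting is usually the cheaper choice for large arrays.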
Use
a = a / np.linalg.norm(a, ord=2, axis=0, keepdims=True)
Thanks to broadcasting, this scales each column to unit L2 norm; pass axis=1 to normalize the rows instead.
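The axis argument decides whether columns or rows end up with unit length. A minimal sketch of both variants on the array from the question, with unit_cols and unit_rows as illustrative names:

```python
import numpy as np

a = np.arange(0, 27, 3).reshape(3, 3)

# axis=0: one norm per column, divisor shape (1, 3)
unit_cols = a / np.linalg.norm(a, ord=2, axis=0, keepdims=True)

# axis=1: one norm per row, divisor shape (3, 1)
unit_rows = a / np.linalg.norm(a, ord=2, axis=1, keepdims=True)

print(np.linalg.norm(unit_cols, axis=0))   # column lengths
print(np.linalg.norm(unit_rows, axis=1))   # row lengths
```

Keeping keepdims=True in both cases means the divisor already has the right shape for broadcasting, so no extra reshape is needed.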