Mean value of each element in multiple lists – Python
Question:
If I have two lists
a = [2,5,1,9]
b = [4,9,5,10]
How can I find the mean value of each element, so that the resultant list would be:
[3,7,3,9.5]
Answers:
>>> a = [2,5,1,9]
>>> b = [4,9,5,10]
>>> [(g + h) / 2 for g, h in zip(a, b)]
[3.0, 7.0, 3.0, 9.5]
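The same comprehension can be extended to any number of lists. The following is only a sketch: elementwise_mean is a hypothetical helper name that does not appear in the answer above, and the length check is an added assumption, included because zip() silently stops at the shortest input.

```python
def elementwise_mean(*lists):
    """Return the mean of each position across equal-length lists.

    elementwise_mean is a hypothetical helper name used for illustration.
    """
    if not lists:
        return []
    # zip() truncates to the shortest input, so reject mismatched lengths.
    if len({len(lst) for lst in lists}) != 1:
        raise ValueError("all lists must have the same length")
    # Each `column` is a tuple holding the i-th element of every list.
    return [sum(column) / len(column) for column in zip(*lists)]


a = [2, 5, 1, 9]
b = [4, 9, 5, 10]
print(elementwise_mean(a, b))  # [3.0, 7.0, 3.0, 9.5]
```

Using sum(column) / len(column) keeps this free of imports; true division always yields floats in Python 3.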
What you want is the mean of two arrays (or vectors in math).
Since Python 3.4, there is a statistics module which provides a mean()
function:
statistics.mean(data)
Return the sample arithmetic mean of data, a sequence or iterator of real-valued numbers.
You can use it like this:
import statistics
a = [2, 5, 1, 9]
b = [4, 9, 5, 10]
result = [statistics.mean(k) for k in zip(a, b)]
# -> [3, 7, 3, 9.5]  (mean() returns an int when the inputs are ints and the mean is a whole number)
Note: this solution also works for more than two lists, because zip()
accepts any number of iterables.
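As a small illustration with three lists, the sketch below uses statistics.fmean(), which is available from Python 3.8 and always returns a float. The third list c is made-up sample data, not part of the question.

```python
import statistics

a = [2, 5, 1, 9]
b = [4, 9, 5, 10]
c = [3, 4, 6, 2]  # hypothetical third list, added for illustration

# zip(a, b, c) yields one tuple per position: (2, 4, 3), (5, 9, 4), ...
result = [statistics.fmean(column) for column in zip(a, b, c)]
print(result)  # [3.0, 6.0, 4.0, 7.0]
```

If the lists are already collected in one outer list, zip(*outer) unpacks them in the same way.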
An alternative to a list and a for loop is to use numpy arrays.
import numpy as np
# unlike lists, arrays support element-wise arithmetic
a, b = np.array([2,5,1,9]), np.array([4,9,5,10])
mean = (a + b) / 2
print(mean)
# -> [3.  7.  3.  9.5]
Going by the title of your question (multiple lists), you can achieve this simply with:
import numpy as np
multiple_lists = [[2,5,1,9], [4,9,5,10]]
arrays = [np.array(x) for x in multiple_lists]
[np.mean(k) for k in zip(*arrays)]
The script above handles any number of lists, not just two. If you want to compare the performance of the two approaches, try:
%%time
import random
import statistics
random.seed(33)
multiple_list = []
for seed in random.sample(range(100), 100):
    random.seed(seed)
    multiple_list.append(random.sample(range(100), 100))
result = [statistics.mean(k) for k in zip(*multiple_list)]
or alternatively:
%%time
import random
import numpy as np
random.seed(33)
multiple_list = []
for seed in random.sample(range(100), 100):
    random.seed(seed)
    multiple_list.append(np.array(random.sample(range(100), 100)))
result = [np.mean(k) for k in zip(*multiple_list)]
In my experience the numpy approach is much faster.
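The %%time magic only works inside IPython or Jupyter. As a rough sketch for plain Python, the comparison can be run with the standard timeit module. The function names and the repeat count of 20 are arbitrary choices made here, and the vectorised axis=0 variant is an addition to the two approaches above. No timings are shown because they depend on the machine and library versions.

```python
import random
import statistics
import timeit

import numpy as np

random.seed(33)
# 100 lists of 100 integers each, similar in size to the data above
rows = [random.sample(range(100), 100) for _ in range(100)]
stacked = np.array(rows)  # shape (100, 100)


def with_statistics():
    return [statistics.mean(column) for column in zip(*rows)]


def with_numpy_loop():
    return [np.mean(column) for column in zip(*rows)]


def with_numpy_axis():
    # one vectorised call instead of a Python-level loop
    return stacked.mean(axis=0).tolist()


for fn in (with_statistics, with_numpy_loop, with_numpy_axis):
    seconds = timeit.timeit(fn, number=20)
    print(f"{fn.__name__}: {seconds:.4f} s for 20 runs")
```

Building the data once, outside the timed functions, keeps list construction out of the measurement.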
Stack the two lists into a numpy array with vstack, take the mean along axis 0, and call tolist() to convert the result back to a plain list:
import numpy as np
a = [2,5,1,9]
b = [4,9,5,10]
np.mean(np.vstack([a,b]), axis=0).tolist()
[3.0, 7.0, 3.0, 9.5]
It seems you are looking for an element-wise mean. Setting axis=0 in np.mean is what you need.
import numpy as np
a = [2,5,1,9]
b = [4,9,5,10]
Create a list containing all your lists
a_b = [a,b]
a_b
[[2, 5, 1, 9], [4, 9, 5, 10]]
Use np.mean and set axis to 0
np.mean(a_b, axis=0)
array([3. , 7. , 3. , 9.5])
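To make the role of the axis argument concrete, here is a minimal sketch contrasting axis=0 with axis=1 on the same data. The variable names per_position and per_list are chosen here for illustration, and the lists are assumed to have equal length so that they form a regular 2-D array.

```python
import numpy as np

a = [2, 5, 1, 9]
b = [4, 9, 5, 10]
a_b = np.array([a, b])  # shape (2, 4): one row per list

# axis=0 averages down the rows: one mean per position
per_position = np.mean(a_b, axis=0).tolist()
# axis=1 averages across each row: one mean per list
per_list = np.mean(a_b, axis=1).tolist()

print(per_position)  # [3.0, 7.0, 3.0, 9.5]
print(per_list)      # [4.25, 7.0]
```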