Finding the max and min in a tuple of tuples
Question:
I’m new to Python and am having trouble finding the minimum and maximum values in a tuple of tuples. I need them to normalise my data. Basically, each row is a tuple of 13 numbers, each representing something. Each position in the row makes a column, and I need the max and min for each column. I tried indexing and iterating through the data but keep getting this error:
max_j = max(j)
TypeError: 'float' object is not iterable
Any help would be appreciated!
The code is below (assuming data_set_tup is a tuple of tuples, e.g. ((1,3,4,5,6,7,…),(5,6,7,3,6,73,2…)…(3,4,5,6,3,2,2…))). I also want to make a new list using the normalised values.
normal_list = []
for i in data_set_tup:
    for j in i[1:]: # first column doesn't need to be normalised
        max_j = max(j)
        min_j = min(j)
        normal_j = (j-min_j)/(max_j-min_j)
        normal_list.append(normal_j)
normal_tup = tuple(normal_list)
Answers:
You can transpose rows to columns and vice versa with zip(*...). (Use list(zip(*...)) in Python 3.)
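cols = list(zip(*data_set_tup))  # transpose rows into columns; list() is needed in Python 3
normal_cols = [cols[0]] # first column doesn't need to be normalised
for j in cols[1:]:
    max_j = max(j)
    min_j = min(j)
    # assumes max_j != min_j; a constant column would divide by zero
    normal_cols.append(tuple((k-min_j)/(max_j-min_j) for k in j))
normal_list = list(zip(*normal_cols))  # transpose back into rows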
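For context on the original error: in the question's loop, j is a single number taken from one row, so max(j) has nothing to iterate over. Working on whole columns avoids that. Below is a minimal, self-contained sketch of the same zip(*...) idea wrapped in a function. The name normalize_columns, the skip_first parameter, and the choice to map a constant column to 0.0 are assumptions made for illustration, not part of the answer above.

```python
def normalize_columns(rows, skip_first=True):
    """Min-max scale every column of a tuple of tuples to [0, 1]."""
    cols = list(zip(*rows))  # transpose: one tuple per column
    start = 1 if skip_first else 0
    scaled = list(cols[:start])  # leading column(s) copied through unchanged
    for col in cols[start:]:
        lo, hi = min(col), max(col)
        span = hi - lo
        if span == 0:
            # Assumption: a constant column maps to 0.0 (avoids ZeroDivisionError).
            scaled.append(tuple(0.0 for _ in col))
        else:
            scaled.append(tuple((x - lo) / span for x in col))
    return tuple(zip(*scaled))  # transpose back to rows


data_set_tup = ((10, 1.0, 4.0), (11, 3.0, 4.0), (12, 2.0, 4.0))
normal_tup = normalize_columns(data_set_tup)
print(normal_tup)
# ((10, 0.0, 0.0), (11, 1.0, 0.0), (12, 0.5, 0.0))
```

The result keeps the row layout of the input, so it can be used wherever data_set_tup was used before.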
This really sounds like a job for the third-party numpy module, or maybe the pandas module, depending on your needs.
Adding an extra dependency to your application should not be done lightly, but if you do a lot of work on matrix-like data, your code will likely be both faster and more readable if you use one of these modules throughout your application.
I do not recommend converting a list of lists to a numpy array and back again just to get this single result; it’s better to use the pure Python method from Janne’s answer. Also, seeing that you’re a Python beginner, numpy may be overkill right now. But I think your question deserves an answer pointing out that this is an option.
Here’s a step-by-step console illustration of how this would work in numpy:
>>> import numpy as np
>>> a = np.array([[1,3,4,5,6],[5,6,7,3,6],[3,4,5,6,3]], dtype=float)
>>> a
array([[1., 3., 4., 5., 6.],
       [5., 6., 7., 3., 6.],
       [3., 4., 5., 6., 3.]])
>>> col_min = np.min(a, axis=0)  # axis=0: minimum of each column
>>> col_min
array([1., 3., 4., 3., 3.])
>>> col_max = np.max(a, axis=0)
>>> col_max
array([5., 6., 7., 6., 6.])
>>> normalized = (a - col_min) / (col_max - col_min)
>>> normalized
array([[0.        , 0.        , 0.        , 0.66666667, 1.        ],
       [1.        , 1.        , 1.        , 0.        , 1.        ],
       [0.5       , 0.33333333, 0.33333333, 1.        , 0.        ]])
So in actual code:
import numpy as np
def normalize_by_column(a):
    # use names that don't shadow the built-in min() and max()
    col_min = np.min(a, axis=0)
    col_max = np.max(a, axis=0)
    return (a - col_min) / (col_max - col_min)
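A slightly extended sketch of the function above, in case it helps. It assumes the first column should be left untouched (as in the question) and that a constant column should come out as 0.0 rather than trigger a division by zero. The name normalize_columns_np and the skip_first parameter are made up for this example.

```python
import numpy as np


def normalize_columns_np(data, skip_first=True):
    """Min-max scale each column of a 2-D array-like to [0, 1]."""
    a = np.array(data, dtype=float)  # makes a copy, so the input is not modified
    start = 1 if skip_first else 0
    block = a[:, start:]  # the columns to be scaled
    col_min = block.min(axis=0)
    col_range = block.max(axis=0) - col_min
    # Assumption: a constant column (range 0) is mapped to 0.0
    # by dividing by 1 instead of by 0.
    safe_range = np.where(col_range == 0, 1.0, col_range)
    a[:, start:] = (block - col_min) / safe_range
    return a


rows = ((1, 3, 4), (5, 6, 4), (3, 4, 4))
result = normalize_columns_np(rows)
print(result)
```

Here the first column (1, 5, 3) is passed through, the second is scaled to 0, 1 and 1/3, and the constant third column becomes all zeros.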
We have nested_tuple = ((1, 2, 3), (4, 5, 6), (7, 8, 9)).
First of all we need to flatten it. The Pythonic way:
flat_tuple = [x for row in nested_tuple for x in row]
Output: [1, 2, 3, 4, 5, 6, 7, 8, 9] # it's a list
Convert it to a tuple with tuple(flat_tuple), get the max value with max(flat_tuple), and get the min value with min(flat_tuple). Note that these are the extremes over all elements, not per column.
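As a rough sketch of the difference between the two readings of the question: flattening gives one overall min and max, while transposing with zip(*...) gives one min and max per column. The variable names other than nested_tuple are illustrative, and itertools.chain.from_iterable is used here as an alternative to the list comprehension above.

```python
from itertools import chain

nested_tuple = ((1, 2, 3), (4, 5, 6), (7, 8, 9))

# Overall extremes: chain.from_iterable walks every element
# without building an intermediate list.
overall_min = min(chain.from_iterable(nested_tuple))
overall_max = max(chain.from_iterable(nested_tuple))

# Per-column extremes, which is what the question asks for.
col_mins = tuple(min(col) for col in zip(*nested_tuple))
col_maxs = tuple(max(col) for col in zip(*nested_tuple))

print(overall_min, overall_max)  # 1 9
print(col_mins, col_maxs)        # (1, 2, 3) (7, 8, 9)
```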