List of lists of mixed types to numpy array
Question:
I have data imported from a CSV file, stored as a list of lists:
data=[['1', ' 1.013831', ' 1.713332', ' 1.327002', ' 3.674446', ' 19.995361', ' 09:44:24', ' 2.659884'], ['2', ' 1.013862', ' 1.713164', ' 1.326761', ' 3.662183', ' 19.996973', ' 09:49:27', ' 2.668791'], ['3', ' 1.013817', ' 1.712084', ' 1.326192', ' 3.658077', ' 19.997608', ' 09:54:27', ' 2.671786']]
I want to get a numpy array so that I can actually use proper slicing (I don't want pandas or anything else, just a plain old numpy array with appropriate data types, not object).
So I tried the obvious:
arr=np.array(data,dtype='i4,f4,f4,f4,f4,f4,U8,f4')
only to get:
ValueError: invalid literal for int() with base 10: ' 1.013831'
This suggests that numpy treats rows as columns and columns as rows. What to do? Instead of data, I also tried passing
list(map(tuple,data))
which gives an error saying that the map object is not callable,
and I tried:
arr=np.asarray(tuple(map(tuple,data)),dtype='i4,f4,f4,f4,f4,f4,U8,f4')
giving
ValueError: could not assign tuple of length 20 to structure with 8 fields.
Note the original number of rows in my case is 20.
So how do I get data from a CSV file into a numpy array where I can specify the data type of each column?
Answers:
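With a structured dtype, numpy expects each record to be a tuple. Nested lists (or nested tuples all the way down) are read as extra array dimensions, so numpy tries to fit a single string, or the whole outer tuple, into all 8 fields at once. That is where both errors come from. Following the example in the documentation, a list of tuples works: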
np.array(list(map(tuple, data)), dtype='i4,f4,f4,f4,f4,f4,U8,f4')
Output:
array([(1, 1.013831, 1.713332, 1.327002, 3.674446, 19.995361, ' 09:44:2', 2.659884),
(2, 1.013862, 1.713164, 1.326761, 3.662183, 19.996973, ' 09:49:2', 2.668791),
(3, 1.013817, 1.712084, 1.326192, 3.658077, 19.997608, ' 09:54:2', 2.671786)],
dtype=[('f0', '<i4'), ('f1', '<f4'), ('f2', '<f4'), ('f3', '<f4'), ('f4', '<f4'), ('f5', '<f4'), ('f6', '<U8'), ('f7', '<f4')])
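Note that the time column in the output above is cut to ' 09:44:2': the leading space from the CSV counts toward the 8 characters of U8. A minimal sketch of one way around that, assuming it is acceptable to strip whitespace from every field before conversion (the name rows is just illustrative, and the default field names f0 to f7 come from the comma-separated dtype string):

```python
import numpy as np

data = [
    ['1', ' 1.013831', ' 1.713332', ' 1.327002', ' 3.674446', ' 19.995361', ' 09:44:24', ' 2.659884'],
    ['2', ' 1.013862', ' 1.713164', ' 1.326761', ' 3.662183', ' 19.996973', ' 09:49:27', ' 2.668791'],
    ['3', ' 1.013817', ' 1.712084', ' 1.326192', ' 3.658077', ' 19.997608', ' 09:54:27', ' 2.671786'],
]

# Strip the padding left over from the CSV and make each row a tuple,
# since a structured dtype takes one tuple per record.
rows = [tuple(field.strip() for field in row) for row in data]

arr = np.array(rows, dtype='i4,f4,f4,f4,f4,f4,U8,f4')

# Columns are accessed by field name, rows by ordinary slicing.
print(arr['f6'])      # time column, no longer truncated
print(arr['f1'][:2])  # first two values of the second column
print(arr[1:])        # last two records
```

Alternatively, keeping the strings as they are and widening the field to U9 should also avoid the truncation, at the cost of keeping the leading space in the stored value.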