Reading a CSV where one column is a list in Python

Question:

I have the following lines inside a csv file

[0 1 2 3 4 5],2145004.491585603,5.784000000019773e-05
[0 1 2 3 4 5],4986045.063898375,1.771400000016854e-05
[0 1 2 3 4 5],2185254.9265346257,1.468399999993153e-05

As you can see, the first entry is a list of integers. How can I read in the data, so that I end up with a list (or numpy.array), and 2 floats? I tried to use np.genfromtxt but I didn’t know how to process the resulting bytes properly.

If there is no elegant solution, is there a better way to save the array inside one column?

Asked By: Sebastian Becker


Answers:

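The first column is not a valid Python list literal (there are no commas between the items), so parse it by hand: strip the brackets, split on whitespace, and convert each item to int.

import csv

In [1]: with open('data.csv') as f: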
   ...:     reader = csv.reader(f)
   ...:     data = []
   ...:     for line in reader:
   ...:         lst_of_nums = [int(x) for x in line[0][1:-1].split()]
   ...:         data.append([lst_of_nums, float(line[1]), float(line[2])])
   ...:

In [2]: data
Out[2]:
[[[0, 1, 2, 3, 4, 5], 2145004.491585603, 5.784000000019773e-05],
 [[0, 1, 2, 3, 4, 5], 4986045.063898375, 1.771400000016854e-05],
 [[0, 1, 2, 3, 4, 5], 2185254.9265346257, 1.468399999993153e-05]]
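
Since the question also mentions numpy.array, the same parsing step can be sketched with NumPy. This is only an illustration: `parse_row` is a hypothetical helper name, and `io.StringIO` stands in for the real file so the snippet runs on its own.

```python
import csv
import io

import numpy as np

# Sample rows copied from the question; io.StringIO stands in for open('data.csv').
sample = io.StringIO(
    "[0 1 2 3 4 5],2145004.491585603,5.784000000019773e-05\n"
    "[0 1 2 3 4 5],4986045.063898375,1.771400000016854e-05\n"
)


def parse_row(row):
    """Hypothetical helper: turn one csv row into (int array, float, float)."""
    # Strip the surrounding brackets, split on whitespace, convert each item to int.
    ints = np.array([int(x) for x in row[0].strip("[]").split()], dtype=int)
    return ints, float(row[1]), float(row[2])


rows = [parse_row(row) for row in csv.reader(sample)]
print(rows[0][0])
```

Each element of `rows` is a tuple holding one integer array and two floats.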

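If the first column were a valid Python list literal (with commas between the items, and the field quoted so the csv reader keeps it in one piece), you could use ast.literal_eval: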

import csv
from ast import literal_eval

In [1]: with open('data.csv') as f:
   ...:     reader = csv.reader(f)
   ...:     data = []
   ...:     for line in reader:
   ...:         data.append([literal_eval(line[0]), float(line[1]), float(line[2])])
   ...:

In [2]: data
Out[2]:
[[[0, 1, 2, 3, 4, 5], 2145004.491585603, 5.784000000019773e-05],
 [[0, 1, 2, 3, 4, 5], 4986045.063898375, 1.771400000016854e-05],
 [[0, 1, 2, 3, 4, 5], 2185254.9265346257, 1.468399999993153e-05]]
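
On the question of a better way to save the array in one column: one possible approach, sketched here as an assumption rather than as part of the original answer, is to write the list as JSON text. csv.writer quotes the field because it contains commas, and json.loads (or literal_eval) restores it on reading. The `rows` data and the in-memory `buffer` are illustrative stand-ins for real data and a real file.

```python
import csv
import io
import json

# Illustrative data; for a NumPy array, call .tolist() before json.dumps.
rows = [([0, 1, 2, 3, 4, 5], 2145004.491585603, 5.784000000019773e-05)]

buffer = io.StringIO()  # stands in for a real file opened with newline=''
writer = csv.writer(buffer)
for ints, a, b in rows:
    # json.dumps gives "[0, 1, 2, 3, 4, 5]"; csv.writer quotes it because of the commas.
    writer.writerow([json.dumps(ints), a, b])

buffer.seek(0)
loaded = [
    (json.loads(r[0]), float(r[1]), float(r[2]))
    for r in csv.reader(buffer)
]
print(loaded)
```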
Answered By: Ron Serruya

You don't need a file to try this out. It is only loosely related to the question, but here is how I read this data with pandas.

Put the data in a string and read it through io.StringIO:

import io
import pandas as pd


csv_data_as_str = '''[0 1 2 3 4 5],2145004.491585603,5.784000000019773e-05
[0 1 2 3 4 5],4986045.063898375,1.771400000016854e-05
[0 1 2 3 4 5],2185254.9265346257,1.468399999993153e-05'''

df = pd.read_csv(io.StringIO(csv_data_as_str), sep=",", header=None)

print(df)

Output (the first column is still a string at this point, not a list):

               0             1         2
0  [0 1 2 3 4 5]  2.145004e+06  0.000058
1  [0 1 2 3 4 5]  4.986045e+06  0.000018
2  [0 1 2 3 4 5]  2.185255e+06  0.000015
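
To get actual lists out of column 0, the strings can be converted after loading. The following is a minimal sketch under the assumption that every cell looks like "[0 1 2 3 4 5]"; the lambda is illustrative and not part of the original answer.

```python
import io

import pandas as pd

csv_data_as_str = """[0 1 2 3 4 5],2145004.491585603,5.784000000019773e-05
[0 1 2 3 4 5],4986045.063898375,1.771400000016854e-05
"""

df = pd.read_csv(io.StringIO(csv_data_as_str), header=None)

# Column 0 holds text such as "[0 1 2 3 4 5]"; convert each cell to a list of ints.
df[0] = df[0].apply(lambda s: [int(x) for x in s.strip("[]").split()])

print(df[0].iloc[0])
```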
Answered By: Bhargav

If you have a string of the form "[0 1 2 3 4 5]" and the values are integers, then:

lv = '[0 1 2 3 4 5]'
mylist = list(map(int, lv[1:-1].split()))
print(mylist)

Output:

[0, 1, 2, 3, 4, 5]
Answered By: OldBill