Create a dictionary from a CSV file that has no header, using a list given as input for the dictionary keys
Question:
I’ve tried several different approaches, but none of them work.
For example, this call:
data = readData('energy_2.csv', ['M', 'V', 'H'])
Should return:
{'M': [150.270685, 150.062813, 150.090797, 150.050383, 150.065112, 149.968068, 149.915192, 150.060597, 149.798183, 150.074012, 150.052881, 149.9411, 150.01887, 149.924113, 149.906676], 'V': [12.977528, 12.595397, 13.489379, 13.802984, 12.841754, 12.651333, 13.346861, 11.646957, 11.92044, 12.43258, 12.695264, 12.583452, 12.592251, 12.903853, 12.53648], 'H': [75.638787, 75.329646, 74.502896, 74.24593, 74.056594, 75.484752, 74.883227, 76.901755, 75.238127, 76.996652, 74.006737, 75.1968, 73.863355, 75.000366, 76.025984]}
And mine returns:
{'M': ['150.270685;12.977528;75.638787', '150.062813;12.595397;75.329646', '150.090797;13.489379;74.502896', '150.050383;13.802984;74.24593', '150.065112;12.841754;74.056594', '149.968068;12.651333;75.484752', '149.915192;13.346861;74.883227', '150.060597;11.646957;76.901755', '149.798183;11.92044;75.238127', '150.074012;12.43258;76.996652', '150.052881;12.695264;74.006737', '149.9411;12.583452;75.1968', '150.01887;12.592251;73.863355', '149.924113;12.903853;75.000366', '149.906676;12.53648;76.025984']}
My code:
def readData(filename, labels):
    import pandas as pd
    df = pd.read_csv(filename, header=None)
    return {k: list(v) for k, v in zip(labels, df.values.T)}
CSV FILE:
150.270685;12.977528;75.638787
150.062813;12.595397;75.329646
150.090797;13.489379;74.502896
150.050383;13.802984;74.24593
150.065112;12.841754;74.056594
149.968068;12.651333;75.484752
149.915192;13.346861;74.883227
150.060597;11.646957;76.901755
149.798183;11.92044;75.238127
150.074012;12.43258;76.996652
150.052881;12.695264;74.006737
149.9411;12.583452;75.1968
150.01887;12.592251;73.863355
149.924113;12.903853;75.000366
149.906676;12.53648;76.025984
Answers:
The issue is the default separator (sep=','). Your file is separated by semicolons, so with the default each line is read as a single string column. Try setting sep=';' instead of using the default. You can also set names to the input list, labels.
For example:
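import pandas as pd

def readData(filename, labels):
    # sep=";" splits each line on semicolons; names=labels assigns the column names
    df = pd.read_csv(filename, header=None, sep=";", names=labels)
    # Only the 'M' column is returned here, as a plain list
    return list(df['M'])

data = readData('energy_2.csv', ['M', 'V', 'H'])
print(data)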
Output:
[150.270685, 150.062813, 150.090797, 150.050383, 150.065112, 149.968068, 149.915192, 150.060597, 149.798183, 150.074012, 150.052881, 149.9411, 150.01887, 149.924113, 149.906676]
Source: pandas.read_csv (docs)
Side note: the answer above assumes energy_2.csv looks similar to this:
150.270685;12.977528;75.638787
150.062813;12.595397;75.329646
150.090797;13.489379;74.502896
150.050383;13.802984;74.24593
150.065112;12.841754;74.056594
149.968068;12.651333;75.484752
149.915192;13.346861;74.883227
150.060597;11.646957;76.901755
149.798183;11.92044;75.238127
150.074012;12.43258;76.996652
150.052881;12.695264;74.006737
149.9411;12.583452;75.1968
150.01887;12.592251;73.863355
149.924113;12.903853;75.000366
149.906676;12.53648;76.025984
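The example above returns only the 'M' column. To get the full dictionary the question asks for, one possible sketch is to convert the whole DataFrame with DataFrame.to_dict(orient="list"). The in-memory sample passed through io.StringIO is an assumption made only so the snippet runs without energy_2.csv; pd.read_csv accepts a file path in the same position.

```python
import io

import pandas as pd


def readData(filename, labels):
    # sep=";" matches the semicolon-separated file; names=labels sets the column names
    df = pd.read_csv(filename, header=None, sep=";", names=labels)
    # orient="list" maps each column name to a plain list of its values
    return df.to_dict(orient="list")


# Two rows copied from the question, used here in place of the real file
sample = io.StringIO(
    "150.270685;12.977528;75.638787\n"
    "150.062813;12.595397;75.329646\n"
)
data = readData(sample, ["M", "V", "H"])
print(data)
```

With the real file, the call would stay readData('energy_2.csv', ['M', 'V', 'H']).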
You don’t need pandas. Just use csv.reader!
From your question, it appears your CSV file is separated by semicolons. You need to specify this, since the default separator is a comma.
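A small sketch of the difference, using one line from the question's file held in memory (the io.StringIO wrapper is only there so the snippet runs without the file):

```python
import csv
import io

line = "150.270685;12.977528;75.638787\n"

# Default delimiter is a comma, so the whole line comes back as one field
default_fields = next(csv.reader(io.StringIO(line)))

# With delimiter=";" the line is split into three fields (still strings)
semicolon_fields = next(csv.reader(io.StringIO(line), delimiter=";"))

print(default_fields)    # ['150.270685;12.977528;75.638787']
print(semicolon_fields)  # ['150.270685', '12.977528', '75.638787']
```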
First, create a dictionary with the keys from labels where the values are empty lists. Then, append the values in each row to the correct list. Since you want float values, remember to convert them to float before appending!
import csv

def readData(filename, labels):
    data = {lbl: [] for lbl in labels}
    with open(filename, "r") as f:
        reader = csv.reader(f, delimiter=";")
        for row in reader:
            for lbl, value in zip(labels, row):
                data[lbl].append(float(value))
    return data
which gives you the required data:
{'M': [150.270685,
150.062813,
150.090797,
150.050383,
150.065112,
149.968068,
149.915192,
150.060597,
149.798183,
150.074012,
150.052881,
149.9411,
150.01887,
149.924113,
149.906676],
'V': [12.977528,
12.595397,
13.489379,
13.802984,
12.841754,
12.651333,
13.346861,
11.646957,
11.92044,
12.43258,
12.695264,
12.583452,
12.592251,
12.903853,
12.53648],
'H': [75.638787,
75.329646,
74.502896,
74.24593,
74.056594,
75.484752,
74.883227,
76.901755,
75.238127,
76.996652,
74.006737,
75.1968,
73.863355,
75.000366,
76.025984]}
def readData(filename, labels):
    import csv
    with open(filename) as f:
        data = list(csv.reader(f, delimiter=';'))
    # "if d" skips empty rows; values are kept as strings
    return {labels[i]: [d[i] for d in data if d] for i in range(len(labels))}

headers = ['M', 'V', 'H']
print(readData('test.csv', headers))
# {'M': ['150.270685', '150.062813', '150.090797', '150.050383', '150.065112', '149.968068', '149.915192', '150.060597', '149.798183', '150.074012', '150.052881', '149.9411', '150.01887', '149.924113', '149.906676'], 'V': ['12.977528', '12.595397', '13.489379', '13.802984', '12.841754', '12.651333', '13.346861', '11.646957', '11.92044', '12.43258', '12.695264', '12.583452', '12.592251', '12.903853', '12.53648'], 'H': ['75.638787', '75.329646', '74.502896', '74.24593', '74.056594', '75.484752', '74.883227', '76.901755', '75.238127', '76.996652', '74.006737', '75.1968', '73.863355', '75.000366', '76.025984']}
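Note that the output above contains strings, while the question asks for floats. A possible variant, sketched here under the assumption that every non-empty row has one value per label, transposes the rows with zip(*rows) and converts each value with float(). The name read_columns and the in-memory sample are hypothetical and only for illustration.

```python
import csv
import io


def read_columns(source, labels, delimiter=";"):
    # source is an open file (or any iterable of lines); blank lines yield [] and are skipped
    rows = [row for row in csv.reader(source, delimiter=delimiter) if row]
    # zip(*rows) turns the list of rows into an iterator of columns
    columns = zip(*rows)
    return {label: [float(v) for v in col] for label, col in zip(labels, columns)}


# Two rows from the question with a blank line between them
sample = io.StringIO(
    "150.270685;12.977528;75.638787\n"
    "\n"
    "150.062813;12.595397;75.329646\n"
)
result = read_columns(sample, ["M", "V", "H"])
print(result)
# {'M': [150.270685, 150.062813], 'V': [12.977528, 12.595397], 'H': [75.638787, 75.329646]}
```

With a real file, the same function could be called inside a with open(filename, newline="") as f: block, passing f as source.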