Convert list into a pandas dataframe
Question:
I want to convert this list into a pandas dataframe
my_list = [1,2,3,4,5,6,7,8,9]
The dataframe should have 3 columns and 3 rows. I tried using
df = pd.DataFrame(my_list, columns = list("abc"))
but it doesn’t seem to be working for me.
Answers:
You need to convert the list to a NumPy array and then reshape it:
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array(my_list).reshape(3, 3), columns=list("abc"))
print(df)
   a  b  c
0  1  2  3
1  4  5  6
2  7  8  9
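As a small sketch of the same idea, the row count can be left to NumPy by passing -1 to reshape, so only the number of columns has to be known in advance. The helper name list_to_frame is hypothetical and not part of pandas or NumPy.

```python
import numpy as np
import pandas as pd


def list_to_frame(values, columns):
    """Reshape a flat list row by row into a DataFrame (hypothetical helper)."""
    # -1 lets NumPy infer the number of rows. reshape raises ValueError
    # if len(values) is not a multiple of len(columns).
    data = np.array(values).reshape(-1, len(columns))
    return pd.DataFrame(data, columns=columns)


my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9]
df = list_to_frame(my_list, list("abc"))
print(df)
```

If the list length is not a multiple of the number of columns, reshape fails with a ValueError instead of silently dropping values, which is usually the safer behaviour.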
If you came here looking for a way to convert a Python list into a pandas DataFrame, you may be facing one of the following problems:
- The simplest method is to pass the list to the DataFrame constructor. The data types are inferred if you don’t specify them.
df = pd.DataFrame(my_list)
# or
df = pd.DataFrame({'col1': my_list})
- If you have a nested list, the DataFrame constructor works as well, with each sub-list becoming a row. Make sure that the number of column names equals the length of the longest sub-list.
col_names = ['col1', 'col2']
df = pd.DataFrame(my_list, columns=col_names)
- If you want to convert a flat list into a dataframe row, then convert it into a nested list first:
df = pd.DataFrame([my_list])
- If you want to convert a nested list into a DataFrame where each sub-list is a DataFrame column, convert it into a dictionary and pass that to the DataFrame constructor. Make sure that the number of column names matches the length of the list, i.e. len(col_names) == len(my_list) must be True.
col_names = ['col1', 'col2', 'col3']
df = pd.DataFrame(dict(zip(col_names, my_list)))
- If you want to convert a flat list into a multi-column DataFrame (as in the OP), one way is to group the list into rows using the iter() and zip() functions and pass the result to the DataFrame constructor.
col_names = ['col1', 'col2', 'col3']
df = pd.DataFrame(zip(*[iter(my_list)]*len(col_names)), columns=col_names)
- If you want to convert a flat list into a multi-column DataFrame but consecutive values are column values (not row values as above), then group the values into columns instead and transpose the result.
col_names = ['col1', 'col2', 'col3']
df = pd.DataFrame(zip(*[iter(my_list)]*(len(my_list)//len(col_names))), index=col_names).T
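To make the difference between the last two cases concrete, here is a minimal sketch that builds both layouts from the same flat list. The variable names by_rows, by_cols, n_cols, n_rows and chunks are made up for illustration, and the column-wise variant uses a dictionary instead of index=...).T, which should give the same layout.

```python
import pandas as pd

my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9]
col_names = ["col1", "col2", "col3"]

# Row-wise: each consecutive group of n_cols values becomes one row.
# [iter(my_list)] * n_cols repeats the SAME iterator n_cols times, so zip
# pulls n_cols consecutive items into each tuple.
n_cols = len(col_names)
by_rows = pd.DataFrame(list(zip(*[iter(my_list)] * n_cols)), columns=col_names)

# Column-wise: each consecutive group of n_rows values becomes one column.
n_rows = len(my_list) // n_cols
chunks = list(zip(*[iter(my_list)] * n_rows))
by_cols = pd.DataFrame(dict(zip(col_names, chunks)))

print(by_rows)  # first row holds 1, 2, 3
print(by_cols)  # first column holds 1, 2, 3
```

Note that zip stops at the shortest input, so any trailing values are silently dropped when the list length is not a multiple of the group size.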