Convert list into a pandas dataframe

Question:

I want to convert this list into a pandas dataframe

my_list = [1,2,3,4,5,6,7,8,9]

The dataframe should have 3 columns and 3 rows. I tried using

df = pd.DataFrame(my_list, columns = list("abc"))

but it doesn’t seem to be working for me.

Asked By: Kay


Answers:

You need to convert the list to a NumPy array and then reshape it:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.array(my_list).reshape(3, 3), columns=list("abc"))
print(df)
   a  b  c
0  1  2  3
1  4  5  6
2  7  8  9
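
As a variation on the reshape approach, here is a minimal sketch that avoids hard-coding the number of rows. It assumes the list length is a multiple of the number of columns; the name col_names is introduced here for illustration only.

```python
import numpy as np
import pandas as pd

my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9]
col_names = list("abc")  # illustrative name, not from the original answer

# -1 asks NumPy to infer the number of rows from the list length.
# reshape raises ValueError if len(my_list) is not a multiple of len(col_names).
arr = np.array(my_list).reshape(-1, len(col_names))

df = pd.DataFrame(arr, columns=col_names)
print(df)
```

With this form the same line keeps working if the list grows to 12 or 15 values, as long as the length stays divisible by the column count.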
Answered By: jezrael

If you came here looking for a way to convert a Python list into a pandas DataFrame, you may be facing one of the following problems:

  1. The simplest method is to pass the list to the DataFrame constructor. The data types are inferred if you don’t specify them.
    df = pd.DataFrame(my_list)
    # or
    df = pd.DataFrame({'col1': my_list})
    


  2. If you have a nested list, the DataFrame constructor works here too. Make sure that the number of column names is equal to the length of the longest sub-list.
    col_names = ['col1', 'col2']
    df = pd.DataFrame(my_list, columns=col_names)
    


  3. If you want to convert a flat list into a single DataFrame row, wrap it in another list first:
    df = pd.DataFrame([my_list])
    


  4. If you want to convert a nested list into a DataFrame where each sub-list is a DataFrame column, convert it into a dictionary and pass that to the DataFrame constructor. Make sure that the number of column names matches the length of the list, i.e. len(col_names) == len(my_list) must be True.
    col_names = ['col1', 'col2', 'col3']
    df = pd.DataFrame(dict(zip(col_names, my_list)))
    


  5. If you want to convert a flat list into a multi-column DataFrame (as in the OP), one way is to split the list into row-sized chunks using the iter() and zip() functions and pass the result to the DataFrame constructor.
    col_names = ['col1', 'col2', 'col3']
    df = pd.DataFrame(zip(*[iter(my_list)]*len(col_names)), columns=col_names)
    


  6. If you want to convert a flat list into a multi-column DataFrame but consecutive values are column values (not row values as above), then chunk the list by the number of rows instead and transpose the result.
    col_names = ['col1', 'col2', 'col3']
    df = pd.DataFrame(zip(*[iter(my_list)]*(len(my_list)//len(col_names))), index=col_names).T
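
For the last two cases in the list above, a NumPy-based alternative may be easier to read. The following is a minimal sketch, assuming the flat my_list from the question and a length that divides evenly by the number of columns. The variable names by_row, by_col, zip_by_row and zip_by_col are illustrative only.

```python
import numpy as np
import pandas as pd

my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9]
col_names = ['col1', 'col2', 'col3']
n = len(col_names)

# Consecutive values fill a row (row-major order, NumPy's default).
by_row = pd.DataFrame(np.reshape(my_list, (-1, n)), columns=col_names)

# Consecutive values fill a column (column-major order, order='F').
by_col = pd.DataFrame(np.reshape(my_list, (-1, n), order='F'), columns=col_names)

# The zip-based versions from the list above, for comparison.
# list(...) materializes the zip iterator before it is handed to pandas.
zip_by_row = pd.DataFrame(list(zip(*[iter(my_list)] * n)), columns=col_names)
zip_by_col = pd.DataFrame(
    list(zip(*[iter(my_list)] * (len(my_list) // n))), index=col_names
).T

print(by_row)
print(by_col)
```

One difference to keep in mind: when the list length is not a multiple of the column count, np.reshape raises a ValueError, whereas the row-wise zip idiom silently drops the trailing values that do not fill a complete row.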
    


Answered By: cottontail