pandas – expand array to columns

Question:

I have a column in my pandas dataframe that contains arrays of numbers:

index | col
0 | [106.43477116337492, 6.762679391732501, 0.0, 9...
1 | [106.43477116337492, 6.58742122158056, 0.0, 9....
2 | [106.22211427793361, 7.303693743071101, 0.0, 9...
3 | [106.43477116337492, 7.955196940809838, 0.0, 9...
4 | [106.43477116337492, 6.400733170766536, 0.0, 9...

One value:

array([106.43477116,   6.76267939,   0.        ,   9.26076567,
        10.78086689, 106.63684122,   5.98865461,   0.        ,
         8.16789259,   9.94066589,   2.03606668,   0.        ,
         0.        ])

I need to expand the values in the array to separate columns so I will have:

col1 | col2 | col3 ...
106.434... | 6.7626.... | 0.0 ...
106.434... | 6.5874.... | 0.0 ...

How can I do this? I have already spent quite some time researching this, but the only thing I found is explode(), which is not what I want.

Asked By: romanzdk


Answers:

Maybe this will help

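import numpy as np
import pandas as pd

a = np.array([106.43477116, 6.76267939, 0. , 9.26076567, 10.78086689, 106.63684122, 5.98865461, 0. , 8.16789259, 9.94066589, 2.03606668, 0. , 0. ])
col = pd.Series([a, a, a])  # stand-in for the column whose cells are arrays
# Series of arrays -> list of arrays -> regular 2-D array (one row per cell)
arr = np.array(col.values.tolist())
df = pd.DataFrame(columns=['c' + str(i) for i in range(a.size)])
df[df.columns] = arr  # fill all columns at once from the 2-D array
print(df)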

Output:

           c0        c1   c2        c3         c4          c5        c6   c7  
0  106.434771  6.762679  0.0  9.260766  10.780867  106.636841  5.988655  0.0   
1  106.434771  6.762679  0.0  9.260766  10.780867  106.636841  5.988655  0.0   
2  106.434771  6.762679  0.0  9.260766  10.780867  106.636841  5.988655  0.0   

         c8        c9       c10  c11  c12  
0  8.167893  9.940666  2.036067  0.0  0.0  
1  8.167893  9.940666  2.036067  0.0  0.0  
2  8.167893  9.940666  2.036067  0.0  0.0  

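I’m effectively just turning your column into a 2-D np.ndarray and assigning it to df[df.columns].
The .values.tolist() part is essential: the column stores each array as a separate object, so converting it to a list of arrays first lets np.array build a regular 2-D array (one row per cell). This only works if all the arrays have the same length. It may not be the best way of doing it.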
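
As a possible alternative to assigning into an empty frame, the same idea can be sketched with np.stack and the DataFrame constructor. The sample data below is made up for illustration, and the sketch assumes every array in the column has the same length (np.stack raises an error otherwise).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: three cells, each holding an equal-length array.
a = np.array([106.43477116, 6.76267939, 0.0, 9.26076567])
col = pd.Series([a, a * 2, a * 3])

# The Series has dtype object (each cell is its own ndarray).
# np.stack joins the per-cell arrays into a single 2-D float array.
arr = np.stack(col.tolist())

# Build the frame in one step, reusing the Series index for the rows.
df = pd.DataFrame(
    arr,
    index=col.index,
    columns=[f"c{i}" for i in range(arr.shape[1])],
)
print(df)
```

Passing the 2-D array straight to the constructor avoids creating an empty frame first and gives float columns directly.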

Answered By: bottledmind

You can ‘spread’ the column of array values using to_list, then rebuild a dataframe from the result, adding a prefix if needed, and finally drop the original column.

Assuming the dataframe column holding the arrays is named 'array':

dfs = (df.join(pd.DataFrame(df['array'].to_list())
                 .add_prefix('array_'))
         .drop('array', axis=1))

>>> print(dfs)
      array_0   array_1  array_2  ...  array_10  array_11  array_12
0  106.434771  6.762679      0.0  ...  2.036067       0.0       0.0
1  106.434771  6.762679      0.0  ...  2.036067       0.0       0.0

[2 rows x 13 columns]
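
A self-contained sketch of the same join approach is given here for experimenting. The 'id' column, the index labels, and the values are hypothetical. Because join aligns rows on the index, this version also passes index=df.index to the expanded frame, which should keep the rows lined up when the dataframe does not have the default 0..n-1 index (for example after filtering).

```python
import numpy as np
import pandas as pd

# Hypothetical frame: an 'id' column plus an 'array' column of equal-length
# arrays, with a non-default index to show the alignment.
df = pd.DataFrame(
    {
        "id": ["a", "b"],
        "array": [np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0, 6.0])],
    },
    index=[10, 20],
)

# One new column per array position; reuse the original index for the rows.
expanded = pd.DataFrame(df["array"].to_list(), index=df.index).add_prefix("array_")

# Attach the new columns and drop the original array column.
dfs = df.join(expanded).drop(columns="array")
print(dfs)
```

Without index=df.index the expanded frame would be indexed 0 and 1, and the join against labels 10 and 20 would produce only missing values in the new columns.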

If you have a single column, do not want prefixes, and do not want to keep the original column, it is a bit simpler:

dfs = pd.DataFrame(df.iloc[:,0].to_list())

>>> print(dfs)
            0         1    2         3  ...         9        10   11   12
0  106.434771  6.762679  0.0  9.260766  ...  9.940666  2.036067  0.0  0.0
1  106.434771  6.762679  0.0  9.260766  ...  9.940666  2.036067  0.0  0.0

[2 rows x 13 columns]
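
If the col1, col2, ... names from the question are wanted, one option is to pass them to the constructor. This is a minimal sketch with made-up values; it assumes the column is named 'col' and that all arrays share the same length.

```python
import numpy as np
import pandas as pd

# Hypothetical single-column frame holding equal-length arrays.
df = pd.DataFrame(
    {"col": [np.array([106.4, 6.76, 0.0]), np.array([106.4, 6.59, 0.0])]}
)

values = df["col"].to_list()

# Name the new columns col1, col2, ... based on the length of the first array.
names = [f"col{i + 1}" for i in range(len(values[0]))]
dfs = pd.DataFrame(values, index=df.index, columns=names)
print(dfs)
```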

Answered By: hpchavaz