Unwrap numpy array stored in single cell in dataframe to rows

Question

I have a pandas dataframe in which I have stored 1D numpy arrays in single cells, so each full array occupies only one cell. There are also other columns with single values, although I don't think that should matter.

How can I, somewhat efficiently, unravel/unwrap the arrays and put their values into rows? I have several columns that I would like to unwrap like this.

I can access the individual numbers by using i as index

df['column1'].iloc[0][i]

but there must be some smarter way than looping through it all and inserting the values individually to unwrap all the values.

The dataframe looks as follows. Some of the arrays are horizontal and some are vertical.

    column1            column2           column3
0   [0.012, 0.07, ...] [1.23, 1.92, ...] [132, 542, ...]

The desired output is

   column1 column2 column3
0  0.012   1.23    132
1  0.07    1.92    542
2  ...     ...     ...

Edit:
After using Ian’s solution I get the following output where all the numbers have been put into rows but the "formatting" from the numpy array remains. How can I avoid that?

   column1  column2  column3
0  [0.012]  [1.23]   [132]
1  [0.07]   [1.92]   [542]
2  ...      ...      ...

Edit2: This is how the data looks in the original df. The reason is that it's a vertical numpy array. If I call .shape on one of the columns, it returns (1,).

   column1              column2              column3
0  [[0.012]\n [0.07]]   [[1.23]\n [1.92]]    [[132]\n [542]]

Asked By: Thomas Hemming Larsen

Answer 1

All arrays have the same `len`

# python 3.10.6
import numpy as np  # 1.23.4
import pandas as pd  # 1.5.1

# setup
df = pd.DataFrame.from_records(
    data=np.random.random((1, 3, 2)).round(3),
).add_prefix("column")

print(df)

          column0         column1         column2
0  [0.271, 0.544]  [0.579, 0.329]  [0.732, 0.305]

# explode each column's array into rows, then place the results side by side
out = pd.concat([df[col].explode(ignore_index=True) for col in df],
                axis="columns")

print(out)

  column0 column1 column2
0   0.271   0.579   0.732
1   0.544   0.329   0.305
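
When every array in a row has the same length, `DataFrame.explode` can also take a list of columns directly (pandas 1.3 or newer). The following is only a sketch under that assumption; the frame and its values are hypothetical and modelled on the example in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical one-row frame; each cell holds a 1D array of the same length.
df = pd.DataFrame({
    "column1": [np.array([0.012, 0.07])],
    "column2": [np.array([1.23, 1.92])],
    "column3": [np.array([132, 542])],
})

# Multi-column explode needs pandas >= 1.3 and raises a ValueError
# if the arrays in a row do not have matching lengths.
out = df.explode(list(df.columns), ignore_index=True)

print(out)
```

The exploded columns keep the object dtype, so a conversion such as `out.astype(float)` may be needed before doing numeric work.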

Some arrays have different `len`

# setup
df = pd.concat([pd.DataFrame.from_records(np.random.random((1, 1, n)).round(3),
                                          columns=[f"column{c}"]) for c, n in enumerate(range(2, 5))],
               axis="columns")

print(df)

          column0                column1                       column2
0  [0.111, 0.691]  [0.215, 0.981, 0.605]  [0.696, 0.121, 0.531, 0.835]

# same approach; concat aligns on the index, so shorter columns are padded with NaN
out = pd.concat([df[col].explode(ignore_index=True) for col in df],
                axis="columns")

print(out)

  column0 column1 column2
0   0.111   0.215   0.696
1   0.691   0.981   0.121
2     NaN   0.605   0.531
3     NaN     NaN   0.835
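
For the vertical arrays described in Edit2 of the question, the brackets likely remain because `explode` only unpacks the outer dimension, leaving one-element arrays in each row. One possible workaround, sketched here under the assumption that each cell holds an array of shape (n, 1), is to flatten every cell with `np.ravel` before exploding. The frame below is hypothetical and only mimics the layout shown in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical frame mimicking Edit2: each cell holds a column vector of shape (n, 1).
df = pd.DataFrame({
    "column1": [np.array([[0.012], [0.07]])],
    "column2": [np.array([[1.23], [1.92]])],
    "column3": [np.array([[132], [542]])],
})

# Flatten each cell to 1D first, so explode yields scalars
# instead of one-element arrays such as [0.012].
out = pd.concat(
    [df[col].map(np.ravel).explode(ignore_index=True) for col in df],
    axis="columns",
)

# explode returns object columns; cast to a numeric dtype if desired.
out = out.astype(float)

print(out)
```

`np.ravel` leaves arrays that are already 1D unchanged, so the same expression should also cope with a mix of horizontal and vertical arrays.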

Answered By: Ian Thompson
