Explode raises ValueError: columns must have matching element counts

Question:

I have the following dataframe:

list1 = [1, 6, 7, [46, 56, 49], 45, [15, 10, 12]]
list2 = [[49, 57, 45], 3, 7, 8, [16, 19, 12], 41]

data = {'A':list1,
        'B': list2}
data = pd.DataFrame(data)

I can explode the dataframe using this piece of code:

data.explode('A').explode('B')

but when I run this one to do the same operation, a ValueError is raised:

data.explode(['A', 'B'])


ValueError                                Traceback (most recent call last)
<ipython-input-97-efafc6c7cbfa> in <module>
      5         'B': list2}
      6 data = pd.DataFrame(data)
----> 7 data.explode(['A', 'B'])

~\AppData\Roaming\Python\Python38\site-packages\pandas\core\frame.py in explode(self, column, ignore_index)
   9033             for c in columns[1:]:
   9034                 if not all(counts0 == self[c].apply(mylen)):
-> 9035                     raise ValueError("columns must have matching element counts")
   9036             result = DataFrame({c: df[c].explode() for c in columns})
   9037         result = df.drop(columns, axis=1).join(result)

ValueError: columns must have matching element counts

Can anyone explain why?

Asked By: ali bakhtiari


Answers:

df.explode(["A", "B"]) and df.explode("A").explode("B") do not do the same thing. It seems that you are aiming to get all the combinations, whereas the multi-column explode addresses a different scenario: one where the lists in your columns are paired element by element, so every row must hold the same number of elements in each exploded column. You can see the rationale in the original GitHub feature request. This behaviour seems to have been chosen to avoid duplicating values in one of the columns.

In the feature request there is a link to a GitHub gist/notebook that explores how explode could be implemented, but it does not seem to handle exploding lists of mismatched lengths in parallel.
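
To make the difference concrete, here is a minimal sketch (assuming pandas 1.3 or later, where explode accepts a list of columns). The helper n_elements and the small paired frame are made up for illustration; the helper only approximates the length check visible in the traceback.

```python
import pandas as pd

list1 = [1, 6, 7, [46, 56, 49], 45, [15, 10, 12]]
list2 = [[49, 57, 45], 3, 7, 8, [16, 19, 12], 41]
data = pd.DataFrame({'A': list1, 'B': list2})

def n_elements(value):
    # Hypothetical helper: a scalar counts as one element, a list by its length.
    return len(value) if isinstance(value, list) else 1

# Per-row element counts for each column.
counts = pd.DataFrame({'A': data['A'].map(n_elements),
                       'B': data['B'].map(n_elements)})

# Rows where the counts differ are the ones the multi-column explode rejects.
mismatched_rows = counts.index[counts['A'] != counts['B']].tolist()

# Chained explode: each row expands to every (A, B) combination.
combos = data.explode('A').explode('B')

# Multi-column explode on paired lists of equal length: elements are
# matched position by position instead of being combined.
paired = pd.DataFrame({'A': [[1, 2], [3]], 'B': [['x', 'y'], ['z']]})
zipped = paired.explode(['A', 'B'])

print(mismatched_rows)
print(len(combos))
print(zipped)
```

In the question's data, rows 0, 3, 4 and 5 pair a scalar with a three-element list, which is why data.explode(['A', 'B']) raises, while the chained version simply repeats the scalar for each list element.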

Answered By: Alex

Try this if it works in your case.

import numpy as np
import pandas as pd

# Flatten each column independently, then pair the flattened values by position
data = pd.DataFrame({'A': np.hstack(list1), 'B': np.hstack(list2)})
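
One caveat worth checking before relying on this: np.hstack flattens each column on its own, so the result is only a valid frame when both flattened columns end up with the same total length, and the values are paired by flattened position rather than by original row. A small sketch with the question's data (the names flat and second_row are only for illustration):

```python
import numpy as np
import pandas as pd

list1 = [1, 6, 7, [46, 56, 49], 45, [15, 10, 12]]
list2 = [[49, 57, 45], 3, 7, 8, [16, 19, 12], 41]

# Each column is flattened independently; both happen to contain 10 values.
flat = pd.DataFrame({'A': np.hstack(list1), 'B': np.hstack(list2)})

# The second flattened row pairs A from original row 1 with B from original row 0.
second_row = flat.iloc[1].tolist()

print(len(flat))
print(second_row)
```

Here 6 (originally in row 1 of A) ends up next to 57 (originally in row 0 of B), so this approach fits only if the original row alignment does not matter.
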
Answered By: Akash Kumar