How to fix "columns must have matching element counts" when using explode in Python

Question:

I’m attempting to split the string contents of certain columns into separate rows. The code I’m using raises an error stating that the columns do not have matching element counts. How can I fix this?

Code:

import glob
from pathlib import Path

import pandas as pd

review_path = r'data/base_data'
review_files = glob.glob(review_path + "/test_data.csv")

review_df_list = []
for review_file in review_files:
    df = pd.read_csv(review_file)
    print(df.head())
    # Collect every business entry found in a row into a list
    df["business"] = (df["business"].str.extractall(r"(?:[\s,]*)(.*?(?:Unspecified|employees|Self-employed))").groupby(level=0).agg(list))
    # Split the comma-separated names into a list
    df["name"] = df["name"].str.split(r"\s*,\s*")
    print(df.explode(["name", "business"]))
    outPutPath = Path('data/base_data/test_data.csv')
    df.to_csv(outPutPath, index=False)

Error message:

Traceback (most recent call last):
  File "x", line 384, in <module>
    print(df.explode(["name", "business"]))
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pandas/core/frame.py", line 8255, in explode
    raise ValueError("columns must have matching element counts")
ValueError: columns must have matching element counts
Asked By: Sizzler


Answers:

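This happens because, in at least one row, the lists in your name and business columns have different lengths. When you pass several columns to explode, each row must hold the same number of elements in every one of those columns.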

For instance, let’s look at business, and assume that the content is:

[[1,2],
 [1,2],
 [1,2,3],
 [1,2]]

If each row of name holds two elements, the third row of business, with its extra value (3), no longer matches and causes the error.
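To find the offending rows, you can compare the per-row list lengths of the two columns. The sketch below uses made-up data shaped like the example above (the values and the pad helper are hypothetical, not taken from your file) and assumes pandas 1.3 or later, where explode accepts a list of columns. Padding the shorter list with None is only one possible fix; depending on your data, you may prefer to correct the regex or drop the mismatched rows.

```python
import pandas as pd

# Hypothetical data: row 2 has two names but three businesses.
df = pd.DataFrame({
    "name": [["a", "b"], ["c", "d"], ["e", "f"], ["g", "h"]],
    "business": [[1, 2], [1, 2], [1, 2, 3], [1, 2]],
})

# Count the elements per row in each column and flag rows where they differ.
name_len = df["name"].str.len()
business_len = df["business"].str.len()
mismatched = df.index[name_len != business_len].tolist()
print(mismatched)  # [2]


def pad(values, size):
    """Return a copy of values extended with None up to the given size."""
    return list(values) + [None] * (int(size) - len(values))


# Pad the shorter list in each row so both columns have matching counts.
target = pd.concat([name_len, business_len], axis=1).max(axis=1)
df["name"] = [pad(v, n) for v, n in zip(df["name"], target)]
df["business"] = [pad(v, n) for v, n in zip(df["business"], target)]

exploded = df.explode(["name", "business"])
print(len(exploded))  # 9
```

Note that this assumes every cell is a list. Rows where extractall found no match end up as NaN rather than a list, so they would need to be handled separately before padding.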

Answered By: Minions