Concat List of DataFrames row-wise – throws pandas.errors.InvalidIndexError:

Question:

I am trying to concatenate a list df_l of ~200 DataFrames, all of which have the same number of columns and the same column names.

When I try to run:

df = pd.concat(df_l, axis=0)

it throws the error:

pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects

Following this post, I tried resetting the index of each DataFrame, but I still get the same error.

new_l = [df.reset_index(drop=True) for df in df_l]
pd.concat(new_l, axis=0)

Also, pd.concat arguments like ignore_index=True did not help in any combination. Any advice?

Running on Python 3.8 and pandas 1.4.2.

Asked By: Maeaex1

Answers:

I think the problem is duplicated column names. Here is a solution that deduplicates them, applied with DataFrame.pipe:

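# https://stackoverflow.com/a/44957247/2901002
def df_column_uniquify(df):
    """Rename repeated column names to name_1, name_2, ... (modifies df in place)."""
    df_columns = df.columns
    new_columns = []
    for item in df_columns:
        counter = 0
        newitem = item
        # Bump the suffix until the name is not already taken.
        while newitem in new_columns:
            counter += 1
            newitem = "{}_{}".format(item, counter)
        new_columns.append(newitem)
    # Assigning to df.columns renames the columns of the frame passed in.
    df.columns = new_columns
    return df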

new_l = [df.pipe(df_column_uniquify).reset_index(drop=True) for df in df_l]
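
To check whether duplicated column names are actually present before renaming anything, a quick diagnostic along these lines may help. It is only a sketch: the two toy frames are hypothetical stand-ins for the real df_l, and dupes is a name chosen for illustration.

```python
import pandas as pd

# Hypothetical toy frames standing in for df_l; the second one has "a" twice.
df_l = [
    pd.DataFrame([[1, 2]], columns=["a", "b"]),
    pd.DataFrame([[3, 4, 5]], columns=["a", "a", "b"]),
]

# Map each offending frame's position in the list to its repeated column names.
dupes = {
    i: df.columns[df.columns.duplicated()].tolist()
    for i, df in enumerate(df_l)
    if not df.columns.is_unique
}
print(dupes)
```

An empty dict would suggest the duplicated-columns explanation does not apply to that list.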
Answered By: jezrael
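
For comparison, a more compact variant of the same idea might look like the sketch below. This is an assumption-laden alternative, not the code from the answer above: dedupe_columns is a hypothetical helper, the toy frames are made up, and unlike the loop version it does not guard against a generated name such as a_1 colliding with a column that already exists. It returns a copy, so the frames in df_l are left unchanged.

```python
import pandas as pd


def dedupe_columns(df):
    """Return a copy of df whose repeated column names get _1, _2, ... suffixes."""
    cols = pd.Series(df.columns)
    # cumcount() numbers each occurrence of a name: 0 for the first, 1 for the next, ...
    occurrence = cols.groupby(cols).cumcount()
    out = df.copy()
    out.columns = [c if n == 0 else f"{c}_{n}" for c, n in zip(cols, occurrence)]
    return out


# Hypothetical toy frames standing in for df_l; the second one has "a" twice.
df_l = [
    pd.DataFrame([[1, 2]], columns=["a", "b"]),
    pd.DataFrame([[3, 4, 5]], columns=["a", "a", "b"]),
]

# With unique column names in every frame, the row-wise concat can align columns.
result = pd.concat([dedupe_columns(df) for df in df_l], axis=0, ignore_index=True)
print(result.columns.tolist())
```

Columns that exist in only some frames (here a_1) are filled with NaN in the rows that lack them.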