Pandas merge giving error "Buffer has wrong number of dimensions (expected 1, got 2)"

Question:

I am trying to do a pandas merge and get the error in the title when I run it. I am matching on three columns, whereas just before this I do a similar merge on only two columns and it works fine.

df = pd.merge(df, c, how="left",
        left_on=["section_term_ps_id", "section_school_id", "state"],
        right_on=["term_ps_id", "term_school_id", "state"])

Columns for the two DataFrames:

df:

Index([u'section_ps_id', u'section_school_id', u'section_course_number',
       u'section_term_ps_id', u'section_staff_ps_id', u'section_number',
       u'section_expression', u'section_grade_level', u'state',
       u'sections_id', u'course_ps_id', u'course_school_id',
       u'course_number', u'course_schd_dept', u'courses_id',
       u'school_ps_id', u'course_school_id', u'school_name',
       u'school_abbr', u'school_low_grade', u'school_high_grade',
       u'school_alt_school_number', u'school_state', u'school_phone',
       u'school_fax', u'school_principal', u'school_principal_phone',
       u'school_principal_email', u'school_asst_principal',
       u'school_asst_principal_phone', u'school_asst_principal_email'],
      dtype='object')

c:

Index([u'term_ps_id', u'term_school_id', u'term_portion',
       u'term_start_date', u'term_end_date', u'term_abbreviation',
       u'term_name', u'state', u'terms_id', u'school_ps_id',
       u'term_school_id', u'school_name', u'school_abbr',
       u'school_low_grade', u'school_high_grade',
       u'school_alt_school_number', u'school_state', u'school_phone',
       u'school_fax', u'school_principal', u'school_principal_phone',
       u'school_principal_email', u'school_asst_principal',
       u'school_asst_principal_phone', u'school_asst_principal_email'],
      dtype='object')

Is it possible to merge on three columns like this? Is there anything wrong with the merge call here?

Asked By: lathomas64

Answers:

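As mentioned in the comments, you have a duplicate column: term_school_id appears twice in c's columns, and it is one of the keys in right_on. (course_school_id is likewise listed twice in df.)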
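
A quick way to check for this before merging is to ask the column index which labels repeat. The sketch below uses a hypothetical miniature of c (the values are made up; only the column names come from the question). It also shows that selecting a duplicated label yields a 2-D object, which is the likely source of the "expected 1, got 2" in the error message.

```python
import pandas as pd

# Hypothetical miniature of `c`: the key column term_school_id appears twice.
c = pd.DataFrame(
    [[1, 10, "NY", 10], [2, 20, "CA", 20]],
    columns=["term_ps_id", "term_school_id", "state", "term_school_id"],
)

# Index.duplicated() marks every repeat of a label after its first occurrence.
dupes = c.columns[c.columns.duplicated()].tolist()
print(dupes)

# Selecting a duplicated label returns a DataFrame (2-D), not a Series (1-D).
print(c["term_school_id"].ndim)
```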

Answered By: JD Long

To address the duplicate columns, you can either drop the duplicates using duplicated, with something like:

c = c.loc[:, ~c.columns.duplicated(keep='first')]

or make every column name unique by appending its position to the name in one of the DataFrames, for example:

c.columns = [c.columns[i] + str(i) for i in range(len(c.columns))]

Keep in mind that in the second case the key columns are renamed too, so you must adjust the right_on names in the merge call accordingly.
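
As a rough end-to-end illustration of the first option, the sketch below drops the repeated label and then runs the three-column merge from the question. The frames are hypothetical stand-ins reduced to a few columns, with made-up values, and it assumes the duplicated column holds the same data as its first occurrence.

```python
import pandas as pd

# Hypothetical stand-ins for the question's frames, reduced to a few columns.
df = pd.DataFrame({
    "section_term_ps_id": [1, 2],
    "section_school_id": [10, 20],
    "state": ["NY", "CA"],
})
c = pd.DataFrame(
    [[1, 10, "NY", "Fall", 10], [2, 20, "CA", "Spring", 20]],
    columns=["term_ps_id", "term_school_id", "state", "term_name",
             "term_school_id"],
)

# Keep only the first occurrence of each column label.
c = c.loc[:, ~c.columns.duplicated(keep="first")]

# With unique labels on both sides, the three-key merge goes through.
merged = pd.merge(
    df, c, how="left",
    left_on=["section_term_ps_id", "section_school_id", "state"],
    right_on=["term_ps_id", "term_school_id", "state"],
)
print(merged)
```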

Answered By: 2Obe

This will remove the duplicated columns from the DataFrame, keeping the first occurrence of each name:

df = df.loc[:, ~df.columns.duplicated()]

Answered By: Shivpe_R

If there are no duplicate columns, then:

Upgrade pandas and make sure the version is above 1.1.0.
Older versions of pandas seem to have a problem broadcasting values. I had the same problem locally, but the same code worked in Google Colab, which is how I found out it was a problem with the older version: Colab ships a recent version of pandas.
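
Before upgrading, it may help to confirm which version is actually installed in the environment that raises the error. This is only a minimal sketch: the 1.1 cutoff is the threshold suggested in this answer, not something taken from the pandas release notes.

```python
import pandas as pd

# Parse the major and minor parts of the installed version, e.g. "1.3.2".
major, minor = (int(part) for part in pd.__version__.split(".")[:2])
print(pd.__version__)

# Roughly the threshold from this answer: flag anything older than 1.1.
if (major, minor) < (1, 1):
    print("pandas is older than 1.1; consider upgrading")
```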

To upgrade pandas, use:

pip install --upgrade pandas

Answered By: Aman Saini

I faced a similar issue. The question is old, but this may help someone.
Our Python code ran fine with pandas 0.25, but when it was deployed to a pod with pandas 1.3.2 it started throwing the error below:

ERROR - Error in line 34 ValueError Buffer has wrong number of dimensions (expected 1, got 2)\nTraceback (most recent call last)

Downgrading pandas to 0.25 resolves the issue, as does updating the code to work with the newer version.

Answered By: GoSharad123