Merge multi-indexed with single-indexed data frames in pandas

Question:

I have two dataframes. df1 is multi-indexed:

                value
first second    
a     x         0.471780
      y         0.774908
      z         0.563634
b     x         -0.353756
      y         0.368062
      z         -1.721840

and df2:

      value
first   
a     10
b     20

How can I merge the two data frames on only one level of the MultiIndex, in this case the ‘first’ level? The desired output would be:

                value1      value2
first second    
a     x         0.471780    10
      y         0.774908    10
      z         0.563634    10
b     x         -0.353756   20
      y         0.368062    20
      z         -1.721840   20
Asked By: user1642513


Answers:

You could use get_level_values:

firsts = df1.index.get_level_values('first')
df1['value2'] = df2.loc[firsts].values

Note: this is almost a join (except that df1 has a MultiIndex), so there may be a neater way to express it.

In an example (similar to what you have):

import pandas as pd

df1 = pd.DataFrame([['a', 'x', 0.123], ['a','x', 0.234],
                    ['a', 'y', 0.451], ['b', 'x', 0.453]],
                   columns=['first', 'second', 'value1']
                   ).set_index(['first', 'second'])
df2 = pd.DataFrame([['a', 10],['b', 20]],
                   columns=['first', 'value']).set_index(['first'])

firsts = df1.index.get_level_values('first')
df1['value2'] = df2.loc[firsts].values

In [5]: df1
Out[5]: 
              value1  value2
first second                
a     x        0.123      10
      x        0.234      10
      y        0.451      10
b     x        0.453      20
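
A variant of the lookup above, as a minimal sketch: mapping the level values through df2's column with Index.map gives a one-dimensional result and, as far as I know, yields NaN rather than raising a KeyError when a key from df1 is missing in df2. The extra 'c' row is an assumption added purely to illustrate that case; it is not part of the original example.

```python
import pandas as pd

df1 = pd.DataFrame([['a', 'x', 0.123], ['a', 'x', 0.234],
                    ['a', 'y', 0.451], ['b', 'x', 0.453],
                    ['c', 'x', 0.999]],  # hypothetical row with no match in df2
                   columns=['first', 'second', 'value1']
                   ).set_index(['first', 'second'])
df2 = pd.DataFrame([['a', 10], ['b', 20]],
                   columns=['first', 'value']).set_index(['first'])

firsts = df1.index.get_level_values('first')
# Look each 'first' label up in df2['value']; unmatched labels become NaN.
df1['value2'] = firsts.map(df2['value']).to_numpy()
print(df1)
```

Because of the NaN, value2 comes back as a float column here; with full coverage it should stay integer.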
Answered By: Andy Hayden

The .ix syntax is a powerful shortcut for reindexing (it has since been removed from pandas; .loc is its label-based replacement), but in this case you are not doing any combined row/column reindexing, so this can be done a bit more elegantly (to my taste) with a plain reindex:

Setup, taken from Andy Hayden's answer:

df1 = pd.DataFrame([['a', 'x', 0.123], ['a','x', 0.234],
                    ['a', 'y', 0.451], ['b', 'x', 0.453]],
                   columns=['first', 'second', 'value1']
                   ).set_index(['first', 'second'])
df2 = pd.DataFrame([['a', 10],['b', 20]],
                   columns=['first', 'value']).set_index(['first'])

In IPython this looks like:

In [4]: df1
Out[4]: 
              value1
first second        
a     x        0.123
      x        0.234
      y        0.451
b     x        0.453

In [5]: df2
Out[5]: 
       value
first       
a         10
b         20

In [7]: df2.reindex(df1.index, level=0)
Out[7]: 
              value
first second       
a     x          10
      x          10
      y          10
b     x          20

In [8]: df1['value2'] = df2.reindex(df1.index, level=0)

In [9]: df1
Out[9]: 
              value1  value2
first second                
a     x        0.123      10
      x        0.234      10
      y        0.451      10
b     x        0.453      20

A mnemonic for which level to pass to reindex:
it is the level that the smaller index already covers in the bigger index.
So, in this case, df2's index already covers level 0 of df1.index.
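
The mnemonic can be checked with a small sketch that passes the level by name instead of by position. It assumes the level is named 'first' as in the setup above; the data is a cut-down illustration with unique index entries, not the exact frames from the question.

```python
import pandas as pd

df1 = pd.DataFrame([['a', 'x', 0.123], ['a', 'y', 0.451], ['b', 'x', 0.453]],
                   columns=['first', 'second', 'value1']
                   ).set_index(['first', 'second'])
df2 = pd.DataFrame([['a', 10], ['b', 20]],
                   columns=['first', 'value']).set_index(['first'])

# df2's index covers the 'first' level of df1.index, so that is the level to name.
broadcast = df2.reindex(df1.index, level='first')
df1['value2'] = broadcast['value']
print(df1)
```

Naming the level is a matter of taste; level=0 should behave the same here, but the name keeps working if the level order ever changes.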

Answered By: K.-Michael Aye

According to the documentation, as of pandas 0.14 you can simply join single-indexed and multi-indexed dataframes; the join matches on the common index name. The how argument works as expected with 'inner' and 'outer', though in my testing it appeared to be reversed for 'left' and 'right' (possibly a bug).

import pandas as pd

df1 = pd.DataFrame([['a', 'x', 0.471780], ['a','y', 0.774908], ['a', 'z', 0.563634],
                    ['b', 'x', -0.353756], ['b', 'y', 0.368062], ['b', 'z', -1.721840],
                    ['c', 'x', 1], ['c', 'y', 2], ['c', 'z', 3],
                   ],
                   columns=['first', 'second', 'value1']
                   ).set_index(['first', 'second'])
df2 = pd.DataFrame([['a', 10], ['b', 20]],
                   columns=['first', 'value2']).set_index(['first'])

print(df1.join(df2, how='inner'))
                value1  value2
first second                  
a     x       0.471780      10
      y       0.774908      10
      z       0.563634      10
b     x      -0.353756      20
      y       0.368062      20
      z      -1.721840      20
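
Whether 'left' and 'right' still look reversed may depend on the pandas version. As a quick check, this sketch rebuilds the same frames and compares the two modes; on recent releases I would expect how='left' to keep every row of df1, leaving NaN in value2 for the 'c' group that df2 does not cover. The variable names inner and left are introduced here only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame([['a', 'x', 0.471780], ['a', 'y', 0.774908], ['a', 'z', 0.563634],
                    ['b', 'x', -0.353756], ['b', 'y', 0.368062], ['b', 'z', -1.721840],
                    ['c', 'x', 1], ['c', 'y', 2], ['c', 'z', 3]],
                   columns=['first', 'second', 'value1']
                   ).set_index(['first', 'second'])
df2 = pd.DataFrame([['a', 10], ['b', 20]],
                   columns=['first', 'value2']).set_index(['first'])

inner = df1.join(df2, how='inner')  # only the 'a' and 'b' groups survive
left = df1.join(df2, how='left')    # all of df1; unmatched rows get NaN

print(len(inner), len(left))
print(left.loc['c'])
```

If the row counts differ from what you expect on your installation, that would be a hint that the version in use behaves as described in the answer.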
Answered By: Matt M