pandas DataFrame diagonal

Question:

What is an efficient way to get the diagonal of a square DataFrame? I would expect the result to be a Series with a two-level MultiIndex: the first level being the index of the DataFrame and the second level being its columns.

Setup

import pandas as pd
import numpy as np

np.random.seed([3, 1415])
df = pd.DataFrame(np.random.rand(3, 3) * 5,
                  columns = list('abc'),
                  index = list('ABC'),
                  dtype=np.int64
                 )

I want to see this:

print(df.stack().loc[[('A', 'a'), ('B', 'b'), ('C', 'c')]])

A  a    2
B  b    2
C  c    3
Asked By: piRSquared

Answers:

You could do something like this:

In [16]:
midx = pd.MultiIndex.from_tuples(list(zip(df.index,df.columns)))
pd.DataFrame(data=np.diag(df), index=midx)

Out[16]:
     0
A a  2
B b  2
C c  3

np.diag returns the diagonal values as a NumPy array. You can then build the MultiIndex by zipping the index and the columns, and pass it as the index to the DataFrame constructor.
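The two steps described above can be checked against the stack-based lookup from the question. This is only a sketch: the fixed values in df are an assumption made for reproducibility (the question uses random data), and the result is built as a Series rather than a DataFrame to match what the question asks for.

```python
import numpy as np
import pandas as pd

# Fixed values (an assumption; the question generates random data).
df = pd.DataFrame([[2, 0, 1], [4, 2, 3], [1, 0, 3]],
                  index=list('ABC'), columns=list('abc'))

# Step 1: the diagonal values as a NumPy array.
values = np.diag(df)

# Step 2: pair each row label with the column label at the same position.
midx = pd.MultiIndex.from_tuples(list(zip(df.index, df.columns)))

diag = pd.Series(values, index=midx)

# The stack-based lookup from the question, for comparison.
expected = df.stack().loc[[('A', 'a'), ('B', 'b'), ('C', 'c')]]
print(diag.tolist() == expected.tolist())
```

Note that the pairing is positional: the row and column labels do not need to match, only their order.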

Actually, the MultiIndex construction doesn’t need to be that complicated. Passing a list containing the index and the columns is enough:

In [18]:
pd.DataFrame(np.diag(df), index=[df.index, df.columns])

Out[18]:
     0
A a  2
B b  2
C c  3

But johnchase’s answer is neater.

Answered By: EdChum

If you don’t mind using numpy, you could use numpy.diag:

pd.Series(np.diag(df), index=[df.index, df.columns])

A  a    2
B  b    2
C  c    3
dtype: int64
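
If the one-liner is needed in several places, it could be wrapped in a small function. The following is a minimal sketch, not part of pandas: the name diagonal_series and the squareness check are assumptions added for illustration, and the fixed values in df stand in for the question's random data.

```python
import numpy as np
import pandas as pd


def diagonal_series(frame):
    """Return the diagonal of a square DataFrame as a MultiIndexed Series.

    `diagonal_series` is a hypothetical helper name, not a pandas function.
    """
    n_rows, n_cols = frame.shape
    if n_rows != n_cols:
        raise ValueError("expected a square DataFrame, got shape %r" % (frame.shape,))
    # A list of two array-likes passed as `index` becomes a two-level MultiIndex.
    return pd.Series(np.diag(frame), index=[frame.index, frame.columns])


# Fixed values (an assumption; the question generates random data).
df = pd.DataFrame([[2, 0, 1], [4, 2, 3], [1, 0, 3]],
                  index=list('ABC'), columns=list('abc'))
result = diagonal_series(df)
print(result)
```

The explicit shape check is a design choice: np.diag also accepts non-square input and would return a shorter diagonal, which would then fail less clearly when the index lengths do not match.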
Answered By: johnchase

You can also use iat in a list comprehension to get the diagonal.

>>> pd.Series([df.iat[n, n] for n in range(len(df))], index=[df.index, df.columns]) 
A  a    2
B  b    2
C  c    3
dtype: int64
Answered By: Alexander