pandas DataFrame diagonal
Question:
What is an efficient way to get the diagonal of a square DataFrame? I would expect the result to be a Series with a MultiIndex with two levels, the first being the index of the DataFrame and the second being the columns of the DataFrame.
Setup
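import pandas as pd
import numpy as np
np.random.seed([3, 1415])
# Cast to int64 explicitly (truncating toward zero) rather than
# relying on dtype= to perform a lossy float-to-int conversion.
df = pd.DataFrame((np.random.rand(3, 3) * 5).astype(np.int64),
                  columns=list('abc'),
                  index=list('ABC'))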
I want to see this:
print(df.stack().loc[[('A', 'a'), ('B', 'b'), ('C', 'c')]])
A a 2
B b 2
C c 3
Answers:
You could do something like this:
In [16]:
midx = pd.MultiIndex.from_tuples(list(zip(df.index,df.columns)))
pd.DataFrame(data=np.diag(df), index=midx)
Out[16]:
0
A a 2
B b 2
C c 3
np.diag will give you the diagonal values as a NumPy array. You can then construct the MultiIndex by zipping the index and columns, and pass it as the index to the DataFrame constructor.
Actually, the MultiIndex doesn’t need to be built explicitly; passing the index and columns as a list is enough:
In [18]:
pd.DataFrame(np.diag(df), index=[df.index, df.columns])
Out[18]:
0
A a 2
B b 2
C c 3
But johnchase’s answer is neater.
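As a rough check that the two ways of building the index in this answer agree, here is a minimal sketch. The small frame and its values are made up for illustration (they are not the random data from the question), and MultiIndex.from_arrays is used to spell out what passing a list of two label sequences amounts to.

```python
import numpy as np
import pandas as pd

# Illustrative square frame; the values are arbitrary, not the question's data.
df = pd.DataFrame(np.arange(9).reshape(3, 3),
                  index=list('ABC'), columns=list('abc'))

# Explicit construction: zip row and column labels into tuples.
midx_tuples = pd.MultiIndex.from_tuples(list(zip(df.index, df.columns)))

# Shorter construction: two parallel sequences of labels.
midx_arrays = pd.MultiIndex.from_arrays([df.index, df.columns])

# Both describe the same (row label, column label) pairs.
same = midx_tuples.equals(midx_arrays)
print(same)

# Passing the list directly as index= yields a MultiIndex as well.
diag_frame = pd.DataFrame(np.diag(df.to_numpy()), index=[df.index, df.columns])
print(diag_frame)
```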
If you don’t mind using numpy, you could use numpy.diag:
pd.Series(np.diag(df), index=[df.index, df.columns])
A a 2
B b 2
C c 3
dtype: int64
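If you need this in more than one place, the one-liner could be wrapped in a small helper. This is only a sketch: the name diagonal_series and the squareness check are my own additions, not part of the answer above.

```python
import numpy as np
import pandas as pd


def diagonal_series(frame):
    """Return the main diagonal of a square DataFrame as a Series
    indexed by (row label, column label) pairs.

    Hypothetical helper, written for illustration only.
    """
    n_rows, n_cols = frame.shape
    if n_rows != n_cols:
        raise ValueError("expected a square DataFrame, got shape %r" % (frame.shape,))
    index = pd.MultiIndex.from_arrays([frame.index, frame.columns])
    return pd.Series(np.diag(frame.to_numpy()), index=index)


example = pd.DataFrame([[1, 2], [3, 4]], index=['A', 'B'], columns=['a', 'b'])
result = diagonal_series(example)
print(result)
```

The explicit shape check is there because np.diag on a non-square array returns a shorter diagonal, which would no longer line up with the row and column labels.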
You can also use iat in a list comprehension to get the diagonal:
>>> pd.Series([df.iat[n, n] for n in range(len(df))], index=[df.index, df.columns])
A a 2
B b 2
C c 3
dtype: int64
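Since the question asks about efficiency, one way to compare the iat approach with np.diag on your own data is a quick timeit run. This is a sketch under assumptions: the 200x200 frame, the function names diag_numpy and diag_iat, and the repeat count are all arbitrary choices of mine, and the timings will vary with machine and library versions.

```python
import timeit

import numpy as np
import pandas as pd

# Arbitrary test frame; the size and values are chosen only for illustration.
rng = np.random.default_rng(0)
big = pd.DataFrame(rng.integers(0, 5, size=(200, 200)))


def diag_numpy(frame):
    # Pull the diagonal out of the underlying array in one call.
    return pd.Series(np.diag(frame.to_numpy()),
                     index=[frame.index, frame.columns])


def diag_iat(frame):
    # Look up each diagonal cell individually by integer position.
    return pd.Series([frame.iat[n, n] for n in range(len(frame))],
                     index=[frame.index, frame.columns])


# Both approaches should give the same Series.
pd.testing.assert_series_equal(diag_numpy(big), diag_iat(big))

t_numpy = timeit.timeit(lambda: diag_numpy(big), number=20)
t_iat = timeit.timeit(lambda: diag_iat(big), number=20)
print("np.diag: %.4fs, iat: %.4fs" % (t_numpy, t_iat))
```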