How can I check if a Pandas dataframe's index is sorted
Question:
I have a vanilla pandas dataframe with an index. I need to check if the index is sorted. Preferably without sorting it again.
For example, I can test whether an index is unique with index.is_unique. Is there a similar way to test whether it is sorted?
Answers:
If sorting is allowed, try
all(df.sort_index().index == df.index)
If not, try
all(a <= b for a, b in zip(df.index, df.index[1:]))
The first one is more readable, while the second one has lower time complexity (a single O(n) pass instead of an O(n log n) sort).
EDIT
Adding another method I've just found. It is similar to the second one, but the comparison is vectorized:
all(df.index[:-1] <= df.index[1:])
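As a minimal sketch, the three checks above can be wrapped in small helpers and tried on a sorted and an unsorted index. The helper names and the sample frames are made up for illustration; only the expressions inside them come from this answer.

```python
import pandas as pd

# Hypothetical sample data: one frame with a sorted index, one without.
df_sorted = pd.DataFrame({"x": [10, 20, 30]}, index=[1, 2, 3])
df_unsorted = pd.DataFrame({"x": [10, 20, 30]}, index=[2, 1, 3])


def sorted_by_resorting(df):
    # Sorts a copy of the frame and compares the result with the original index.
    return bool((df.sort_index().index == df.index).all())


def sorted_pairwise(df):
    # Pure-Python pass over neighbouring index labels.
    return all(a <= b for a, b in zip(df.index, df.index[1:]))


def sorted_vectorized(df):
    # Same neighbour comparison, done element-wise on the whole index at once.
    return bool((df.index[:-1] <= df.index[1:]).all())


for check in (sorted_by_resorting, sorted_pairwise, sorted_vectorized):
    print(check.__name__, check(df_sorted), check(df_unsorted))
```

All three treat repeated labels as sorted, because the comparison is <= rather than <.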
How about (the shorter alias is_monotonic was removed in pandas 2.0, so use the explicit name):
df.index.is_monotonic_increasing
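For non-indices (DataFrame.sort no longer exists in current pandas; sort_values needs the column to sort by, written here as the placeholder "col", and a stable sort keeps tied rows in their original order):
df.equals(df.sort_values("col", kind="mergesort"))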
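A short sketch of the property-based checks, assuming a recent pandas version. The frame and the column name "col" are invented for the example. The same properties exist on a Series, so a single column can be checked without sorting the frame.

```python
import pandas as pd

# Hypothetical frame: the index is ascending, the column "col" is not.
df = pd.DataFrame({"col": [3, 1, 2]}, index=[10, 20, 30])

index_ascending = df.index.is_monotonic_increasing
index_descending = df.index.is_monotonic_decreasing

# A column is a Series, which exposes the same properties.
column_ascending = df["col"].is_monotonic_increasing

print(index_ascending, index_descending, column_ascending)
```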
Just for the sake of completeness, this is how to check whether the dataframe index is monotonically increasing and also unique and, if it is not, replace it with one that is:
if not (df.index.is_monotonic_increasing and df.index.is_unique):
    df.reset_index(inplace=True, drop=True)
NOTE: df.index.is_monotonic_increasing returns True even if there are repeated indices, so it has to be complemented with df.index.is_unique.
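To illustrate the note, here is a small sketch with a made-up frame whose index repeats a label. It uses assignment instead of inplace=True, which is a stylistic choice and not part of the original answer. Note that reset_index(drop=True) discards the old labels and replaces them with 0, 1, 2, ...

```python
import pandas as pd

# Hypothetical frame whose index is non-decreasing but contains a duplicate.
df = pd.DataFrame({"x": [1, 2, 3]}, index=[0, 1, 1])

monotonic_before = df.index.is_monotonic_increasing  # True: duplicates are allowed
unique_before = df.index.is_unique                   # False: label 1 appears twice

if not (df.index.is_monotonic_increasing and df.index.is_unique):
    # Replace the index with a fresh 0..n-1 range, dropping the old labels.
    df = df.reset_index(drop=True)

print(monotonic_before, unique_before, list(df.index))
```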