How do i find the iloc of a row in pandas dataframe?

Question:

I have an indexed pandas dataframe. By searching through its index, I find a row of interest. How do I find out the iloc of this row?

Example:

dates = pd.date_range('1/1/2000', periods=8)
df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
df
                   A         B         C         D
2000-01-01 -0.077564  0.310565  1.112333  1.023472
2000-01-02 -0.377221 -0.303613 -1.593735  1.354357
2000-01-03  1.023574 -0.139773  0.736999  1.417595
2000-01-04 -0.191934  0.319612  0.606402  0.392500
2000-01-05 -0.281087 -0.273864  0.154266  0.374022
2000-01-06 -1.953963  1.429507  1.730493  0.109981
2000-01-07  0.894756 -0.315175 -0.028260 -1.232693
2000-01-08 -0.032872 -0.237807  0.705088  0.978011

window_stop_row = df[df.index < '2000-01-04'].iloc[-1]
window_stop_row
Timestamp('2000-01-08 00:00:00', offset='D')
#which is the iloc of window_stop_row?
Asked By: lmsasu

||

Answers:

You could try looping through each row in the dataframe:

    for row_number,row in dataframe.iterrows():
        if row['column_header'] == YourValue:
            print row_number

This will give you the row with which to use the iloc function

Answered By: Tom Patel

IIUC you could call index for your case:

In [53]: df[df.index < '2000-01-04'].index[-1]
Out[53]: Timestamp('2000-01-03 00:00:00', offset='D') 

EDIT

I think @EdChums answer is what you want. Alternatively you could filter your dataframe with values which you get, then use all to find the row with that values and then pass it to the index:

In [67]: df == window_stop_row
Out[67]:
                A      B      C      D
2000-01-01  False  False  False  False
2000-01-02  False  False  False  False
2000-01-03   True   True   True   True
2000-01-04  False  False  False  False
2000-01-05  False  False  False  False
2000-01-06  False  False  False  False
2000-01-07  False  False  False  False
2000-01-08  False  False  False  False

In [68]: (df == window_stop_row).all(axis=1)
Out[68]:
2000-01-01    False
2000-01-02    False
2000-01-03     True
2000-01-04    False
2000-01-05    False
2000-01-06    False
2000-01-07    False
2000-01-08    False
Freq: D, dtype: bool

In [69]: df.index[(df == window_stop_row).all(axis=1)]
Out[69]: DatetimeIndex(['2000-01-03'], dtype='datetime64[ns]', freq='D')
Answered By: Anton Protopopov

You want the .name attribute and pass this to get_loc:

In [131]:
dates = pd.date_range('1/1/2000', periods=8)
df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
df

Out[131]:
                   A         B         C         D
2000-01-01  0.095234 -1.000863  0.899732 -1.742152
2000-01-02 -0.517544 -1.274137  1.734024 -1.369487
2000-01-03  0.134112  1.964386 -0.120282  0.573676
2000-01-04 -0.737499 -0.581444  0.528500 -0.737697
2000-01-05 -1.777800  0.795093  0.120681  0.524045
2000-01-06 -0.048432 -0.751365 -0.760417 -0.181658
2000-01-07 -0.570800  0.248608 -1.428998 -0.662014
2000-01-08 -0.147326  0.717392  3.138620  1.208639

In [133]:    
window_stop_row = df[df.index < '2000-01-04'].iloc[-1]
window_stop_row.name

Out[133]:
Timestamp('2000-01-03 00:00:00', offset='D')

In [134]:
df.index.get_loc(window_stop_row.name)

Out[134]:
2

get_loc returns the ordinal position of the label in your index which is what you want:

In [135]:    
df.iloc[df.index.get_loc(window_stop_row.name)]

Out[135]:
A    0.134112
B    1.964386
C   -0.120282
D    0.573676
Name: 2000-01-03 00:00:00, dtype: float64

if you just want to search the index then so long as it is sorted then you can use searchsorted:

In [142]:
df.index.searchsorted('2000-01-04') - 1

Out[142]:
2
Answered By: EdChum

While pandas.Index.get_loc() will only work if you have a single key, the following paradigm will also work getting the iloc of multiple elements:

np.argwhere(condition).flatten()   # array of all iloc where condition is True

In your case, picking the latest element where df.index < '2000-01-04':

np.argwhere(df.index < '2000-01-04').flatten()[-1]  # returns 2
Answered By: ascripter
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.