How to delete the current row in pandas dataframe during df.iterrows()

Question

I would like to delete the current row while iterating with df.iterrows() if a certain column in that row fails my if condition.

For example:

for index, row in df.iterrows():
    if row['A'] == 0:
        # remove/drop this row from the df
        del df[index]  # I tried this, but it gives me an error

This might be a very easy one, but I still can’t figure out how to do it.
Your help will be very much appreciated!

Asked By: JCm

Answer 1

I don’t know whether this is pseudocode, but you can’t delete a row like this: del df[index] looks for a column named index, not a row. You can drop the row with df.drop instead:

In [425]:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a':np.random.randn(5), 'b':np.random.randn(5)})
df
Out[425]:
          a         b
0 -1.348112  0.583603
1  0.174836  1.211774
2 -2.054173  0.148201
3 -0.589193 -0.369813
4 -1.156423 -0.967516
In [426]:

for index, row in df.iterrows():
    if row['a'] > 0:
        df.drop(index, inplace=True)
In [427]:

df
Out[427]:
          a         b
0 -1.348112  0.583603
2 -2.054173  0.148201
3 -0.589193 -0.369813
4 -1.156423 -0.967516

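If you just want to filter those rows out, you can use boolean indexing:

df[df['a'] <= 0]

This achieves the same result, except that it returns a new DataFrame instead of modifying df in place, so assign it back to df if you want to keep it.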
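
As a minimal sketch of the filtering approach (the column names follow the example above; the sample values are made up for illustration), both options below avoid modifying the frame while iterating over it:

```python
import pandas as pd

# Illustrative data: the values are made up, only the column names
# follow the example above.
df = pd.DataFrame({'a': [-1.3, 0.2, -2.1, -0.6, -1.2],
                   'b': [0.6, 1.2, 0.1, -0.4, -1.0]})

# Option 1: keep only the rows that satisfy the condition.
filtered = df[df['a'] <= 0]

# Option 2: look up the labels of the unwanted rows and drop them in one call.
labels_to_drop = df.index[df['a'] > 0]
dropped = df.drop(labels_to_drop)
```

Both return a new DataFrame and leave df untouched, so the result has to be assigned to a name to be kept.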

Answered By: EdChum

Answer 2

I tried @EdChum’s solution with a custom pandas.DataFrame, but I could not get it working because it raised an error: KeyError: '[78] not found in axis'. If you get the same error, it can be fixed by dropping df.index[index] instead of index on each .iterrows() iteration.

The DataFrame used was retrieved from investpy, which contains all the equities/stock data indexed on Investing.com, and the print function is the one implemented in pprint. This is the code that got it working:

In [1]:

import investpy
from pprint import pprint

In [2]:

df = investpy.get_equities()

pprint(df.head())

Out [2]:

     country               name                           full_name  
0  argentina            Tenaris                             Tenaris   
1  argentina       PETROBRAS ON     Petroleo Brasileiro - Petrobras   
2  argentina     GP Fin Galicia          Grupo Financiero Galicia B   
3  argentina  Ternium Argentina  Ternium Argentina Sociedad Anónima   
4  argentina      Pampa Energía                  Pampa Energía S.A.   

                      tag          isin     id currency  
0       tenaris?cid=13302  LU0156801721  13302      ARS  
1  petrobras-on?cid=13303  BRPETRACNOR9  13303      ARS  
2          gp-fin-galicia  ARP495251018  13304      ARS  
3                 siderar  ARSIDE010029  13305      ARS  
4           pampa-energia  ARP432631215  13306      ARS  

In [3]:

pprint(df[df['tag'] == 'koninklijke-philips-electronics'])

Out [3]:

      country                     name                   full_name  
78  argentina  Koninklijke Philips DRC  Koninklijke Philips NV DRC   

                                tag          isin     id currency  
78  koninklijke-philips-electronics  ARDEUT110558  30044      ARS  

In [4]:

for index, row in df.iterrows():
    if row['tag'] == 'koninklijke-philips-electronics':
        df.drop(df.index[index], inplace=True)

In [5]:

pprint(df[df['tag'] == 'koninklijke-philips-electronics'])

Out [5]:

Empty DataFrame
Columns: [country, name, full_name, tag, isin, id, currency]
Index: []
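
One thing to keep in mind with this approach: df.drop expects index labels, iterrows() yields index labels, and df.index[i] looks a label up by position. The two only coincide while the index is the default 0..n-1 range and no row has been dropped yet. The sketch below uses a small hypothetical frame (not the investpy data, and not a diagnosis of the KeyError above) to show a variant that selects the labels with a boolean mask and drops them in a single call:

```python
import pandas as pd

# Hypothetical frame whose index labels are deliberately not 0..n-1.
df = pd.DataFrame({'tag': ['x', 'y', 'x', 'z']}, index=[10, 20, 30, 40])

# iterrows() yields index labels, not positions.
labels = [index for index, row in df.iterrows()]

# Select the labels of the matching rows with a boolean mask
# and drop them all at once, without looping.
result = df.drop(df.index[df['tag'] == 'x'])
```

Because the labels are collected before anything is removed, this works the same way whether one row or several rows match.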

I hope this helps someone! Thanks also to @EdChum for the original answer!

Answered By: alvarobartt
