Looping through rows based on column values

Question:

I'm trying to loop through a report and eliminate/hide/replace cell values that repeat the value in the row above. This applies only to certain columns, not the entire row, since each row will contain at least one piece of data that is unique to that row. I know I am close but I’m missing my mark and looking for a nudge in the right direction. The goal is to eliminate redundant information to increase legibility of the final report. Essentially what I am trying to do is:

for cell in row:
    if column["column_name"] == (line above):
        cell.value = " "

Because each row has a unique piece of data, drop_duplicates() does not work.
Once I can clear the intended column in each row where applicable I will expand the process to loop through and apply to other columns where the initial is blanked out. I should be able to work that out once the first domino falls. Any advice is appreciated.

I’ve tried

np.where(cell) = [iloc-1]

and

masking based on the same parameter.

I get errors that ‘row’ and ‘iloc’ are undefined or None of [Index (all content)] are in the [index].

Asked By: Matthew Phillips


Answers:

You can use shift() to compare each row with its neighbour. If I understand your issue correctly, the example code below shows one approach (it replaces duplicated numbers with 0):

import pandas as pd

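df = pd.DataFrame({'A': [1, 2, 2, 4, 5],
                   'B': ['a', 'b', 'c', 'd', 'e']})

# shift() moves column A down one row, so each value is compared with the row above.
# Where the two match, the value is replaced by 0; otherwise it is kept.
df['A'] = df['A'].where(df['A'] != df['A'].shift(), 0)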

print(df)
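
The same idea may extend to the follow-up in the question: blanking a second column only where the first column was also blanked. The sketch below is one possible way to do that, not a tested solution for your report. The column names (region, manager, item) and the data are made up for illustration. All comparisons are made against the original df, not the blanked copy, so an earlier blank never hides a later repeat.

```python
import pandas as pd

# Hypothetical report data; the column names are illustrative only.
df = pd.DataFrame({
    "region": ["East", "East", "East", "West", "West"],
    "manager": ["Ann", "Ann", "Bob", "Bob", "Bob"],
    "item": ["a", "b", "c", "d", "e"],  # unique in every row
})

# True where "region" repeats the value in the row directly above.
same_region = df["region"].eq(df["region"].shift())

# "manager" counts as redundant only where it repeats AND "region" repeats too.
same_manager = same_region & df["manager"].eq(df["manager"].shift())

# Work on a copy so the original values stay available for the comparisons.
report = df.copy()
report["region"] = df["region"].mask(same_region, "")
report["manager"] = df["manager"].mask(same_manager, "")

print(report)
```

In this sample the fourth row keeps "Bob" even though it repeats the row above, because the region changes there. If you would rather blank every repeat regardless of the first column, drop the "same_region &" part.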
Answered By: user19077881