How do I delete/ignore some rows while parsing data from Excel using Python

Question:

I have been parsing data from an Excel sheet using Python. The parsing itself works, but I don’t need some of the rows in that sheet. How can I skip them (maybe with a loop)? Here is the code I wrote to parse the sheet:

import xlrd

book = xlrd.open_workbook("Excel.xlsx")

sheet = book.sheet_by_index(0)
firstcol = sheet.col_values(0)
data = [[sheet.cell_value(r, c) for c in range(sheet.ncols)]
        for r in range(sheet.nrows)]

ele=''
year=[]

for j in range(len(data)):
    if j==1:
        year=data[j]
    if j>2:
        ele=data[j][0]

        for i in range(1, len(data[j])):
            if ele != "":
                if data[j][i] != "":
                    if year[i] !="":
                        print([ele, data[j][i], year[i]])

With that, all rows are parsed into the list format I want, but I don’t want some rows (like Total age, Total IDs, Total Result) from the Excel file. How can I skip them in the same code? Alternatively, please suggest a more effective way (maybe pandas) to reduce the code. The Excel file I’m referring to:
Click to see Excel.xlsx

Thanks in Advance…

Asked By: royal


Answers:

If I understand correctly, you can do this much more simply. You have some list of rows to exclude:

rows_to_exclude = ['Total age', 'Total IDs', 'Total Result']

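You can read the sheet into a DataFrame with pd.read_excel, without calling xlrd yourself (there is no need to specify the sheet index if it’s the first sheet, which is read by default). Pass index_col=0 so that the first column, which holds the row labels, becomes the index. Then you can drop the rows with missing values, and drop all rows whose index label is in your list of excluded row labels. Note that dropna() removes every row containing any missing value, and drop() raises a KeyError if one of the labels is not in the index:

import pandas as pd

df = pd.read_excel('Excel.xlsx', index_col=0)
df = df.dropna().drop(rows_to_exclude)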
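
To see the drop step in isolation, here is a minimal sketch on an in-memory frame instead of the real workbook. The column names and values are made up for illustration, and errors="ignore" is an optional addition (not part of the snippet above) so that a label missing from the sheet is skipped rather than raising a KeyError.

```python
import pandas as pd

# Hypothetical stand-in for the sheet: the first column acts as the index,
# as it would after pd.read_excel('Excel.xlsx', index_col=0).
df = pd.DataFrame(
    {"2015": [10, 25, 3, 7], "2016": [12, 30, 4, 9]},
    index=["age", "Total age", "IDs", "Total IDs"],
)

rows_to_exclude = ["Total age", "Total IDs", "Total Result"]

# errors="ignore" skips labels that are not present ("Total Result" here).
filtered = df.drop(rows_to_exclude, errors="ignore")
print(filtered.index.tolist())
```

Only the rows labelled age and IDs remain in filtered; the summary rows are gone.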
Answered By: ASGM
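
If you would rather keep the loop from the question, one possible approach is to skip any row whose first cell is in a set of excluded labels. This is only a sketch: the data list below is a hypothetical stand-in shaped like the sheet described in the question (years in row 1, data rows after index 2), and records is a name introduced here to collect the output instead of printing it line by line.

```python
# Hypothetical rows shaped like the sheet in the question:
# row 1 holds the years, rows after index 2 hold the data.
data = [
    ["Report", "", ""],
    ["", 2015, 2016],
    ["", "", ""],
    ["age", 10, 12],
    ["Total age", 25, ""],
    ["IDs", 3, 4],
    ["Total IDs", 7, 9],
]

rows_to_exclude = {"Total age", "Total IDs", "Total Result"}

year = data[1]
records = []
for row in data[3:]:
    label = row[0]
    # Skip blank labels and any row whose label is on the exclusion list.
    if label == "" or label in rows_to_exclude:
        continue
    # Pair each value with the year in the same column; skip empty cells.
    for value, yr in zip(row[1:], year[1:]):
        if value != "" and yr != "":
            records.append([label, value, yr])

print(records)
```

With the real workbook, data would come from the xlrd list comprehension in the question; the exclusion check is the only new piece.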