Replacing the header with the first row in a pandas DataFrame raises AttributeError: 'list' object has no attribute 'iloc'

Question

I’m reading a table from a website using df = pd.read_html('website link'):

df = pd.read_html('https://www.w3schools.com/python/python_ml_decision_tree.asp')
df[0]

The table is read successfully, but I want to use the first row as the header.
I’m using this code:

df.columns = df.iloc[0] 
df = df[1:]
df.head()

but it raises this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-11-f9b2cba2eb0b> in <module>
----> 1 df.columns = df.iloc[0]   #grab the first row for the header
      2 df = df[1:]               #take the data less the header row
      3 df

AttributeError: 'list' object has no attribute 'iloc'

Asked By: Mehedi Azad

Answer 1

pd.read_html returns a list of DataFrames, not a single DataFrame. Combine them into one DataFrame with pd.concat first:

dfs = pd.read_html(url)
df = pd.concat(dfs)

Then replace the headers with the first row:

df = df.rename(columns=df.iloc[0]).drop(df.index[0])
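
To make the two steps above concrete without hitting the network, here is a minimal sketch. The `tables` list is a hypothetical stand-in for what pd.read_html returns (a small made-up subset of the table, with every cell a string), and the final reset_index call is an optional extra that is not part of the answer above.

```python
import pandas as pd

# Hypothetical stand-in for the list returned by pd.read_html(url).
tables = [
    pd.DataFrame([
        ["Age", "Experience", "Go"],
        ["36", "10", "NO"],
        ["42", "12", "NO"],
    ])
]

# Combine the list into a single DataFrame.
df = pd.concat(tables)

# Map the default column labels (0, 1, 2) to the values in the first row,
# then drop that row from the data.
df = df.rename(columns=df.iloc[0]).drop(df.index[0])

# Optional: renumber the remaining rows starting from 0.
df = df.reset_index(drop=True)

print(df)
```

Note that pd.concat stacks every table in the list, so this approach assumes the page has a single table or several tables with the same layout.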

Answered By: snake_charmer_775

Answer 2

Try this:

df = pd.read_html('https://www.w3schools.com/python/python_ml_decision_tree.asp')
df[0].columns = df[0].iloc[0]
df = df[0][1:]

Answered By: Luis Alejandro Vargas Ramos

Answer 3

In your code df = pd.read_html('website link'), pd.read_html() returns a list of DataFrames rather than a single DataFrame, so the variable name df is misleading.

Here’s how I would do it:

import pandas as pd

lis = pd.read_html('https://www.w3schools.com/python/python_ml_decision_tree.asp')
df = pd.DataFrame(lis[0]) #lis has only 1 element
df.columns = df.iloc[0]   #grab the first row for the header
df = df[1:]               #take the data less the header row
print(df)

0  Age Experience Rank Nationality   Go
1   36         10    9          UK   NO
2   42         12    4         USA   NO
3   23          4    6           N   NO
4   52          4    4         USA   NO
5   43         21    8         USA  YES
6   44         14    5          UK   NO
7   66          3    7           N  YES
8   35         14    9          UK  YES
9   52         13    7           N  YES
10  35          5    9           N  YES
11  24          3    5         USA   NO
12  18          3    7          UK  YES
13  45          9    9          UK  YES
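
Two details of the output above may be worth handling. The "0" printed above the row labels is the name the columns inherit from the row they were taken from, and the cells are most likely still strings because the header text was originally mixed in with the data. The sketch below shows one possible cleanup. It uses a hypothetical stand-in `raw` instead of lis[0], and the choice of columns to convert is an assumption about which ones should be numeric.

```python
import pandas as pd

# Hypothetical stand-in for lis[0]: header text sits in the first data row.
raw = pd.DataFrame([
    ["Age", "Experience", "Go"],
    ["36", "10", "NO"],
    ["42", "12", "NO"],
])

df = raw.copy()
df.columns = df.iloc[0]              # first row becomes the header
df = df[1:].reset_index(drop=True)   # drop that row and renumber from 0

# The columns Index is named after the label of the row it came from.
header_label = df.columns.name
df.columns.name = None               # remove the stray label

# Convert the columns assumed to be numeric from strings to numbers.
for col in ["Age", "Experience"]:
    df[col] = pd.to_numeric(df[col])

print(df.dtypes)
```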

Answered By: perpetualstudent

Answer 4

Use:

df = pd.read_html('https://www.w3schools.com/python/python_ml_decision_tree.asp')

Based on the documentation:

Read HTML tables into a list of DataFrame objects.

So:

type(df)

returns:

list

and:

len(df)

returns:

1

So,

df[0]

returns:

      0           1     2            3    4
0   Age  Experience  Rank  Nationality   Go
1    36          10     9           UK   NO
2    42          12     4          USA   NO
3    23           4     6            N   NO
4    52           4     4          USA   NO
5    43          21     8          USA  YES
6    44          14     5           UK   NO
7    66           3     7            N  YES
8    35          14     9           UK  YES
9    52          13     7            N  YES
10   35           5     9            N  YES
11   24           3     5          USA   NO
12   18           3     7           UK  YES
13   45           9     9           UK  YES

This is a DataFrame, so iloc works on it.
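
The checks above can be reproduced offline. This is a small sketch in which `tables` is a hypothetical stand-in for the return value of pd.read_html. It shows that the list itself has no iloc attribute, which is what triggers the error in the question, while its first element does.

```python
import pandas as pd

# Hypothetical stand-in for the list returned by pd.read_html(url).
tables = [pd.DataFrame([["Age", "Go"], ["36", "NO"]])]

print(type(tables).__name__)        # the container is a plain Python list
print(hasattr(tables, "iloc"))      # lists have no .iloc
print(hasattr(tables[0], "iloc"))   # the DataFrame inside does

# Calling .iloc on the list reproduces the error from the question.
try:
    tables.iloc[0]
except AttributeError as exc:
    message = str(exc)
    print(message)

# Indexing into the list first gives a DataFrame, where .iloc works.
first_row = tables[0].iloc[0]
print(list(first_row))
```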

Answered By: keramat
