Remove top row from a dataframe
Question:
I have a dataframe that looks like this:
level_0 level_1 Repo Averages for 27 Jul 2018
0 Business Date Instrument Ccy
1 27/07/2018 GC_AUSTRIA_SUB_10YR EUR
2 27/07/2018 R_RAGB_1.15_10/18 EUR
3 27/07/2018 R_RAGB_4.35_03/19 EUR
4 27/07/2018 R_RAGB_1.95_06/19 EUR
I am trying to get rid of the top row and only keep
Business Date Instrument Ccy
0 27/07/2018 GC_AUSTRIA_SUB_10YR EUR
1 27/07/2018 R_RAGB_1.15_10/18 EUR
2 27/07/2018 R_RAGB_4.35_03/19 EUR
3 27/07/2018 R_RAGB_1.95_06/19 EUR
I tried df.columns.droplevel(0), but without success.
Any help is more than welcome.
Answers:
You can try this:
df.columns = df.iloc[0]
df = df.reindex(df.index.drop(0)).reset_index(drop=True)
df.columns.name = None
Output:
Business Date Instrument Ccy
0 27/07/2018 GC_AUSTRIA_SUB_10YR EUR
1 27/07/2018 R_RAGB_1.15_10/18 EUR
2 27/07/2018 R_RAGB_4.35_03/19 EUR
3 27/07/2018 R_RAGB_1.95_06/19 EUR
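For anyone who wants to run this end to end, here is a minimal sketch of the same idea. It assumes the header really is stored as the first data row under ordinary (non-MultiIndex) columns; the frame `raw` and the helper `promote_first_row` are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical frame mimicking the question: the real header sits in row 0.
raw = pd.DataFrame(
    [
        ["Business Date", "Instrument", "Ccy"],
        ["27/07/2018", "GC_AUSTRIA_SUB_10YR", "EUR"],
        ["27/07/2018", "R_RAGB_1.15_10/18", "EUR"],
    ],
    columns=["level_0", "level_1", "Repo Averages for 27 Jul 2018"],
)


def promote_first_row(df: pd.DataFrame) -> pd.DataFrame:
    """Use row 0 as the column names and drop it from the data."""
    out = df.iloc[1:].reset_index(drop=True)  # everything except row 0, renumbered
    out.columns = df.iloc[0].tolist()         # a plain list, so no leftover columns.name
    return out


clean = promote_first_row(raw)
print(clean)
```

Passing a plain list as the new column names avoids having to reset columns.name afterwards.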
df = df.drop(df.index[row_start:row_end])
This drops the rows at positions row_start up to, but not including, row_end.
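A small sketch of dropping a range of rows with drop, in case it helps. DataFrame.drop works on index labels rather than positions, so the positions are translated to labels first. The frame and the values of row_start and row_end are invented for the example.

```python
import pandas as pd

df = pd.DataFrame({"a": [10, 20, 30, 40, 50]})

# Hypothetical positions: drop rows 0 and 1 (the end position is exclusive).
row_start, row_end = 0, 2

# Translate positions to index labels, drop them, then renumber the rows.
trimmed = df.drop(index=df.index[row_start:row_end]).reset_index(drop=True)
print(trimmed)
```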
You can take advantage of the header parameter of pd.read_csv
(see the pandas documentation for more about it).
Let's say that you read the following dataset without a header row
df = pd.read_csv("Prices.csv", header=None)
print(df)
That outputs
0 1 2 3 4
0 DATA SESSAO HORA PRECO_PT PRECO_ES
1 1/1/2020 0 1 41,88 41,88
2 1/1/2020 0 2 38,60 38,60
3 1/1/2020 0 3 36,55 36,55
By passing header=0, so that the first line of the file is used for the column names,
like this
df = pd.read_csv("Prices.csv", header=0)
print(df)
You will get what you want
DATA SESSAO HORA PRECO_PT PRECO_ES
0 1/1/2009 0 1 55,01 55,01
1 1/1/2009 0 2 56,13 56,13
2 1/1/2009 0 3 50,59 50,59
3 1/1/2009 0 4 45,83 45,83
4 1/1/2009 0 5 42,07 41,90
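To see the effect of header without needing the file, here is a rough sketch that reads the same text twice from memory. The CSV content is made up (it only borrows the column names above) and io.StringIO stands in for "Prices.csv".

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for "Prices.csv".
csv_text = "DATA,SESSAO,HORA\n1/1/2020,0,1\n1/1/2020,0,2\n"

# header=None: no line is treated as a header, so the columns are 0, 1, 2
# and the names end up as the first data row.
no_header = pd.read_csv(io.StringIO(csv_text), header=None)

# header=0: the first line of the file supplies the column names.
with_header = pd.read_csv(io.StringIO(csv_text), header=0)

print(no_header)
print(with_header)
```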
You can try using slicing.
df = df[1:]
This will remove the first row of your dataframe.
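One thing to watch with slicing, shown in the sketch below on an invented frame: the remaining rows keep their old index labels, so a reset_index(drop=True) is needed if you want them to start from 0 again.

```python
import pandas as pd

df = pd.DataFrame({"x": ["unwanted", "a", "b"]})

sliced = df[1:]                                  # index labels are still 1 and 2
by_position = df.iloc[1:]                        # same rows, explicit positional slice
renumbered = df.iloc[1:].reset_index(drop=True)  # index labels become 0 and 1

print(sliced)
print(renumbered)
```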
I tested the comment by jeremycg. It works very well and is succinct. So that more people see it, here it is again:
my_df = pd.read_csv(r"C:pathtomyfile.csv", skiprows = 1)
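Here is a self-contained sketch of the skiprows approach. It assumes the unwanted title sits on the first line of the file, which is a guess about the original data; the CSV text is invented and read from memory instead of from disk.

```python
import io

import pandas as pd

# Hypothetical file content: a title line followed by the real header and data.
csv_text = (
    "Repo Averages for 27 Jul 2018,,\n"
    "Business Date,Instrument,Ccy\n"
    "27/07/2018,GC_AUSTRIA_SUB_10YR,EUR\n"
    "27/07/2018,R_RAGB_1.15_10/18,EUR\n"
)

# skiprows=1 discards the first line, so the second line becomes the header.
my_df = pd.read_csv(io.StringIO(csv_text), skiprows=1)
print(my_df)
```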