Pandas – Drop function error (label not contained in axis)

Question:

I have a CSV file that looks like this:

index,Avg,Min,Max
Build1,56.19,39.123,60.1039
Build2,57.11,40.102,60.2
Build3,55.1134,35.129404123,60.20121

Based on an earlier question of mine, I am able to add some relevant information to this CSV with this short script:

import pandas as pd

df = pd.read_csv('newdata.csv')
print(df)

df_out = pd.concat([df.set_index('index'),df.set_index('index').agg(['max','min','mean'])]).rename(index={'max':'Max','min':'Min','mean':'Average'}).reset_index()

with open('newdata.csv', 'w') as f:
    df_out.to_csv(f,index=False)

This results in this CSV:

index,Avg,Min,Max
Build1,56.19,39.123,60.1039
Build2,57.11,40.102,60.2
Build3,55.1134,35.129404123,60.20121
Max,57.11,40.102,60.20121
Min,55.1134,35.129404123,60.1039
Average,56.1378,38.1181347077,60.16837

I would now like to be able to update this CSV. For example, if I ran a new build (Build4, for instance), I could add it and then recompute the Max, Min, and Average rows. My idea is to delete the rows labelled Max, Min, and Average, add my new row, and redo the stats. I believe the code I need is as simple as this (shown just for Max, but there would be lines for Min and Average as well):

df = pd.read_csv('newdata.csv')
df = df.drop('Max')

However, this always results in a ValueError: labels ['Max'] not contained in axis

I created the CSV files in Sublime Text; could this be part of the issue? I have read other SO posts about this, but none seem to help with my issue.

I am unsure if this is allowed, but here is a download link to my CSV in case something is wrong with the file itself.

I would be okay with two possible answers:

  1. How to fix this drop issue
  2. How to add more builds and update the statistics (a method without drop)
Asked By: Abdall


Answers:

You must specify the axis argument. The default is axis=0, which refers to rows; columns are axis=1.

So, to drop the column named Max, this should be your code:

df = df.drop('Max',axis=1)
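
To make the axis distinction concrete, here is a minimal sketch using an in-memory copy of part of the question's data. The StringIO buffer and the names CSV_TEXT, without_max_column, and row_drop_failed are assumptions for illustration, standing in for newdata.csv. Note that axis=1 removes the column named Max, not the summary row labelled Max, and that the exception type for a missing label differs between pandas versions, so both are caught.

```python
import io
import pandas as pd

# Hypothetical stand-in for newdata.csv, shortened for illustration.
CSV_TEXT = """index,Avg,Min,Max
Build1,56.19,39.123,60.1039
Build2,57.11,40.102,60.2
Max,57.11,40.102,60.2
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# axis=1 targets column labels: this removes the "Max" column.
without_max_column = df.drop('Max', axis=1)
print(list(without_max_column.columns))

# axis=0 (the default) targets row labels. With the default integer index
# the row labels are 0, 1, 2, so 'Max' is not found and pandas raises.
# Older pandas versions raise ValueError, newer ones raise KeyError.
try:
    df.drop('Max', axis=0)
    row_drop_failed = False
except (KeyError, ValueError):
    row_drop_failed = True
print(row_drop_failed)
```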

Edit: looking at this piece of code:

df = pd.read_csv('newdata.csv')
df = df.drop('Max')

The code you used does not specify that the first column of the CSV file contains the index for the DataFrame, so pandas creates an index on the fly. That index is purely numerical (0, 1, 2, ...), so it does not contain "Max".

Try the following:

df = pd.read_csv("newdata.csv",index_col=0)
df = df.drop("Max",axis=0)

This forces pandas to use the first column of the CSV file as the index, so the row labelled "Max" can be found by name and the code should now work.
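
Building on the index_col approach, the full update cycle from the question (drop the old summary rows, add a build, recompute the statistics) might be sketched as follows. This is only one possible way to do it: the Build4 values are made up for illustration, the StringIO buffer stands in for newdata.csv, and STAT_LABELS, builds, and stats are hypothetical names.

```python
import io
import pandas as pd

# Hypothetical stand-in for newdata.csv as produced by the question's script.
CSV_TEXT = """index,Avg,Min,Max
Build1,56.19,39.123,60.1039
Build2,57.11,40.102,60.2
Build3,55.1134,35.129404123,60.20121
Max,57.11,40.102,60.20121
Min,55.1134,35.129404123,60.1039
Average,56.1378,38.1181347077,60.16837
"""

STAT_LABELS = ['Max', 'Min', 'Average']

# index_col=0 makes the first CSV column the row index.
df = pd.read_csv(io.StringIO(CSV_TEXT), index_col=0)

# Remove the old summary rows by label (axis=0 is the default).
builds = df.drop(STAT_LABELS)

# Add the new build; these numbers are invented for the example.
builds.loc['Build4'] = [58.0, 41.0, 61.0]

# Recompute the summary rows over the build rows only.
stats = builds.agg(['max', 'min', 'mean']).rename(
    index={'max': 'Max', 'min': 'Min', 'mean': 'Average'})

# Stack builds and fresh statistics, keeping the index header name.
df_out = pd.concat([builds, stats])
df_out.index.name = 'index'
print(df_out)
```

Calling df_out.to_csv('newdata.csv') afterwards would write the index as the first column, so the file can be read back with index_col=0 on the next run.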

Answered By: error

To delete a particular column in pandas, simply do:

del df['Max']
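
Keep in mind that del works on columns and modifies the DataFrame in place, whereas the question is about rows. As a rough sketch of the difference (CSV_TEXT and builds_only are hypothetical names, and the data is a shortened copy of the question's file), rows can also be removed without setting an index by filtering on the label column:

```python
import io
import pandas as pd

# Hypothetical stand-in for newdata.csv, shortened for illustration.
CSV_TEXT = """index,Avg,Min,Max
Build1,56.19,39.123,60.1039
Build2,57.11,40.102,60.2
Max,57.11,40.102,60.2
Min,56.19,39.123,60.1039
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# Row removal without touching the index: keep only rows whose
# 'index' column value is not one of the summary labels.
builds_only = df[~df['index'].isin(['Max', 'Min', 'Average'])]
print(builds_only)

# del removes the column named 'Max' from df in place.
del df['Max']
print(list(df.columns))
```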
Answered By: glegoux