Remove Unnamed columns in pandas dataframe

Question:

I have a data file from columns A-G like below but when I am reading it with pd.read_csv('data.csv') it prints an extra unnamed column at the end for no reason.

colA    ColB    colC    colD    colE    colF    colG    Unnamed: 7
44      45      26      26      40      26      46        NaN
47      16      38      47      48      22      37        NaN
19      28      36      18      40      18      46        NaN
50      14      12      33      12      44      23        NaN
39      47      16      42      33      48      38        NaN

I have seen my data file various times but I have no extra data in any other column. How I should remove this extra column while reading ? Thanks

Asked By: muazfaiz

||

Answers:

df = df.loc[:, ~df.columns.str.contains('^Unnamed')]

In [162]: df
Out[162]:
   colA  ColB  colC  colD  colE  colF  colG
0    44    45    26    26    40    26    46
1    47    16    38    47    48    22    37
2    19    28    36    18    40    18    46
3    50    14    12    33    12    44    23
4    39    47    16    42    33    48    38

NOTE: very often there is only one unnamed column Unnamed: 0, which is the first column in the CSV file. This is the result of the following steps:

  1. a DataFrame is saved into a CSV file using parameter index=True, which is the default behaviour
  2. we read this CSV file into a DataFrame using pd.read_csv() without explicitly specifying index_col=0 (default: index_col=None)

The easiest way to get rid of this column is to specify the parameter pd.read_csv(..., index_col=0):

df = pd.read_csv('data.csv', index_col=0)

First, find the columns that have ‘unnamed’, then drop those columns. Note: You should Add inplace = True to the .drop parameters as well.

df.drop(df.columns[df.columns.str.contains('unnamed',case = False)],axis = 1, inplace = True)
Answered By: Adil Warsi

The pandas.DataFrame.dropna function removes missing values (e.g. NaN, NaT).

For example the following code would remove any columns from your dataframe, where all of the elements of that column are missing.

df.dropna(how='all', axis='columns')
Answered By: Susan

The approved solution doesn’t work in my case, so my solution is the following one:

    ''' The column name in the example case is "Unnamed: 7"
 but it works with any other name ("Unnamed: 0" for example). '''

        df.rename({"Unnamed: 7":"a"}, axis="columns", inplace=True)

        # Then, drop the column as usual.

        df.drop(["a"], axis=1, inplace=True)

Hope it helps others.

Answered By: Ezarate11
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.