python pandas: case insensitive drop column

Question:

I have a df and I want to drop a column by label but in a case insensitive way. Note: I don’t want to change anything in my df so I’d like to avoid ‘str.lower’.

Here's my df:

print(df)

Name UnweightedBase  Base     q6a1    q6a2    q6a3    q6a4    q6a5   q6a6 eSubTotal
Name                                                                               
Base           1006  1006  100,00%  96,81%  96,81%  96,81%  96,81%  3,19%   490,44%
q6_6             31    32  100,00%       -       -       -       -      -         -
q6_3           1006  1006   43,44%  26,08%  13,73%   9,22%   4,34%  3,19%   100,00%
q6_4           1006  1006   31,78%  31,71%  20,09%  10,37%   2,87%  3,19%   100,00%

Is there any magic I can apply to the code below?

df.drop(['unWeightedbase', 'Q6A1'], axis=1)
Asked By: Boosted_d16

||

Answers:

I think what you can do is create a function to perform the case-insensitive search for you:

In [90]:
import numpy as np
import pandas as pd

# create a noddy df
df = pd.DataFrame({'UnweightedBase': np.arange(5)})
print(df.columns)
# create a list of the column names
col_list = list(df)
# define our function to perform the case-insensitive search
def find_col_name(name):
    try:
        # next() returns the first column whose lowercased name matches the
        # lowercased search term, and raises StopIteration if none matches
        return next(v for v in col_list if v.lower() == name.lower())
    except StopIteration:
        # no match: return an empty string (df.drop would then raise a KeyError)
        return ''
df.drop(find_col_name('unweightedbase'), axis=1)
Index(['UnweightedBase'], dtype='object')
Out[90]:
Empty DataFrame
Columns: []
Index: [0, 1, 2, 3, 4]

My search code is based on this SO question: find the index of a string ignoring cases

Answered By: EdChum

A similar option to EdChum’s answer would be to define a general function that performs a case-insensitive search for a group of strings, and use that function to find the names of the columns to drop.

import pandas as pd

def find_case_insensitive(strings, search_for):
    """Find strings by searching for case-insensitive matches."""
    lowercase_search = [s.lower() for s in search_for]
    return [val for val in strings if val.lower() in lowercase_search]

df = pd.DataFrame(
    {
        "UnweightedBase": [1006, 31, 1006, 1006],
        "q6a1": [100.0, 100.0, 43.44, 31.78],
    }
)
empty_df = df.drop(
    columns=find_case_insensitive(df.columns, ["unWeightedbase", "Q6A1"])
)

(The function above will ignore any searched strings that didn’t have a match; depending on how the function will be used, a different behavior may be better.)
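As one example of such a different behavior, here is a minimal sketch of a stricter variant that raises an error when any searched name has no match. The function name find_case_insensitive_strict and the choice of KeyError are my own assumptions for illustration, not part of the answer above.

```python
import pandas as pd


def find_case_insensitive_strict(strings, search_for):
    """Like find_case_insensitive, but raise if a searched name has no match.

    Hypothetical helper for illustration; the name and the KeyError are assumptions.
    """
    # Lowercased forms of everything we can match against.
    available = {val.lower() for val in strings}
    # Collect searched names that match nothing, so the error lists all of them.
    missing = [s for s in search_for if s.lower() not in available]
    if missing:
        raise KeyError(f"no case-insensitive match for: {missing}")
    wanted = {s.lower() for s in search_for}
    # Return the original spellings, in their original order.
    return [val for val in strings if val.lower() in wanted]


df = pd.DataFrame({"UnweightedBase": [1006, 31], "q6a1": [100.0, 43.44]})
to_drop = find_case_insensitive_strict(df.columns, ["unWeightedbase", "Q6A1"])
remaining = df.drop(columns=to_drop)
```

Raising early makes a misspelled column name visible instead of silently dropping nothing; whether that is preferable depends on how the function is used.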

You could also define a helper function for dropping DataFrame columns using case-insensitive names.

def drop_columns_case_insensitive(df, cols):
    """Drop columns from a DataFrame using case-insensitive column names."""
    return df.drop(columns=find_case_insensitive(df.columns, cols))

empty_df = drop_columns_case_insensitive(df, ["unWeightedbase", "Q6A1"])
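On the concern about not changing the df: lowercasing the labels only for the comparison does not modify the DataFrame, because df.columns.str.lower() returns a new Index. A minimal sketch of that idea using a boolean mask instead of a helper function (the sample data and the variable names are illustrative assumptions):

```python
import pandas as pd

df = pd.DataFrame(
    {
        "UnweightedBase": [1006, 31],
        "Base": [1006, 32],
        "q6a1": [100.0, 43.44],
    }
)

names_to_drop = ["unWeightedbase", "Q6A1"]
# Lowercase only for the comparison; df.columns itself is left untouched.
mask = df.columns.str.lower().isin([name.lower() for name in names_to_drop])
# Keep every column whose lowercased label is not in the drop list.
result = df.loc[:, ~mask]
```

Here result keeps only the columns that did not match, while df still has its original, mixed-case labels.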
Answered By: Sky