Rename columns based on a certain pattern

Question

I have the following columns in a dataframe: Id, Category2, Brandyqdy1, Brandyqwdwdy2, Brandyqdw3

If a column’s name starts with Brand and ends with 1, I need it renamed to Vans. The other Brand columns should be renamed the same way, according to the digit they end with:
rename_brands = {'1': 'Vans', '2': 'Nike', '3': 'Adidas'}

I will also be renaming columns other than the ones that start with Brand. Overall:
rename_columns = {'Id': 'record', 'Category2': 'Sku', '1': 'Vans', '2': 'Nike', '3': 'Adidas'}

Asked By: M J

Answer 1

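You can chain two rename calls: one with the dictionary for the exact names, and one with a function for the Brand pattern. For the pattern-based rename, use re.sub with a function as the replacement. The keys of rename_brands must be strings, because m.group(1) returns a string.

import re


rename_brands = {'1': 'Vans', '2': 'Nike', '3': 'Adidas'}
rename_columns = {'Id': 'record', 'Category2': 'Sku', '1': 'Vans', '2': 'Nike', '3': 'Adidas'}

# First rename the exact matches, then rewrite the Brand* columns by their trailing digit.
out = (df.rename(columns=rename_columns)
       .rename(columns=lambda col: re.sub(r'^Brand.*(\d)$',
                                          # Unknown digits fall back to the original name.
                                          lambda m: rename_brands.get(m.group(1), m.group(0)),
                                          col)))

$ print(df)

   Id  Category2  Brandyqdy1  Brandyqwdwdy2  Brandyqdw3   1   2
0 NaN        NaN         NaN            NaN         NaN NaN NaN


$ print(out)

   record  Sku  Vans  Nike  Adidas  Vans  Nike
0     NaN  NaN   NaN   NaN     NaN   NaN   NaN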
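
A variant of the same idea, as a minimal sketch: both rules are folded into one function that is passed to a single rename call. The sample values, the rename_other dictionary, and the rename_column helper are assumptions made for illustration; only the column names and brand mapping come from the question.

```python
import re

import pandas as pd

# Hypothetical sample frame using the column names from the question.
df = pd.DataFrame(
    [[1, "A", 10, 20, 30]],
    columns=["Id", "Category2", "Brandyqdy1", "Brandyqwdwdy2", "Brandyqdw3"],
)

rename_brands = {"1": "Vans", "2": "Nike", "3": "Adidas"}
rename_other = {"Id": "record", "Category2": "Sku"}  # assumed split of the mapping

# Raw string so that \d reaches the regex engine as "any digit".
brand_pattern = re.compile(r"^Brand.*(\d)$")


def rename_column(col):
    """Return the new name for `col`, or `col` itself if no rule applies."""
    if col in rename_other:
        return rename_other[col]
    match = brand_pattern.match(col)
    if match:
        # Trailing digits missing from rename_brands keep the original name.
        return rename_brands.get(match.group(1), col)
    return col


out = df.rename(columns=rename_column)
print(list(out.columns))
```

Keeping the exact-name mapping separate from the digit mapping avoids accidentally renaming a column that is literally called '1' or '2', which may or may not be what you want.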

Answered By: Ynjxsjmh

Answer 2

Solution

Select the non-Brand columns from the dataframe as df1, and the columns whose names start with ‘Brand’ as df2.
Use a for loop to rename each column in df2 that ends with a number to the matching entry in the brands dictionary.
Join df1 and df2 back together.

Sample Code

import pandas as pd

df = pd.DataFrame({
    'Id':['001', '002'], 
    'Category':['A', 'S'],
    'Brandtxsu1':[1, 1],
    'Brandxyw2':[2, 2]
})

print(df)

print('------------------------------------')

brands = {'1': 'Vans', '2': 'Nike'}

df1 = df[['Id', 'Category']].rename(columns={'Category': 'Record'})

df2 = df.loc[:, df.columns.str.startswith('Brand')]

# Rename each Brand column to the brand that matches its trailing digit.
for suffix, brand in brands.items():
    matches = df2.columns[df2.columns.str.endswith(suffix)]
    df2 = df2.rename(columns={col: brand for col in matches})

df_output = df1.join(df2)

print(df_output)

Output

    Id Category  Brandtxsu1  Brandxyw2
0  001        A           1          2
1  002        S           1          2
------------------------------------
    Id Record  Vans  Nike
0  001      A     1     2
1  002      S     1     2
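
If splitting and re-joining the frame feels heavy, the same result can probably be reached by building one mapping and calling rename once. This is only a sketch: it assumes every Brand column ends in a single digit, and the name mapping is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Id": ["001", "002"],
    "Category": ["A", "S"],
    "Brandtxsu1": [1, 1],
    "Brandxyw2": [2, 2],
})

brands = {"1": "Vans", "2": "Nike"}

# Map each Brand column to a brand name via its last character.
# Columns whose last character is not a key of `brands` are left out,
# so rename() leaves them unchanged.
mapping = {
    col: brands[col[-1]]
    for col in df.columns
    if col.startswith("Brand") and col[-1] in brands
}
mapping["Category"] = "Record"

df_output = df.rename(columns=mapping)
print(df_output)
```

Because rename keeps the original column order, no join is needed afterwards.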

Answered By: Brian.Z
