Remove certain words from column names

Question:

I have transformed a dataset that has two categorical variables, Name and Year, into dummy variables. As a result I have 433 columns and I would like to know if there’s a way to remove the words "Name_" and "Year_" without having to rename all of them by hand.

The only solutions I’ve found involve renaming every column manually. Is there a way to do this in bulk, the way one would strip certain keywords or URLs out of a block of text?

[Image: DataFrame after the transformation]

Asked By: syntax_of_vectors


Answers:

A regex might be more concise, but this should work (the slice x[5:] relies on both prefixes being exactly five characters long):

out = df.rename(columns=lambda x: x[5:] if x.startswith("Name_") or x.startswith("Year_") else x)
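
If the prefixes were ever of different lengths, the fixed slice above would no longer fit. A minimal sketch of a length-independent variant follows; the sample frame, the PREFIXES tuple, and the strip_prefix helper are illustrative assumptions, not part of the original answer:

```python
import pandas as pd

# Hypothetical sample frame standing in for the dummy-encoded data.
df = pd.DataFrame({"Name_Alice": [1, 0], "Year_2020": [0, 1], "Score": [3, 4]})

# Assumed prefixes, taken from the question.
PREFIXES = ("Name_", "Year_")

def strip_prefix(col):
    """Return col without its leading prefix, or unchanged if none matches."""
    for prefix in PREFIXES:
        if col.startswith(prefix):
            return col[len(prefix):]
    return col

# rename returns a new DataFrame; df itself keeps its original column names.
out = df.rename(columns=strip_prefix)
print(list(out.columns))  # ['Alice', '2020', 'Score']
```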
Answered By: Chrysophylaxs

Using a regex:

df.columns = df.columns.str.replace('^(Name|Year)_', '', regex=True)
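
As a rough end-to-end illustration, the sketch below builds a small frame with pd.get_dummies and applies the same pattern. The raw data and variable names are made up for the example. The ^ anchor means only a leading "Name_" or "Year_" is removed:

```python
import pandas as pd

# Made-up data with the two categorical variables from the question.
raw = pd.DataFrame({"Name": ["Ann", "Bob"], "Year": [2020, 2021], "Value": [1.5, 2.5]})

# get_dummies prefixes each dummy column with its source column name.
dummies = pd.get_dummies(raw, columns=["Name", "Year"])
print(list(dummies.columns))
# ['Value', 'Name_Ann', 'Name_Bob', 'Year_2020', 'Year_2021']

# Strip the leading "Name_" or "Year_" from every column name in place.
dummies.columns = dummies.columns.str.replace(r"^(Name|Year)_", "", regex=True)
print(list(dummies.columns))
# ['Value', 'Ann', 'Bob', '2020', '2021']
```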
Answered By: mozway

Yes, you can rename all the columns of a pandas DataFrame at once instead of one by one.

Here is an example:

import pandas as pd

# Load your dataframe
df = pd.read_csv('my_data.csv')

# Get the list of column names
column_names = df.columns

# Build the new names by removing "Name_" and "Year_" (note: str.replace removes every occurrence, not only a leading prefix)
new_column_names = [name.replace('Name_', '').replace('Year_', '') for name in column_names]

# Assign the new list of column names to the dataframe
df.columns = new_column_names
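
One caveat worth illustrating: str.replace also removes matches in the middle of a name. The sketch below compares it with str.removeprefix, which assumes Python 3.9 or later; the column names are hypothetical:

```python
# Hypothetical column names; the last one contains "Name_" in the middle.
cols = ["Name_Ann", "Year_2020", "Nick_Name_Bob"]

# str.replace removes every occurrence of the substring.
anywhere = [c.replace("Name_", "").replace("Year_", "") for c in cols]
print(anywhere)  # ['Ann', '2020', 'Nick_Bob']

# str.removeprefix (Python 3.9+) only removes a leading match.
prefix_only = [c.removeprefix("Name_").removeprefix("Year_") for c in cols]
print(prefix_only)  # ['Ann', '2020', 'Nick_Name_Bob']
```

If none of the column names contain the prefixes anywhere else, both approaches give the same result.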
Answered By: darkjant