Using Pandas to dynamically replace values found in other columns

Question

I have a dataset looks like this:

Car	Make	Engine
Toyota Rav 4 8cyl6L	Toyota	8cyl6L
Mitsubishi Eclipse 2.1T	Mitsubishi	2.1T
Monster Gravedigger 25Lsc	Monster	25Lsc

The data was clearly concatenated from Make + Model + Engine at some point but the car Model was not provided to me.

I’ve been trying to use Pandas to say that if we take Car, replace instances of Make with a nothing, replace instances of Engine with nothing, then trim the spaces around the result, we will have Model.

Car	Make	Model	Engine
Toyota Rav 4 8cyl6L	Toyota	Rav 4	8cyl6L
Mitsubishi Eclipse 2.1T	Mitsubishi	Eclipse	2.1T
Monster Gravedigger 25Lsc	Monster	Gravedigger	25Lsc

There’s something I’m doing wrong when I’m trying to reference another column in this manner.

df['Model'] = df['Car'].str.replace(df['Make'],'')

gives me an error of "unhashable type: ‘Series’". I’m guessing I’m accidentally inputting the entire ‘Make’ column.

At every row I want to make a different substitution using data from other columns in that row. How would I accomplish this?

Asked By: bobh2os

||

Source

Answer 1

you can use:

df['Model']=df.apply(lambda x: x['Car'].replace(x['Make'],"").replace(x['Engine'],""),axis=1)
print(df)
'''
                         Car        Make          Model  Engine
0        Toyota Rav 4 8cyl6L      Toyota         Rav 4   8cyl6L
1    Mitsubishi Eclipse 2.1T  Mitsubishi       Eclipse     2.1T
2  Monster Gravedigger 25Lsc     Monster   Gravedigger    25Lsc
'''

Answered By: Clegane

Answer 2

A regex proposition using re.sub :

import re

df['Model'] = [re.sub(f'{b}|{c}', '', a) for a,b,c in zip(df['Car'], df['Make'], df["Engine"])]

# Output :

print(df)

                         Car        Make          Model  Engine
0        Toyota Rav 4 8cyl6L      Toyota         Rav 4   8cyl6L
1    Mitsubishi Eclipse 2.1T  Mitsubishi       Eclipse     2.1T
2  Monster Gravedigger 25Lsc     Monster   Gravedigger    25Lsc

Answered By: abokey

Using Pandas to dynamically replace values found in other columns

Question:

Answers:

# Output :