Pandas assign value to dataframe column based on two lists using one column of the dataframe as index

Question:

I have a dictionary brands_list = {"b1": {"name": "brand1"}, "b2": {"name": "brand2"}} and a list actual_brands = ["brand1"], plus a Pandas dataframe with a column brand containing "b1", "b1", "b1", "b2", "b1". I want to set column is_brand_present to 1 when the brands_list entry keyed by the row's brand value is in actual_brands, and to 0 otherwise.

I tried the following using numpy's where:

brands_list = {"b1": "brand1", "b2": "brand2"}
actual_brands = ["brand1"]

data_frame["is_brand_present"] = np.where(
    brands_list[data_frame["brand"]].isin(actual_brands), 1, 0
)

I expect the content of column is_brand_present to be 1,1,1,0,1, but I’m getting this error:

TypeError: unhashable type: 'Series'

How can I evaluate this condition for each row?

Asked By: HuLu ViCa


Answers:

IIUC, you want to map the 'brand' column through the dictionary and then check whether each mapped name is in actual_brands:

df['is_brand_present'] = df['brand'].map(brands_list).isin(actual_brands).astype(int)
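
As a minimal, self-contained sketch of the line above (assuming the flat brands_list and the sample data from the question), the following shows why indexing the dictionary with a whole Series fails and how map performs the lookup element by element instead. The names lookup_error and names are introduced here only for illustration.

```python
import pandas as pd

# Flat mapping and sample data taken from the question.
brands_list = {"b1": "brand1", "b2": "brand2"}
actual_brands = ["brand1"]
df = pd.DataFrame({"brand": ["b1", "b1", "b1", "b2", "b1"]})

# Indexing a dict with a whole Series asks Python to hash the Series,
# which is what raises the TypeError reported in the question.
try:
    brands_list[df["brand"]]
    lookup_error = None
except TypeError as exc:
    lookup_error = exc

# map() looks up each code individually and returns a Series of brand names.
names = df["brand"].map(brands_list)
df["is_brand_present"] = names.isin(actual_brands).astype(int)

print(df["is_brand_present"].tolist())  # [1, 1, 1, 0, 1]
```

The astype(int) step only converts the boolean result of isin to 1 and 0, so np.where is not needed here.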

If your brands_list is nested as per your update to question, you can use:

df['is_brand_present'] = df['brand'].map({k: v['name']
                 for k, v in brands_list.items()
                 }).isin(actual_brands).astype(int)

print(df):

  brand  is_brand_present
0    b1                 1
1    b1                 1
2    b1                 1
3    b2                 0
4    b1                 1
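
One case the sample data does not cover is a brand code with no entry in brands_list. A small sketch, assuming the nested dictionary from the question and a hypothetical extra code "b3" that is not in the original data, suggests that map returns NaN for such a code and isin then treats it as absent:

```python
import pandas as pd

# Nested mapping from the question.
brands_list = {"b1": {"name": "brand1"}, "b2": {"name": "brand2"}}
actual_brands = ["brand1"]

# "b3" is a hypothetical code with no entry in brands_list.
df = pd.DataFrame({"brand": ["b1", "b2", "b3"]})

# Flatten the nested dict to code -> name before mapping.
code_to_name = {code: info["name"] for code, info in brands_list.items()}

names = df["brand"].map(code_to_name)  # the unknown code "b3" maps to NaN
df["is_brand_present"] = names.isin(actual_brands).astype(int)

print(df["is_brand_present"].tolist())  # [1, 0, 0]
```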
Answered By: SomeDude

We can first collect the keys whose name is in actual_brands, then check the brand column against that list:

l = [x for x, y in brands_list.items() if y['name'] in actual_brands ]


df['is_brand_present'] = df.brand.isin(l).astype(int)
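
A self-contained version of this approach, assuming the nested brands_list and the sample data from the question (present_codes is just a more descriptive name for l):

```python
import pandas as pd

# Nested mapping and sample data taken from the question.
brands_list = {"b1": {"name": "brand1"}, "b2": {"name": "brand2"}}
actual_brands = ["brand1"]
df = pd.DataFrame({"brand": ["b1", "b1", "b1", "b2", "b1"]})

# Keep only the codes whose brand name appears in actual_brands.
present_codes = [
    code for code, info in brands_list.items()
    if info["name"] in actual_brands
]

# Test the raw codes directly; no per-row dictionary lookup is needed.
df["is_brand_present"] = df["brand"].isin(present_codes).astype(int)

print(present_codes)                    # ['b1']
print(df["is_brand_present"].tolist())  # [1, 1, 1, 0, 1]
```

Here the dictionary is scanned once up front, and the column is only compared against a short list of codes.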
Answered By: BENY