Create a new column based on whether a string ends with certain substrings

Question:

I have a data frame and a list. I want to check whether the strings in a column end with any of the items in the list, and create a new column whose value is “Y” if they do and “N” otherwise. My data frame looks like the following:

import pandas as pd
city = ['New York', 'Los Angeles', 'Buffalo', 'Miami', 'San Deigo', 'San Francisco']
population = ['8.5','3.9','0.25','0.45','1.4','0.87']
df = pd.DataFrame({'city':city,'population':population})

ending = ['les','sco', 'igo']

The expected result should look like this:

city          population    flag
New York       8.5          N
Los Angeles    3.9          Y
Buffalo        0.25         N
Miami          0.45         N
San Deigo      1.4          Y
San Francisco  0.87         Y

I tried to use an if statement:

if df['city'].str.endswith(tuple(ending)):
   val = 'Y'
elif df['city'].str.endswith(tuple(ending)):
    val= 'Y'
else:
   val = 'N'

I get this error message:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Any suggestions? Thanks!

Asked By: Bonjuga Lewis

||

Answers:

Assuming the ending is always a three-character string, you could use:

df['flag']=df['city'].map(lambda x: x[-3:] in ending) 

which produces

            city population   flag
0       New York        8.5  False
1    Los Angeles        3.9   True
2        Buffalo       0.25  False
3          Miami       0.45  False
4      San Deigo        1.4   True
5  San Francisco       0.87   True

If you really need the binary outcome to be Y/N instead of True/False, you could perform another map:

def to_flag(arg):
    # Map the boolean result to the requested 'Y'/'N' labels
    if arg:
        return 'Y'
    return 'N'

df['flag'] = df['flag'].map(to_flag)

which results in

            city population flag
0       New York        8.5    N
1    Los Angeles        3.9    Y
2        Buffalo       0.25    N
3          Miami       0.45    N
4      San Deigo        1.4    Y
5  San Francisco       0.87    Y
Answered By: DWD

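The any built-in function can help when applied to each city string:

# A Series has no endswith method, so test each string individually
df['flag'] = df['city'].map(lambda s: 'Y' if any(s.endswith(e) for e in ending) else 'N')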
Answered By: Patti

You can use pd.Series.isin followed by pd.Series.map with a dictionary mapping. This solution tests specifically the last 3 characters. Otherwise, use @Wen’s solution.

ending = ['les', 'sco', 'igo']
mapper = {True: 'Y', False: 'N'}

df['flag'] = df['city'].str[-3:].isin(ending).map(mapper)

print(df)

            city population flag
0       New York        8.5    N
1    Los Angeles        3.9    Y
2        Buffalo       0.25    N
3          Miami       0.45    N
4      San Deigo        1.4    Y
5  San Francisco       0.87    Y
Answered By: jpp

Using str.endswith; this does not require the strings in ending to have the same length:

df.city.str.endswith(tuple(ending)).map({True:'Y',False:'N'})
0    N
1    Y
2    N
3    N
4    Y
5    Y
Name: city, dtype: object
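
The if statement in the question fails because str.endswith returns a whole boolean Series rather than a single True or False, so the result has to be mapped row by row. As a minimal, self-contained sketch of the approach above, the snippet below assigns the result to a flag column. The list mixed_ending is a hypothetical example (not from the question) chosen to show suffixes of different lengths, and na=False is an added assumption that missing values should count as non-matches.

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["New York", "Los Angeles", "Buffalo", "Miami", "San Deigo", "San Francisco"],
    "population": ["8.5", "3.9", "0.25", "0.45", "1.4", "0.87"],
})

# Hypothetical suffixes of lengths 4, 3 and 2 (illustration only)
mixed_ending = ["York", "les", "mi"]

# str.endswith accepts a tuple of suffixes and returns one boolean per row;
# na=False makes missing values count as "does not match"
mask = df["city"].str.endswith(tuple(mixed_ending), na=False)

# Translate the booleans into the requested labels
df["flag"] = mask.map({True: "Y", False: "N"})
print(df)
```

Because the match is done by str.endswith rather than by slicing a fixed number of characters, the suffixes do not need a common length.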
Answered By: BENY

import numpy as np

col = "city"
# Boolean mask: True where the city ends with any of the suffixes
mask = df[col].str.endswith(tuple(ending))
conditions = [mask, ~mask]
choices = ["Y", "N"]
# Use a string default so it shares a dtype with the string choices
df["flag"] = np.select(conditions, choices, default="N")
Answered By: trying_to_be_a_dev
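
When there are only two outcomes, np.where may be a simpler alternative to the np.select call above. This is a sketch of that swapped-in technique, not the answerer's method. It rebuilds the question's data so it runs on its own, and na=False is an added assumption that missing values should be flagged "N".

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "city": ["New York", "Los Angeles", "Buffalo", "Miami", "San Deigo", "San Francisco"],
    "population": ["8.5", "3.9", "0.25", "0.45", "1.4", "0.87"],
})
ending = ["les", "sco", "igo"]

# One boolean per row: does the city end with any of the suffixes?
mask = df["city"].str.endswith(tuple(ending), na=False)

# np.where picks "Y" where the mask is True and "N" elsewhere
df["flag"] = np.where(mask, "Y", "N")
print(df)
```

np.select is more useful once there are three or more mutually exclusive conditions; for a plain yes/no flag, a single mask is enough.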