How to get first n characters from another column that doesn't contain specific characters
Question:
I have this dataframe
ID | product name |
---|---|
1BJM10 | 1BJM10_RS2022_PK |
 | L_RS2022_PK |
 | 2PKL10_RS2022_PK |
 | 3BDG10_RS2022_PK |
1BJM10 | 1BJM10_RS2022_PK |
My desired output is like this
ID | product name |
---|---|
1BJM10 | 1BJM10_RS2022_PK |
– | L_RS2022_PK |
2PKL10 | 2PKL10_RS2022_PK |
3BDG10 | 3BDG10_RS2022_PK |
1BJM10 | 1BJM10_RS2022_PK |
The 2nd row shouldn’t get an ID because it has "_" in the first 6 characters of the product name.
I have tried this code, but it doesn’t work:
df.loc[df['ID'].isna()] = df['ID'].fillna(~df['product name'].str[:6].contains("_"))
Answers:
Chain both conditions with & (bitwise AND), using a helper Series:
s = df['product name'].str[:6]
df.loc[df['ID'].isna() & ~s.str.contains("_"), 'ID'] = s
print(df)
       ID      product name
0  1BJM10  1BJM10_RS2022_PK
1     NaN       L_RS2022_PK
2  2PKL10  2PKL10_RS2022_PK
3  3BDG10  3BDG10_RS2022_PK
4  1BJM10  1BJM10_RS2022_PK
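For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the empty ID cells in the question are NaN (the question's own use of isna() suggests this). The names prefix and mask are only illustrative, and regex=False is added so that "_" is matched as a literal character.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; missing IDs are assumed to be NaN.
df = pd.DataFrame({
    "ID": ["1BJM10", np.nan, np.nan, np.nan, "1BJM10"],
    "product name": [
        "1BJM10_RS2022_PK",
        "L_RS2022_PK",
        "2PKL10_RS2022_PK",
        "3BDG10_RS2022_PK",
        "1BJM10_RS2022_PK",
    ],
})

# Helper Series: the first 6 characters of each product name.
prefix = df["product name"].str[:6]

# Rows to fill: ID is missing AND the 6-character prefix has no underscore.
mask = df["ID"].isna() & ~prefix.str.contains("_", regex=False)

# Assignment aligns on the index, so only the masked rows receive a value.
df.loc[mask, "ID"] = prefix

print(df)
```

Rows that already have an ID are left alone, and the L_RS2022_PK row stays NaN because its prefix contains an underscore.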
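Alternatively, take the text before the first underscore. Note that this overwrites every existing ID and leaves an empty string (not NaN) when the underscore falls within the first 6 characters or is missing: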
df['ID'] = df['product name'].apply(lambda x: x[:x.find('_')] if x.find('_')>=6 else '')
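The apply one-liner above assigns to the whole ID column. If existing IDs should be kept, a variant closer to the fillna attempt in the question could look like the sketch below. It is a different technique from the answer above, not a rewrite of it. It assumes missing IDs are NaN, and the names prefix and candidate are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; missing IDs are assumed to be NaN.
df = pd.DataFrame({
    "ID": ["1BJM10", np.nan, np.nan, np.nan, "1BJM10"],
    "product name": [
        "1BJM10_RS2022_PK",
        "L_RS2022_PK",
        "2PKL10_RS2022_PK",
        "3BDG10_RS2022_PK",
        "1BJM10_RS2022_PK",
    ],
})

# First 6 characters of each product name.
prefix = df["product name"].str[:6]

# Keep the prefix only where it contains no underscore; other rows become NaN.
candidate = prefix.where(~prefix.str.contains("_", regex=False))

# fillna only touches rows where ID is missing, so existing IDs are preserved.
df["ID"] = df["ID"].fillna(candidate)

print(df)
```

The attempt in the question fails for two reasons: `.str[:6].contains("_")` is missing a second `.str` accessor, and even with that fixed, fillna would be filling with True/False values rather than with the 6-character prefix.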