Split and type-cast column values using Pandas

Question:

How do I add an extra column to a DataFrame that splits each value on "|" and converts the pieces to integers, but holds np.nan where the value is a plain string?

Col1   
1|2|3
"string"

so that the result looks like this:

Col1      ExtraCol
1|2|3     [1,2,3]
"string"  nan

I tried a long, contorted way, but it failed:

df['extracol'] = df["col1"].str.strip().str.split("|").str[0].apply(lambda x: x.astype(np.float) if x.isnumeric() else np.nan).astype("Int32")
Asked By: Rohit Sharma


Answers:

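You can use a regex with Series.str.match to find the rows whose value can be split into a list, then split only those rows. Rows that do not match are left as NaN when the result is assigned back:

df['ExtraCol'] = df.loc[df['Col1'].str.match(r'\|?\d+\|?'), 'Col1'].str.split('|')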
Answered By: SomeDude
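
Note that Series.str.split returns lists of strings, and Series.str.match only anchors the pattern at the start of the value. A minimal sketch of a variation that yields integers, assuming the two-row sample frame from the question; it swaps in Series.str.fullmatch (available in pandas 1.1 and later) with a stricter pattern, which is an adjustment and not part of the answer above:

```python
import pandas as pd

# Sample data assumed from the question
df = pd.DataFrame({"Col1": ["1|2|3", "string"]})

# True only where the whole cell is digits separated by single pipes;
# na=False treats missing values as non-matching
mask = df["Col1"].str.fullmatch(r"\d+(?:\|\d+)*", na=False)

# Split the matching rows and convert every piece to int.
# Rows outside the mask become NaN through index alignment on assignment.
df["ExtraCol"] = (
    df.loc[mask, "Col1"]
    .str.split("|")
    .apply(lambda parts: [int(p) for p in parts])
)

print(df)
```

Because the assignment aligns on the index, no explicit fill step is needed for the non-matching rows.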

Another possible solution:

import re
import numpy as np

df['ExtraCol'] = df['Col1'].apply(lambda x: [int(y) for y in re.split(
    r'\|', x)] if x.replace('|', '').isnumeric() else np.nan)

Output:

     Col1   ExtraCol
0   1|2|3  [1, 2, 3]
1  string        NaN
Answered By: PaulS
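
Both answers decide up front whether a value looks numeric. A hedged alternative is to attempt the conversion and fall back to NaN when it fails, which also covers inputs such as empty pieces ("4||5") or negative numbers. The helper name to_int_list and the extra sample rows below are hypothetical, added only for illustration:

```python
import numpy as np
import pandas as pd


def to_int_list(value):
    """Hypothetical helper: list of ints, or NaN if any piece is not an integer."""
    try:
        return [int(part) for part in str(value).strip().split("|")]
    except ValueError:
        # Raised by int() for pieces such as "string" or ""
        return np.nan


# The first two rows come from the question; the last two are assumed edge cases
df = pd.DataFrame({"Col1": ["1|2|3", "string", "4||5", "-7|8"]})
df["ExtraCol"] = df["Col1"].apply(to_int_list)

print(df)
```

This keeps the parsing rule in one place: whatever int() accepts is treated as a number, and everything else becomes NaN.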