Pandas split column if condition, else null

Question:

I want to split a column. If it has a letter (any letter) at the end, this will be the value for the second column. Otherwise, the second column should be null

import pandas as pd
data = pd.DataFrame({"data": ["0.00I", "0.01E", "99.99", "0.14F"]})

desired result:

    a    b
0 0.00   I
1 0.01   E
2 99.99  None
3 0.14   F
Asked By: OSUDamian

||

Answers:

You can use str.extract with the (d+(?:.d+)?)(D)? regex:

out = data['data'].str.extract(r'(d+(?:.d+)?)(D)?').set_axis(['a', 'b'], axis=1)

Or, if you want to remove the original ‘data’ column while adding new columns in place:

data[['a', 'b']] = data.pop('data').str.extract('(d+(?:.d+)?)(D)?')

output:

       a    b
0   0.00    I
1   0.01    E
2  99.99  NaN
3   0.14    F

regex demo

(d+(?:.d+)?)  # capture a number (with optional decimal)
(D)?            # optionally capture a non-digit
Answered By: mozway

try using a regular expression and apply

def find_numeric(x):
    if re.search("[[0-9]+.[0-9]+", x):
        return re.search("[0-9]+.[0-9]+", x).group()
    else:
        return ""

def find_alpha(x):
    if re.search("[a-zA-Z]+", x):
        return re.search("[a-zA-Z]+", x).group()
    else:
        return "None"
    
data = pd.DataFrame({"data": ["0.00I", "0.01E", "99.99", "0.14F"]})    

data["numeric"] = data["data"].apply(find_numeric)

data["alpha"]= data["data"].apply(find_alpha)

print(data)

output:

data numeric alpha
0  0.00I    0.00     I
1  0.01E    0.01     E
2  99.99   99.99  None
3  0.14F    0.14     F

Answered By: Golden Lion
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.