split columns, extract numbers, and subtract difference

Question:

Hi community! I have the following DataFrame:

import pandas as pd

data = {'exp_lvl': ['5-10 yrs', '3-5 yrs', '1-3 Years']}
df = pd.DataFrame(data)


My goal is the same DataFrame with three extra columns: first (the lower number), second (the upper number), and difference (second minus first).

My approach is to (1) replace values, (2) split, (3) append to a list, and (4) build columns from the appended lists. However, I'm stuck on the last step, and maybe there is an easier way to approach this.

Thanks so much!

Asked By: Victor Castro


Answers:

This was not hard. Just mechanical. Did you make any effort?

import pandas as pd

data = {'exp_lvl': ['5-10 yrs', '3-5 yrs', '1-3 Years']}

# Build the three new columns as plain lists next to the original one.
data['first'] = []
data['second'] = []
data['difference'] = []
for row in data['exp_lvl']:
    # '5-10 yrs' -> '5-10' -> ['5', '10'] -> [5, 10]
    parts = [int(i) for i in row.split(' ')[0].split('-')]
    data['first'].append( parts[0] )
    data['second'].append( parts[1] )
    data['difference'].append( parts[1]-parts[0] )

print(data)
df = pd.DataFrame(data)
print(df)

Output:

C:\tmp>python x.py
{'exp_lvl': ['5-10 yrs', '3-5 yrs', '1-3 Years'], 'first': [5, 3, 1], 'second': [10, 5, 3], 'difference': [5, 2, 2]}
     exp_lvl  first  second  difference
0   5-10 yrs      5      10           5
1    3-5 yrs      3       5           2
2  1-3 Years      1       3           2

C:\tmp>
Answered By: Tim Roberts

Here is another way:

df.join(df['exp_lvl'].str.extractall(r'(\d+)')[0]
 .unstack()
 .rename({0:'first',1:'second'},axis=1)
 .astype(float)
 .assign(diff = lambda x: x['second'] - x['first']))

or

(df.join(
    df['exp_lvl'].str.extract(r'(?P<first>\d+)-(?P<second>\d+)')
    .astype(int)
    .assign(difference = lambda x: x['second'] - x['first'])))

Output:

     exp_lvl  first  second  difference
0   5-10 yrs      5      10           5
1    3-5 yrs      3       5           2
2  1-3 Years      1       3           2
Answered By: rhug123

Use pandas str.split to construct the columns first and second. Then compute the column difference from them.

df[['first', 'second']] = df.exp_lvl.str.split('-| ').str[:2].tolist()
df['difference'] = df['second'].astype(int) - df['first'].astype(int)

Out[103]:
     exp_lvl first second  difference
0   5-10 yrs     5     10           5
1    3-5 yrs     3      5           2
2  1-3 Years     1      3           2
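
The same split-based idea can be written as a self-contained script. This is only a minimal sketch: it assumes every value has the form "<low>-<high> <unit>", it passes expand=True so that str.split returns the pieces as columns directly, and the helper name split_range is made up for illustration.

```python
import pandas as pd


def split_range(df, col="exp_lvl"):
    """Return a copy of df with integer 'first', 'second', 'difference' columns.

    Assumes every value in `col` looks like '<low>-<high> <unit>'.
    (split_range is a hypothetical helper, not part of pandas.)
    """
    out = df.copy()
    # Drop the unit ('yrs', 'Years'), keeping only the '<low>-<high>' part.
    ranges = out[col].str.split(" ").str[0]
    # expand=True returns a DataFrame with one column per piece.
    bounds = ranges.str.split("-", expand=True).astype(int)
    out["first"] = bounds[0]
    out["second"] = bounds[1]
    out["difference"] = out["second"] - out["first"]
    return out


df = pd.DataFrame({"exp_lvl": ["5-10 yrs", "3-5 yrs", "1-3 Years"]})
result = split_range(df)
print(result)
```

Because the pieces are converted to int before being assigned, first and second end up as integer columns here rather than strings.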
Answered By: Andy L.

Another way:

df[['first', 'second']] = df.exp_lvl.str.extract(r'(\d+)-(\d+)')
df['difference'] = df['second'].astype(int) - df['first'].astype(int)
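
One caveat with the extract-based versions: str.extract returns NaN for any row that does not match the pattern, and astype(int) then raises. Below is a hedged sketch of one way to tolerate such rows, using the nullable Int64 dtype. The extra 'fresher' row is invented purely to show the behaviour and is not part of the original data.

```python
import pandas as pd

# The last row is a made-up value that does not match the '<low>-<high>' pattern.
df = pd.DataFrame({"exp_lvl": ["5-10 yrs", "3-5 yrs", "1-3 Years", "fresher"]})

# Named groups become the column names; non-matching rows yield NaN.
bounds = df["exp_lvl"].str.extract(r"(?P<first>\d+)-(?P<second>\d+)")

# to_numeric turns the digit strings into numbers (missing stays missing);
# the nullable Int64 dtype then keeps integers alongside missing values.
bounds = bounds.apply(pd.to_numeric).astype("Int64")
bounds["difference"] = bounds["second"] - bounds["first"]

result = df.join(bounds)
print(result)
```

Rows that do not match keep a missing value in all three new columns instead of stopping the whole conversion.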
Answered By: Nk03