Convert an experience string into a number of years using Python

Question:

I have a column with values like
9 years 3 months 22 days. How can I convert such a value into
=9+(3/12)+(22/365)
=9.31

I want the output to be 9.31.

How can I do this in Python? Can anyone help me out?

Asked By: Mallesh

Answers:

If leap years are not important, extract each part with Series.str.extract, divide months by 12 and days by 365, and sum the parts into a new column. Values that are already plain numbers are kept by falling back to pd.to_numeric:

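import pandas as pd

df = pd.DataFrame({'col': ['9 years 3 months 22 days',
                           '3 YEARS 6 MONTHS',
                           '4 YEARS',
                           '3 YEARS',
                           '3.6']})

# Values that are already numeric (e.g. '3.6'); everything else becomes NaN
numeric = pd.to_numeric(df['col'], errors='coerce')

# (?i) makes the match case-insensitive; \d+ captures the number before each unit
y = df['col'].str.extract(r'(?i)(\d+)\s*year', expand=False).astype(float)
m = df['col'].str.extract(r'(?i)(\d+)\s*month', expand=False).astype(float).div(12)
d = df['col'].str.extract(r'(?i)(\d+)\s*day', expand=False).astype(float).div(365)

# Missing parts count as 0; rows with no unit at all fall back to the numeric value
df['float'] = y.add(m, fill_value=0).add(d, fill_value=0).fillna(numeric)
print (df)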
                        col     float
0  9 years 3 months 22 days  9.310274
1          3 YEARS 6 MONTHS  3.500000
2                   4 YEARS  4.000000
3                   3 YEARS  3.000000
4                       3.6  3.600000
Answered By: jezrael
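
The same result can also be reached with a dictionary of weights and a single pass of Series.str.extractall. This is only a minimal sketch and not part of the answer above: the names weights, found and total are made up for illustration, and it keeps the same assumption of 12 months and 365 days per year.

```python
import re
import pandas as pd

df = pd.DataFrame({'col': ['9 years 3 months 22 days',
                           '3 YEARS 6 MONTHS',
                           '4 YEARS',
                           '3.6']})

# Assumed conversion factors: 12 months and 365 days per year (leap years ignored)
weights = {'year': 1.0, 'month': 1 / 12, 'day': 1 / 365}

# One row per (number, unit) match, indexed by the original row label
found = df['col'].str.extractall(r'(?P<value>\d+)\s*(?P<unit>year|month|day)',
                                 flags=re.IGNORECASE)
found['years'] = found['value'].astype(float) * found['unit'].str.lower().map(weights)

# Sum the parts per original row; rows without any match fall back to plain numbers
total = found.groupby(level=0)['years'].sum()
df['float'] = total.reindex(df.index).fillna(pd.to_numeric(df['col'], errors='coerce'))
print(df)
```

Adding another unit (for example weeks) would then only need one more dictionary entry and one more alternative in the pattern.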

If you want to update the same column in place, you can loop over the rows:

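import pandas as pd

df = pd.DataFrame({'time': ['9 years 3 months 22 days',
                            '9 years 3 months 21 days',
                            '9 years 3 months 20 days']})
for idx in df.index:
    # Skip missing values
    if pd.isna(df.at[idx, 'time']):
        continue
    # Extract values from the string
    val = df.at[idx, 'time'].lower()
    if ' ' in val:
        val = val.split()
        # Pair each number with the unit after it, e.g. {'years': 9, 'months': 3, 'days': 22}
        # (assumes "number unit" pairs with plural unit names)
        t = {val[i + 1]: int(val[i]) for i in range(0, len(val), 2)}
        # Calculate the value in years, formatted to two decimal places (a string)
        float_val = "{:.2f}".format(t.get("years", 0) + (t.get("months", 0)/12) + (t.get("days", 0)/365))
        # Update the table; .at avoids chained assignment
        df.at[idx, 'time'] = float_val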
   
Answered By: Daud Khan
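
For values outside a DataFrame, the same arithmetic can be wrapped in a small function. This is a hedged sketch rather than anything from the answers above: experience_to_years, UNIT_IN_YEARS and the ndigits parameter are hypothetical names, and a year is again assumed to have 12 months and 365 days.

```python
import re

# Assumed conversion factors (leap years ignored)
UNIT_IN_YEARS = {'year': 1.0, 'month': 1 / 12, 'day': 1 / 365}
_PART = re.compile(r'(\d+)\s*(year|month|day)', re.IGNORECASE)

def experience_to_years(text, ndigits=2):
    """Return the duration in years as a rounded float, or None if no unit is found."""
    matches = _PART.findall(str(text))
    if not matches:
        return None
    total = sum(int(number) * UNIT_IN_YEARS[unit.lower()] for number, unit in matches)
    return round(total, ndigits)

print(experience_to_years('9 years 3 months 22 days'))  # 9.31
print(experience_to_years('3 YEARS 6 MONTHS'))          # 3.5
```

It can be applied to a column with df['time'].map(experience_to_years), which returns floats instead of formatted strings.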