Pandas: convert nan values to None in a string list before literal_eval, then convert back to np.nan

Question:

I have a dataframe with a few series that contain string representations of lists of floats, some of which include nan values. E.g.

s[0] = '[1.21, 1.21, nan, nan, 100]'

I want to convert these strings to lists using literal_eval. When I try, I get the error ValueError: malformed node or string on line 1, because, as per the docs, literal_eval does not recognise nan as a literal.

What is the best way to convert the nan values within the string to None, and then convert them back to np.nan after applying literal_eval?

Asked By: apk19


Answers:

The solution is as described in the question, but you get None values instead of NaNs:

import ast
s.str.replace('nan', 'None', regex=True).apply(ast.literal_eval)
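
As a minimal sketch of the replace approach, assuming a one-element Series built from the question's example string: the `\b` word boundaries are an addition not in the answer above, intended to avoid touching longer tokens that merely contain the letters "nan".

```python
import ast

import pandas as pd

# Example data taken from the question; the Series name is illustrative.
s = pd.Series(['[1.21, 1.21, nan, nan, 100]'])

# Swap standalone nan tokens for None so literal_eval can parse the string.
parsed = s.str.replace(r'\bnan\b', 'None', regex=True).apply(ast.literal_eval)

print(parsed[0])  # [1.21, 1.21, None, None, 100]
```

Each element of the result is a real Python list, with None where the string had nan.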

If you need np.nan values, use a custom function:

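import numpy as np

def convert(x):
    out = []
    # parse each comma-separated token on its own
    for y in x.strip('[]').split(', '):
        try:
            out.append(ast.literal_eval(y))
        except (ValueError, SyntaxError):
            # literal_eval rejects nan (and empty tokens), so store np.nan
            out.append(np.nan)
    return out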

s.apply(convert)
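
A short self-contained check of this token-by-token approach may help, assuming the example string from the question. The function is repeated here only so the snippet runs on its own, and the narrowed `except` clause is an assumption about which errors are worth catching.

```python
import ast
import math

import numpy as np
import pandas as pd

def convert(x):
    out = []
    for y in x.strip('[]').split(', '):
        try:
            out.append(ast.literal_eval(y))
        except (ValueError, SyntaxError):
            # tokens literal_eval cannot parse (such as nan) become np.nan
            out.append(np.nan)
    return out

s = pd.Series(['[1.21, 1.21, nan, nan, 100]'])
result = s.apply(convert)

print(result[0])  # [1.21, 1.21, nan, nan, 100]
```

Note that integers stay integers (100, not 100.0), and that an empty list string `'[]'` comes back as a one-element list holding nan, because the empty token also fails to parse.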

Another idea is to convert all values to floats, since float('nan') parses directly:

f = lambda x: [float(y) for y in x.strip('[]').split(', ')]
s.apply(f)

# the same conversion with a list comprehension instead of apply
pd.Series([[float(y) for y in x.strip('[]').split(', ')] for x in s],
          index=s.index)
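
To illustrate the float-based variant, here is a minimal sketch assuming the question's example string. It splits on `','` rather than `', '`, which is a small change from the code above that relies on `float()` ignoring surrounding whitespace.

```python
import math

import pandas as pd

s = pd.Series(['[1.21, 1.21, nan, nan, 100]'])

# float() accepts the string 'nan', so no None round trip is needed.
floats = s.apply(lambda x: [float(y) for y in x.strip('[]').split(',')])

print(floats[0])  # [1.21, 1.21, nan, nan, 100.0]
```

Every value becomes a float here, so 100 turns into 100.0, and a string holding an empty list `'[]'` would raise a ValueError because `float('')` is not valid.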
Answered By: jezrael

Adapting jezrael’s answer, a one-liner that replaces nan with None in each string, parses the string into a list with literal_eval, and then converts the None values back to np.nan is:

df['col'] = df['col'].str.replace('nan', 'None', regex=True).apply(ast.literal_eval).apply(lambda row: [np.nan if x is None else x for x in row])
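
The same round trip can be sketched as a named helper, which may be easier to read and test than the chained one-liner. The helper name `parse_with_nan`, the sample DataFrame, and the `\b` word boundaries are all assumptions for illustration, not part of the answer above.

```python
import ast
import re

import numpy as np
import pandas as pd

def parse_with_nan(text):
    """Parse a list literal, treating standalone nan tokens as np.nan."""
    values = ast.literal_eval(re.sub(r'\bnan\b', 'None', text))
    return [np.nan if v is None else v for v in values]

# Hypothetical sample data in the shape described in the question.
df = pd.DataFrame({'col': ['[1.21, 1.21, nan, nan, 100]', '[nan, 2.5]']})
df['col'] = df['col'].apply(parse_with_nan)

print(df['col'][0])  # [1.21, 1.21, nan, nan, 100]
```

One caveat of going through None: any genuine None already present in a list string would also end up as np.nan.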
Answered By: apk19