Filling null values based on the proportion of the categories in that column
Question: I have the following data:

    import numpy as np
    import pandas as pd

    col = ['black', 'black', 'black', 'grey', 'white', 'grey', 'grey', 'nan',
           'grey', 'black', 'black', 'red', 'nan', 'nan', 'nan', 'nan',
           'black', 'black', 'white']
    dd = pd.DataFrame({'color': col})
    # Turn the 'nan' placeholder strings into real missing values
    dd.replace('nan', np.nan, inplace=True)
    dd.sample(5)

    Out[1]:
        color
    8    grey
    14    NaN
    7     NaN
    2   black
    9   black

The following is the proportion of each color in the column:

    dd.color.value_counts(normalize=True)

    Out[2]:
    black    0.500000
    grey     0.285714
    white    0.142857
    red …
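One possible way to do what the title describes is to draw a replacement for each missing row at random, weighted by the observed proportions. The following is a minimal sketch, not a definitive answer: the names `props`, `mask` and `rng`, and the seed value, are assumptions introduced here for illustration, and the data is rebuilt with real `np.nan` values so the block runs on its own.

```python
import numpy as np
import pandas as pd

col = ['black', 'black', 'black', 'grey', 'white', 'grey', 'grey', np.nan,
       'grey', 'black', 'black', 'red', np.nan, np.nan, np.nan, np.nan,
       'black', 'black', 'white']
dd = pd.DataFrame({'color': col})

# Proportion of each category among the non-null values
props = dd['color'].value_counts(normalize=True)

# Rows that need a value
mask = dd['color'].isna()

# Draw one category per missing row, weighted by the observed proportions.
# The seed is arbitrary and only makes the draw repeatable.
rng = np.random.default_rng(0)
fill = rng.choice(props.index.to_numpy(), size=int(mask.sum()), p=props.to_numpy())

dd.loc[mask, 'color'] = fill
```

Because the fill is sampled, the proportions are preserved only in expectation. With just five missing rows, the proportions after filling can differ noticeably from the original ones.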