Python: fill empty cells with the average for the row's group
Question:
I have a CSV file with two columns: PLACE (string) and QUANTITY (int). Some of my QUANTITY rows are empty and I want to fill them with the average for that row's PLACE group.
For example:
PLACE, QUANTITY
AUSTRALIA, 4
AUSTRALIA, 2
USA, 3
AUSTRALIA,
You can see that one ‘AUSTRALIA’ row has no associated quantity. I want that ‘AUSTRALIA’ row to have the average of all the ‘AUSTRALIA’ rows that do have a value. How would I do this in Python? I've tried the code below, but it doesn't do anything. Maybe it's not filling because I filled the NAs with NaN?
import pandas as pd
import csv
# READ THE DATA FILES
csv_file = open('MY_CSV.csv')
df = pd.read_csv(csv_file)
# fill all NAs and replace with the average of that PLACE
AverageReplace = df.groupby('PLACE')['QUANTITY'].mean()
df['QUANTITY'].fillna(AverageReplace, inplace=True)
df.head()
Answers:
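# a is the DataFrame read from the CSV. mean() skips NaN on its own, so do not fillna(0) first: the zeros would pull each group average down.
y = a.groupby('PLACE')['QUANTITY'].mean()
a['QUANTITY'] = a[['PLACE', 'QUANTITY']].apply(lambda x: y[x['PLACE']] if pd.isna(x['QUANTITY']) else x['QUANTITY'], axis=1)
Try this. It works on my system.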
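Below is another way of doing that.
import numpy as np
import pandas as pd
data = {'Place': ['Australia', 'Australia', 'USA', 'Australia'], 'Quantity': [4, 2, 3, np.nan]}
df = pd.DataFrame(data)
# Fill each missing Quantity with the mean of its own Place group, not one hard-coded place.
# transform('mean') returns a Series aligned to df's index, so fillna can match row by row.
df['Quantity'] = df['Quantity'].fillna(df.groupby('Place')['Quantity'].transform('mean'))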
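As for why the attempt in the question leaves the gaps unfilled: when fillna is given a Series, it matches fill values by index label. The group means are indexed by PLACE, while df['QUANTITY'] is indexed by row number, so no labels match and nothing is filled. A minimal sketch of that behaviour and one way around it follows. The inline CSV text and skipinitialspace=True are assumptions standing in for MY_CSV.csv, and the names group_means and unchanged are only illustrative.

```python
import io
import pandas as pd

# Inline stand-in for MY_CSV.csv (assumption: same layout as the sample in the question)
csv_text = "PLACE, QUANTITY\nAUSTRALIA, 4\nAUSTRALIA, 2\nUSA, 3\nAUSTRALIA,\n"
# skipinitialspace drops the blank after each comma, so the column is 'QUANTITY', not ' QUANTITY'
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

# Group means are indexed by PLACE; mean() skips NaN by default
group_means = df.groupby('PLACE')['QUANTITY'].mean()

# Attempt from the question: labels 'AUSTRALIA'/'USA' never match row labels 0..3,
# so the missing value stays missing
unchanged = df['QUANTITY'].fillna(group_means)

# Look up each row's group mean first, so the fill values share df's index
df['QUANTITY'] = df['QUANTITY'].fillna(df['PLACE'].map(group_means))
print(df)
```

Mapping PLACE to its mean keeps everything vectorised, which tends to matter once the file has more than a handful of rows.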