Pandas converting string to numeric – getting invalid literal for int() with base 10 error

Question:

I am trying to convert data from a csv file to a numeric type so that I can find the greatest and least value in each category. This is a short view of the data I am referencing:

Course Grades_Recieved
098321 A,B,D
324323 C,B,D,F
213323 A,B,D,F

I am trying to convert the Grades_Recieved column to a numeric type so that I can create new columns that list the highest grade received and the lowest grade received in each course.

This is my code so far:

import pandas as pd 
df = pd.read_csv('grades.csv')

df.astype({'Grades_Recieved': 'int64'}).dtypes

I have tried the code above, and I have also tried to_numeric, but I keep getting the error: invalid literal for int() with base 10: ‘A,B,D’. I am not sure how to fix this. I have also tried removing the ‘,’ but the error remains the same.

Asked By: Steph


Answers:

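You can’t convert non-numeric strings such as ‘A,B,D’ into int/float, which is why astype and to_numeric both fail. Letter grades sort alphabetically, though, so the best grade (A) is the smallest string and the worst grade (F) is the largest. You can get the desired result by splitting each string on ‘,’ and taking the min and max: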

df['Highest_Grade'] = df['Grades_Recieved'].str.split(',').apply(lambda x: min(x))
df['Lowest_Grade'] = df['Grades_Recieved'].str.split(',').apply(lambda x: max(x))
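
If you do need actual numbers rather than letters, one option is to map each letter to a grade point before taking the max and min. The following is only a sketch: the GRADE_POINTS scale and the Highest_Points / Lowest_Points column names are made up for illustration, and the inline data stands in for grades.csv, assuming its columns are separated by whitespace as in the sample above. Reading Course as a string is also an assumption, made so that the leading zero in 098321 is not lost.

```python
import io
import pandas as pd

# Inline stand-in for grades.csv (assumed layout: whitespace-separated columns).
raw = io.StringIO(
    "Course Grades_Recieved\n"
    "098321 A,B,D\n"
    "324323 C,B,D,F\n"
    "213323 A,B,D,F\n"
)

# Read Course as a string so the leading zero in 098321 is kept.
df = pd.read_csv(raw, sep=r"\s+", dtype={"Course": str})

# Hypothetical grade-point scale; adjust it to your own grading scheme.
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}

# Split each string on ',' and map every letter to its numeric value,
# giving one list of numbers per course.
points = df["Grades_Recieved"].str.split(",").apply(
    lambda grades: [GRADE_POINTS[g.strip()] for g in grades]
)

# With numbers, the highest grade is the max and the lowest is the min.
df["Highest_Points"] = points.apply(max)
df["Lowest_Points"] = points.apply(min)

print(df)
```

Note that with numeric points the direction flips compared to the letter version: the best grade is now the max, not the min.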
Answered By: Pedro Rocha