How to convert object data type into int64 in Python?

Question:

I have a dataset in which one variable has the object data type, and I have to convert it to int64.

dataframe head

dataframe info

Asked By: Ranjan


Answers:

You can try df["Bare Nuclei"].astype(np.int64) and assign the result back to the column, but as far as I can see the real problem is something else. When pandas reads a file, it inspects the values in each column to infer the data type before building the data frame, and a column ends up as object when at least one value cannot be parsed as a number. So there must be some entries in that column that are not integers, e.g., they may contain letters or other symbols. In that case the cast will raise an error as well, so you need to remove or replace those entries before you can convert the column to an integer type.
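
A minimal sketch of how the offending entries could be located, assuming a small made-up frame with a "Bare Nuclei" column stored as strings (the values are illustrative and not taken from the asker's dataset). pd.to_numeric with errors="coerce" turns anything it cannot parse into NaN, which makes the bad rows easy to list:

```python
import pandas as pd

# Hypothetical stand-in for the asker's data: numbers stored as strings plus one bad entry.
df = pd.DataFrame({"Bare Nuclei": ["1", "10", "?", "4"]})

# Values that cannot be parsed as numbers become NaN instead of raising.
parsed = pd.to_numeric(df["Bare Nuclei"], errors="coerce")

# Rows where parsing failed are the ones blocking the conversion.
bad_values = df.loc[parsed.isna(), "Bare Nuclei"].unique().tolist()
print(bad_values)

# A direct cast fails as long as those entries are present.
try:
    df["Bare Nuclei"].astype("int64")
    cast_failed = False
except ValueError:
    cast_failed = True
print(cast_failed)
```

Once you know which values are in the way, you can decide whether to drop those rows or replace the values with something sensible.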

Answered By: Rabin Adhikari

I had the same problem with the same dataset.

There are a number of "?" entries in the 'bare_nuclei' column (16 of them in the CSV itself). You need to treat them as missing values and drop the rows that have "?" in the bare_nuclei column. Also, as a heads-up, don't name the 'class' column class: class is a reserved keyword in Python, so attribute-style access such as df.class is a syntax error, and that is going to cause problems later.

You can fix this at import using:

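import numpy as np
import pandas as pd

# Values to read as missing; "?" is the placeholder used in this file.
missing_values = ["NA", "N/a", np.nan, "?"]

l1 = pd.read_csv("../DataSets/Breast cancer dataset/breast-cancer-wisconsin.data",
                 header=None, na_values=missing_values,
                 names=['id','clump_thickness','uniformity_of_cell_size',
                        'uniformity_of_cell_shape', 'marginal_adhesion',
                        'single_epithelial_cell_size', 'bare_nuclei', 'bland_chromatin',
                        'normal_nucleoli', 'mitoses', 'diagnosis'])

# Drop the rows where any value is missing (including the former "?" entries).
l1 = l1.dropna()

# The column was read as float64 because of the NaNs; it can now be cast to int64.
l1['bare_nuclei'] = l1['bare_nuclei'].astype(np.int64)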
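
If the frame is already loaded and you would rather not re-read the file, the same cleanup can be sketched on the existing column. The sample frame below is hypothetical and only mimics the "?" placeholders described above. The second option is an alternative to dropping rows, not part of the original answer: pandas' nullable "Int64" dtype keeps every row and stores the missing entries as <NA>.

```python
import pandas as pd

# Hypothetical sample mimicking the "?" placeholders in the bare_nuclei column.
df = pd.DataFrame({"bare_nuclei": ["1", "?", "3"], "mitoses": [1, 2, 1]})

# Option 1: coerce "?" to NaN, drop those rows, then cast to int64.
cleaned = df.copy()
cleaned["bare_nuclei"] = pd.to_numeric(cleaned["bare_nuclei"], errors="coerce")
cleaned = cleaned.dropna(subset=["bare_nuclei"])
cleaned["bare_nuclei"] = cleaned["bare_nuclei"].astype("int64")

# Option 2: keep every row and use the nullable integer dtype,
# which represents missing entries as <NA>.
nullable = pd.to_numeric(df["bare_nuclei"], errors="coerce").astype("Int64")

print(cleaned.dtypes)
print(nullable)
```

Note the capital "I" in "Int64": that is the nullable extension dtype, whereas lowercase "int64" is the plain NumPy dtype and cannot hold missing values.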
Answered By: TeaToCodeConverter