How can I separate columns while reading an Excel document if the original Excel format combines all columns into one, separated by ","?

Question:

I have an Excel document that holds the information of 3 columns in a single column, separated by ",". I want to separate the columns during the pd.read_excel() call. I tried usecols, but it did not work. I would also like to name the columns while calling pd.read_excel().

Asked By: user20716077

Answers:

Pandas provides a method to split a string around a given separator/delimiter. The result can be stored as a list in a Series, or it can be expanded into multiple DataFrame columns from a single delimited string. It works like Python's built-in split() method, except that split() applies only to an individual string, whereas pandas' str.split() applies to a whole Series. The .str accessor has to prefix the call every time to distinguish it from Python's built-in method; otherwise, it raises an error.
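
As a minimal sketch of that approach: the DataFrame below is a hypothetical stand-in for what pd.read_excel() might return for such a sheet (the column name "combined" and the sample values are made up for illustration), and the target column names are borrowed from another answer in this thread.

```python
import pandas as pd

# Hypothetical stand-in for the result of pd.read_excel(): a single column
# whose cells hold three comma-separated values each.
raw = pd.DataFrame({"combined": ["Alice,1,F", "Bob,2,M"]})

# expand=True makes str.split() return a DataFrame with one column per piece
# instead of a Series of lists.
split_df = raw["combined"].str.split(",", expand=True)
split_df.columns = ["Name", "Number", "Gender"]

# The pieces are still strings, so numeric columns need an explicit conversion.
split_df["Number"] = split_df["Number"].astype(int)

print(split_df)
```

Note that this splits after reading rather than during pd.read_excel().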

Answered By: Evan Ottinger

Not sure how your .xlsx file is formatted, but it looks like you should be using pandas.read_csv() instead.

So maybe something like pandas.read_csv(filename, sep=',', names=['Name', 'Number', 'Gender'])
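
A small sketch of how names behaves, assuming the data really is plain comma-separated text. The in-memory string below is hypothetical sample data standing in for the file; passing header=0 together with names replaces an existing header row instead of treating it as data.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the real file's contents.
csv_text = "a,b,c\n1,2,3\n4,5,6\n"

# header=0 marks the first line as a header row to be replaced by `names`.
# If the file has no header row at all, pass header=None instead.
df = pd.read_csv(
    io.StringIO(csv_text),
    sep=",",
    header=0,
    names=["Name", "Number", "Gender"],
)

print(df)
```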

Answered By: Isaac Meyer

The text inside your Excel file is comma-separated. One way to handle it is to convert the sheet to text before reading it, like so.

Your Excel sheet:

   a,b,c
0  1,2,3
1  4,5,6

Convert to text and read it again:

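import pandas as pd

# Dump the single-column sheet to a plain-text file, one comma-separated row per line.
with open('file.txt', 'w') as file:
    pd.read_excel('file.xlsx').to_string(file, index=False)

# Read the text file back as CSV so pandas splits each line on the commas.
df = pd.read_csv("file.txt", sep=",")
print(df)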

Which prints:

   a  b  c
0  1  2  3
1  4  5  6
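
The same idea can be sketched without a temporary file by handing the text to read_csv through an in-memory buffer. The DataFrame below is a hypothetical stand-in for the result of pd.read_excel('file.xlsx'), assuming the sheet's single header cell is "a,b,c" as in the example above.

```python
import io

import pandas as pd

# Hypothetical stand-in for pd.read_excel('file.xlsx'): one column whose
# header and cells are all comma-joined strings.
raw = pd.DataFrame({"a,b,c": ["1,2,3", "4,5,6"]})

# Rebuild the CSV text: the header cell first, then one line per row.
lines = [str(raw.columns[0]), *raw.iloc[:, 0].astype(str)]
text = "\n".join(lines)

# Parse the text as ordinary CSV so pandas splits on the commas.
df = pd.read_csv(io.StringIO(text), sep=",")

print(df)
```

Joining the raw cell values avoids the alignment padding that to_string() adds when cells differ in length.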

Answered By: Bhargav