Reading an Excel table into a DataFrame and splitting values into lists

Question:

I have an Excel file with several columns, and each column holds values to be searched for in a database.

example table

I want to read this file (I am using pandas because it's a very simple way to read Excel files) and extract the information into variables:

Desired extracted information for each row:
Company: Ebay (str)
company_name_for_search: [EBAY, eBay, Ebay] (list of strings)
company_register: [4722, 4721] (list of ints)

With this information, I will run a search script. Some values must be lists because the script will run a search for every item in the list (a for loop).

When I read the Excel file, each column is read as object dtype in the DataFrame, so I couldn't access the individual values inside each cell.

How can I split the values, convert their types, and work with the result?

Asked By: FábioRB

Answers:

Your variables are represented as single strings rather than rows of strings and numbers.

Instead of:

company_name register
eBay 4722
eBay 4721
Amazon 9999

You have:

company_name register
ebay,ebay 4722,4721
amazon 9999

You can split each string on the comma and then explode the resulting Series of lists to get a long-form DataFrame, with one value per row.

import pandas as pd

mess = pd.DataFrame(
    {
        "letters": ["A,B", "C,D", "E,F,G,H"],
        "nums": ["100,200", "300,400", "500, 600, 700, 800"],
    }
)

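# Split each cell on commas, give every item its own row, and strip the
# stray spaces left by ", ". This relies on every column holding the same
# number of items in a given row, so the exploded columns share one index.
mess = mess.apply(lambda col: col.str.split(",").explode().str.strip())

# The split values are still strings; convert the numeric column explicitly.
mess["nums"] = mess["nums"].astype(int)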
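
The long form above gives one value per row. The question instead asks for one list per row, with the names and the registers having different lengths. A minimal sketch of that variant follows. It assumes the column names from the question, uses made-up values in place of the real spreadsheet, and introduces a hypothetical helper, split_cell, that is not part of pandas.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame returned by pd.read_excel(...).
# Column names follow the question; the values are invented for illustration.
df = pd.DataFrame(
    {
        "company": ["Ebay", "Amazon"],
        "company_name_for_search": ["EBAY, eBay, Ebay", "AMAZON, Amazon"],
        # Assumption: a cell holding a single register may arrive as a number.
        "company_register": ["4722,4721", 9999],
    }
)


def split_cell(value, cast=str):
    """Split one comma-separated cell into a list and cast each item."""
    # str() lets the helper accept cells that were parsed as numbers.
    # Empty and missing cells are not handled in this sketch.
    return [cast(item.strip()) for item in str(value).split(",") if item.strip()]


df["company_name_for_search"] = df["company_name_for_search"].apply(split_cell)
df["company_register"] = df["company_register"].apply(lambda v: split_cell(v, int))

# One dict per spreadsheet row, with the list columns ready for a for loop.
records = df.to_dict("records")

for row in records:
    for name in row["company_name_for_search"]:
        for register in row["company_register"]:
            print(row["company"], name, register)
```

Because each cell becomes its own list, the rows do not need the same number of names and registers. If Excel delivers the single-register cells as floats, the int cast would fail on text such as "9999.0"; reading the file with dtype=str in pd.read_excel is one way to avoid that.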
Answered By: Joshua Megnauth