Write Values found by Pandas Dataframe's .loc function into an array

Question:

I have a Google spreadsheet which I managed to load into a pandas DataFrame:

Tag1    Tag2    Tag3    Tag4    Tag5    MobileNo
Blue    Yellow  Green   Velvet  Red     12345678
Blue    Yellow  Pink    Grey            234556778
Red     Yellow  Orange  Velvet          4456568
Red     Yellow  Grey    Blue            3454655467

I am not really familiar with pandas.
I need every MobileNo whose row contains a given tag in any of the 5 tag columns to be written into an array.

For example, for the tag Red:

tag_red_results = ['12345678', '4456568', '3454655467']

How can I accomplish this?

Asked By: Jakob Czapski


Answers:

If I understand correctly, you can use pandas.DataFrame.loc with boolean indexing:

# True for each row where any Tag column equals "Red"
m = df.filter(like="Tag").eq("Red").any(axis=1)

s = df.loc[m, "MobileNo"]

If a list is needed, then use pandas.Series.to_list:

tag_red_results = s.to_list()
#[12345678, 4456568, 3454655467]

Or, if you need a NumPy array, use pandas.Series.to_numpy:

tag_red_results = s.to_numpy()
#array([  12345678,    4456568, 3454655467], dtype=int64)
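To try this without the spreadsheet, the sample data from the question can be rebuilt by hand. This is only a sketch: it assumes the empty Tag5 cells load as missing values, and it casts the numbers to strings because the desired output in the question shows them quoted.

```python
import pandas as pd

# Sample data rebuilt from the question; empty Tag5 cells are assumed to be missing (None).
df = pd.DataFrame({
    "Tag1": ["Blue", "Blue", "Red", "Red"],
    "Tag2": ["Yellow", "Yellow", "Yellow", "Yellow"],
    "Tag3": ["Green", "Pink", "Orange", "Grey"],
    "Tag4": ["Velvet", "Grey", "Velvet", "Blue"],
    "Tag5": ["Red", None, None, None],
    "MobileNo": [12345678, 234556778, 4456568, 3454655467],
})

# True for each row where any Tag column equals "Red"
m = df.filter(like="Tag").eq("Red").any(axis=1)

# Cast to str so the list matches the quoted numbers shown in the question
tag_red_results = df.loc[m, "MobileNo"].astype(str).to_list()
print(tag_red_results)
# ['12345678', '4456568', '3454655467']
```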
Answered By: Timeless

To get a list of all the MobileNo values where at least one of the Tag columns in the row has a value:

    import pandas as pd

    # Load the data into a pandas dataframe, specifying the column names
    df = pd.read_csv('file.csv', names=['Tag1', 'Tag2', 'Tag3', 'Tag4', 'Tag5', 'MobileNo'])

    # Create a list of all the MobileNo values where at least one of the Tag columns is not null
    has_tag = df[['Tag1', 'Tag2', 'Tag3', 'Tag4', 'Tag5']].notnull().any(axis=1)
    tag_red_results = df.loc[has_tag, 'MobileNo'].tolist()

    print(tag_red_results)
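Note that a not-null check keeps every row that has any tag at all, whichever tag it is. The sketch below compares it with a filter for one specific tag; the sample data is rebuilt by hand (assuming empty cells are missing values), and the names `tag_cols`, `any_tag` and `red_only` are made up for the illustration.

```python
import pandas as pd

# Sample data rebuilt from the question; empty Tag5 cells are assumed to be missing (None).
df = pd.DataFrame({
    "Tag1": ["Blue", "Blue", "Red", "Red"],
    "Tag2": ["Yellow", "Yellow", "Yellow", "Yellow"],
    "Tag3": ["Green", "Pink", "Orange", "Grey"],
    "Tag4": ["Velvet", "Grey", "Velvet", "Blue"],
    "Tag5": ["Red", None, None, None],
    "MobileNo": [12345678, 234556778, 4456568, 3454655467],
})
tag_cols = ["Tag1", "Tag2", "Tag3", "Tag4", "Tag5"]

# Rows with at least one filled tag column: every row in this sample
any_tag = df.loc[df[tag_cols].notna().any(axis=1), "MobileNo"].tolist()

# Rows where some tag column equals "Red"
red_only = df.loc[df[tag_cols].eq("Red").any(axis=1), "MobileNo"].tolist()

print(any_tag)   # [12345678, 234556778, 4456568, 3454655467]
print(red_only)  # [12345678, 4456568, 3454655467]
```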
Answered By: Mohammed Chaaraoui

You can also use melt to flatten your tag columns:

>>> df.melt('MobileNo').loc[lambda x: x['value'] == 'Red', 'MobileNo'].tolist()
[4456568, 3454655467, 12345678]
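To see why the result comes out in a different order, it can help to look at the long-format frame that melt produces: one row per MobileNo and tag column, stacked column by column. A minimal sketch with the sample data rebuilt by hand (assuming empty cells are missing values; `long_df` is a made-up name):

```python
import pandas as pd

# Sample data rebuilt from the question; empty Tag5 cells are assumed to be missing (None).
df = pd.DataFrame({
    "Tag1": ["Blue", "Blue", "Red", "Red"],
    "Tag2": ["Yellow", "Yellow", "Yellow", "Yellow"],
    "Tag3": ["Green", "Pink", "Orange", "Grey"],
    "Tag4": ["Velvet", "Grey", "Velvet", "Blue"],
    "Tag5": ["Red", None, None, None],
    "MobileNo": [12345678, 234556778, 4456568, 3454655467],
})

# MobileNo stays as the identifier; the five tag columns are stacked
# into a "variable" column (the column name) and a "value" column (the tag).
long_df = df.melt("MobileNo")
print(long_df.shape)          # (20, 3): 4 rows x 5 tag columns
print(list(long_df.columns))  # ['MobileNo', 'variable', 'value']

# Tag1 matches come first, Tag5 matches last, hence the reordered result
red = long_df.loc[long_df["value"] == "Red", "MobileNo"].tolist()
print(red)                    # [4456568, 3454655467, 12345678]
```

If the same tag could appear in two columns of one row, that MobileNo would show up twice in the melted result, so a `drop_duplicates()` before `tolist()` may be worth adding in that case.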
Answered By: Corralien

Thank you, Timeless!

Your solution worked perfectly!

Below is my code:

def readColorsDataFromClientSheet(sheetId, tag):
    # `sheets` is assumed to be defined elsewhere in the script
    ss = sheets[sheetId]
    df = ss.find('Colors').to_frame(index_col='Clients')
    # True for each row where any Tag column equals the requested tag
    tagged = df.filter(like='Tag').eq(tag).any(axis=1)
    mobile_numbers = df.loc[tagged, "MobileNo"].tolist()
    print(mobile_numbers)
    return mobile_numbers
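The tag lookup can also be split from the sheet loading, so that it can be checked against a hand-built DataFrame. This is a sketch only: the helper name `mobile_numbers_for_tag` is hypothetical, and the sample data is rebuilt from the question assuming empty cells are missing values.

```python
import pandas as pd

def mobile_numbers_for_tag(df: pd.DataFrame, tag: str) -> list:
    """Return the MobileNo values of rows where any Tag column equals `tag`."""
    tagged = df.filter(like="Tag").eq(tag).any(axis=1)
    return df.loc[tagged, "MobileNo"].tolist()

# Sample data rebuilt from the question; empty Tag5 cells are assumed to be missing (None).
df = pd.DataFrame({
    "Tag1": ["Blue", "Blue", "Red", "Red"],
    "Tag2": ["Yellow", "Yellow", "Yellow", "Yellow"],
    "Tag3": ["Green", "Pink", "Orange", "Grey"],
    "Tag4": ["Velvet", "Grey", "Velvet", "Blue"],
    "Tag5": ["Red", None, None, None],
    "MobileNo": [12345678, 234556778, 4456568, 3454655467],
})

print(mobile_numbers_for_tag(df, "Velvet"))  # [12345678, 4456568]
print(mobile_numbers_for_tag(df, "Purple"))  # []
```

A tag that does not occur anywhere simply yields an empty list.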
Answered By: Jakob Czapski