List of Python Dataframe column values meeting criteria in another dataframe?

Question:

I have two dataframes, df1 and df2, which have a common column heading, Name. The Name values are unique within df1 and df2. df1’s Name values are a subset of those in df2; df2 has more rows — about 17,300 — than df1 — about 6,900 — but each Name value in df1 is in df2. I would like to create a list of Name values in df1 that meet certain criteria in other columns of the corresponding rows in df2.

Example:

df1:

Name Age Hair
0 Jim 25 black
1 Mary 58 brown
3 Sue 15 purple

df2:

Name Country phoneOS
0 Shari GB Android
1 Jim US Android
2 Alain TZ iOS
3 Sue PE iOS
4 Mary US Android

I would like a list of only those Name values in df1 that have df2 Country and phoneOS values of US and Android. The example result would be [Jim, Mary].

I have successfully selected rows within one dataframe that meet multiple criteria in order to copy those rows to a new dataframe. In that case pandas/Python does the iteration over rows internally. I guess I could write a "manual" iteration over the rows of df1 and access df2 on each iteration. I was hoping for a more efficient solution whereby the iteration was handled internally as in the single-dataframe case. But my searches for such a solution have been fruitless.

Asked By: brec

Answers:

Try this (using the question's names, df1 and df2):

df1.loc[df1.Name.isin(df2.loc[df2.Country.eq('US') &
                              df2.phoneOS.eq('Android'), 'Name']), 'Name']

Result:

0     Jim
1    Mary
Name: Name, dtype: object

If you want the result as a list, just add .to_list() at the end.
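
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the example frames from the question and applies the same isin filter. The intermediate names (mask, wanted, names) are only illustrative, and df1 is given a default 0..2 index here rather than the 0, 1, 3 shown in the question.

```python
import pandas as pd

# Example data copied from the question (default integer index assumed).
df1 = pd.DataFrame({
    "Name": ["Jim", "Mary", "Sue"],
    "Age": [25, 58, 15],
    "Hair": ["black", "brown", "purple"],
})
df2 = pd.DataFrame({
    "Name": ["Shari", "Jim", "Alain", "Sue", "Mary"],
    "Country": ["GB", "US", "TZ", "PE", "US"],
    "phoneOS": ["Android", "Android", "iOS", "iOS", "Android"],
})

# Step 1: Names in df2 whose rows satisfy both criteria.
mask = df2["Country"].eq("US") & df2["phoneOS"].eq("Android")
wanted = df2.loc[mask, "Name"]

# Step 2: keep only the df1 Names that appear in that set.
names = df1.loc[df1["Name"].isin(wanted), "Name"].to_list()
print(names)  # ['Jim', 'Mary']
```

The result follows the row order of df1, and no explicit Python loop over rows is needed.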

Answered By: 99_m4n

data = df1.merge(df2, on='Name')

data.loc[((data.phoneOS == 'Android') & (data.Country == "US")), 'Name'].values.tolist()
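
A self-contained sketch of the merge approach on the question's example data might look like the following. merge defaults to an inner join, so only Names present in both frames survive; the validate="one_to_one" argument is an optional addition (not part of the answer above) that raises an error if Name turns out not to be unique in either frame.

```python
import pandas as pd

# Example data copied from the question.
df1 = pd.DataFrame({
    "Name": ["Jim", "Mary", "Sue"],
    "Age": [25, 58, 15],
    "Hair": ["black", "brown", "purple"],
})
df2 = pd.DataFrame({
    "Name": ["Shari", "Jim", "Alain", "Sue", "Mary"],
    "Country": ["GB", "US", "TZ", "PE", "US"],
    "phoneOS": ["Android", "Android", "iOS", "iOS", "Android"],
})

# Inner join on Name; validate is an optional safety check on uniqueness.
data = df1.merge(df2, on="Name", validate="one_to_one")

# Filter the joined rows on the df2 columns and pull out the names.
names = data.loc[
    (data["phoneOS"] == "Android") & (data["Country"] == "US"), "Name"
].tolist()
print(names)  # ['Jim', 'Mary']
```

Compared with isin, this builds a joined frame carrying the columns of both inputs, which is handy if other columns from df1 and df2 are needed together afterwards.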

Answered By: MoRe