List of Python Dataframe column values meeting criteria in another dataframe?

Question

I have two dataframes, df1 and df2, which have a common column heading, Name. The Name values are unique within df1 and df2. df1’s Name values are a subset of those in df2; df2 has more rows — about 17,300 — than df1 — about 6,900 — but each Name value in df1 is in df2. I would like to create a list of Name values in df1 that meet certain criteria in other columns of the corresponding rows in df2.

Example:

df1:

	Name	Age	Hair
0	Jim	25	black
1	Mary	58	brown
3	Sue	15	purple

df2:

	Name	Country	phoneOS
0	Shari	GB	Android
1	Jim	US	Android
2	Alain	TZ	iOS
3	Sue	PE	iOS
4	Mary	US	Android

I would like a list of only those Name values in df1 that have df2 Country and phoneOS values of US and Android. The example result would be [Jim, Mary].

I have successfully selected rows within one dataframe that meet multiple criteria in order to copy those rows to a new dataframe. In that case pandas/Python does the iteration over rows internally. I guess I could write a "manual" iteration over the rows of df1 and access df2 on each iteration. I was hoping for a more efficient solution whereby the iteration was handled internally as in the single-dataframe case. But my searches for such a solution have been fruitless.

Asked By: brec

||

Source

Answer 1

try:

df_1.loc[df_1.Name.isin(df_2.loc[df_2.Country.eq('US') & 
                                 df_2.phoneOS.eq('Android'), 'Name']), 'Name']

Result:

0     Jim
1    Mary
Name: Name, dtype: object

if you want the result as a list just add .to_list() at the end

Answered By: 99_m4n

Answer 2

data = df1.merge(df2, on='Name')

data.loc[((data.phoneOS == 'Android') & (data.Country == "US")), 'Name'].values.tolist()

Answered By: MoRe

List of Python Dataframe column values meeting criteria in another dataframe?

Question:

Example:

df1:

df2:

Answers: