Select columns from pandas dataframe using multiple conditions on columns in Python

Question:

I have the following pandas DataFrame (umls):

             CUI      SDUI  SAB  TTY                    STR
325040  C0011405   D003788  MSH   MH   Dental Pulp Diseases
325054  C0011405  10012328  MDR  LLT   Dental pulp disorder
325055  C0011405  10012328  MDR   PT   Dental pulp disorder
325057  C0011405  10044050  MDR   HT  Dental pulp disorders
325061  C0011405   D003788  MSH  DEV          PULP DIS DENT
325062  C0011405   D003788  MSH  DEV          DENT PULP DIS
325063  C0011405   D003788  MSH  DEV          DIS DENT PULP

I would like to filter rows based on conditions that depend on the SAB value:
when SAB = MSH, select TTY = MH,
and
when SAB = MDR, select TTY = LLT or PT.

I expect the following output:

             CUI      SDUI  SAB  TTY                    STR
325040  C0011405   D003788  MSH   MH   Dental Pulp Diseases
325054  C0011405  10012328  MDR  LLT   Dental pulp disorder
325055  C0011405  10012328  MDR   PT   Dental pulp disorder

I am using the following line of code, which raises a syntax error:

umls[(umls['SAB'].isin(['MSH', 'MDR']))] & (umls['TTY'].isin(['MH', 'LLT', 'PT']))]

Any help is highly appreciated.

Asked By: rshar

||

Answers:

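Remove the extra parentheses and the stray ] after the first mask, so that both masks are chained with & inside a single pair of brackets: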

df = umls[umls['SAB'].isin(['MSH', 'MDR']) & umls['TTY'].isin(['MH', 'LLT', 'PT'])]
print(df)
             CUI      SDUI  SAB  TTY                   STR
325040  C0011405   D003788  MSH   MH  Dental Pulp Diseases
325054  C0011405  10012328  MDR  LLT  Dental pulp disorder
325055  C0011405  10012328  MDR   PT  Dental pulp disorder
Answered By: jezrael
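
Note that chaining two isin() masks keeps every combination of the listed SAB and TTY values, so a row with SAB = MSH and TTY = PT would also pass. That does not happen in the sample data, but if the pairing must be strict, one option is to build one mask per SAB value and combine them with |. The following is a minimal sketch, not part of the original answer; the names msh_mask, mdr_mask and paired are illustrative, and the frame is rebuilt from the sample in the question:

```python
import pandas as pd

# Rebuild the sample frame from the question (values copied from the post).
umls = pd.DataFrame(
    {
        "CUI": ["C0011405"] * 7,
        "SDUI": ["D003788", "10012328", "10012328", "10044050",
                 "D003788", "D003788", "D003788"],
        "SAB": ["MSH", "MDR", "MDR", "MDR", "MSH", "MSH", "MSH"],
        "TTY": ["MH", "LLT", "PT", "HT", "DEV", "DEV", "DEV"],
        "STR": ["Dental Pulp Diseases", "Dental pulp disorder",
                "Dental pulp disorder", "Dental pulp disorders",
                "PULP DIS DENT", "DENT PULP DIS", "DIS DENT PULP"],
    },
    index=[325040, 325054, 325055, 325057, 325061, 325062, 325063],
)

# One mask per SAB value, each paired with its own allowed TTY values.
# The parentheses around the == comparisons are required, because &
# binds more tightly than == in Python.
msh_mask = (umls["SAB"] == "MSH") & (umls["TTY"] == "MH")
mdr_mask = (umls["SAB"] == "MDR") & umls["TTY"].isin(["LLT", "PT"])

# A row is kept if it satisfies either pairing.
paired = umls[msh_mask | mdr_mask]
print(paired)
```

On the sample data this selects the same three rows as the chained isin() version; the two approaches differ only when a TTY value listed for one SAB appears under the other.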