How to filter values in a Python dataframe?

Question:

How can I filter the dataframe df1 to keep only the rows where the SYMBOL column starts with a period (.) followed by a digit?

    df1
    
      SYMBOL           TYPE
    .1E09UOV      Exchange code
    .2E09UP0      Exchange code
    .AT0013F      Exchange code
    .BT0013G      Exchange code
    .CT002MS      Exchange code
    .DT002MT      Exchange code
    .7T003MT      Exchange code
    .7T004MT      Exchange code
    .7T001MT      Exchange code
    .7T003MT      Exchange code
    
    
    
    Expected output
    
      SYMBOL           TYPE
    .1E09UOV      Exchange code
    .2E09UP0      Exchange code
    .7T003MT      Exchange code
    .7T004MT      Exchange code
    .7T001MT      Exchange code
    .7T003MT      Exchange code

Tried code:

df1.loc[(df1['SYMBOL'].re.sub(r'.d')]
Asked By: SGC

Answers:

You can use the following:

df1 = df1[df1['SYMBOL'].str.match(r'^\.[0-9].*')]
  • ^ = start of string
  • \. = look for a literal period (an unescaped . would match any character)
  • [0-9] = look for a single digit
  • .* = look for zero or more characters
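
To see why the period has to be escaped to be treated literally, here is a minimal sketch. The three symbols are hypothetical, chosen only for illustration, and the last one deliberately does not start with a period:

```python
import pandas as pd

# Hypothetical symbols, used only to illustrate the difference.
symbols = pd.Series(['.1E09UOV', '.AT0013F', 'X7T003MT'])

# Unescaped dot: any first character followed by a digit matches.
loose = symbols.str.match(r'^.[0-9]')

# Escaped dot: only a literal period followed by a digit matches.
strict = symbols.str.match(r'^\.[0-9]')

print(loose.tolist())   # [True, False, True]
print(strict.tolist())  # [True, False, False]
```

On the data in the question every symbol starts with a period, so both patterns select the same rows there. The escaped form only makes a difference if other prefixes can appear.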

Here is an example showing the full code:

Code:

import pandas as pd

df1 = pd.DataFrame({ 'SYMBOL': ['.1E09UOV', '.2E09UP0', '.AT0013F', '.BT0013G', '.CT002MS', '.DT002MT', '.7T003MT', '.7T004MT', '.7T001MT', '.7T003MT'],
                    'TYPE': ['Exchange code', 'Exchange code', 'Exchange code', 'Exchange code', 'Exchange code', 'Exchange code', 'Exchange code', 'Exchange code', 'Exchange code', 'Exchange code']})

# Keep rows whose SYMBOL starts with a literal period followed by a digit
df1 = df1[df1['SYMBOL'].str.match(r'^\.[0-9].*')]

print(df1)

Output:

     SYMBOL           TYPE
0  .1E09UOV  Exchange code
1  .2E09UP0  Exchange code
6  .7T003MT  Exchange code
7  .7T004MT  Exchange code
8  .7T001MT  Exchange code
9  .7T003MT  Exchange code
Answered By: ScottC