PANDAS Python | Contain specific value in specific position

Question

I'm trying to select only the rows where the column "Cuenta" has "05" in the third and fourth positions, for example 51050300 and 51050600.

Año	Periodo	Cuenta
2023	1	51050300
2023	2	51053900
2023	1	74359570
2023	2	74452500
2023	6	51050300
2023	7	51050600
2023	7	52351005
2023	7	52353505
2023	7	52159500

I’m using this code:

pattern=r'..05*' 

df[df['Cuenta'].str.contains(pattern)]

But it doesn't work. How can I do it?

Asked By: Aleja Gallo

Answer 1

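You have to change your pattern. As written it is not anchored, so it can match anywhere in the string, and 5* means "zero or more 5s". Anchor it at the start and make both digits mandatory:

pattern = r'^..05'  # ^ anchors the match at the start of the string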

>>> df['Cuenta'].astype(str).str.contains(pattern)
0     True
1     True
2    False
3    False
4     True
5     True
6    False
7    False
8    False
Name: Cuenta, dtype: bool
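
To check the difference on the sample data, here is a minimal sketch that rebuilds the frame from the question and compares both patterns. It assumes "Cuenta" is stored as integers, hence the astype(str); calling .str directly on an integer column raises an AttributeError. The names loose and anchored are only illustrative.

```python
import pandas as pd

# Sample data copied from the question; Cuenta is assumed to be an integer column.
df = pd.DataFrame({
    "Año": [2023] * 9,
    "Periodo": [1, 2, 1, 2, 6, 7, 7, 7, 7],
    "Cuenta": [51050300, 51053900, 74359570, 74452500, 51050300,
               51050600, 52351005, 52353505, 52159500],
})

cuenta = df["Cuenta"].astype(str)

# Unanchored: any two characters, a '0', then zero or more '5's, anywhere in the string.
loose = cuenta.str.contains(r"..05*")

# Anchored: exactly two characters from the start, then '05'.
anchored = cuenta.str.contains(r"^..05")

print(loose.all())            # every row matches the loose pattern
print(df[anchored])           # only rows with '05' in positions 3 and 4
```

Because every account number in the sample has a 0 somewhere after its second digit, the loose pattern keeps all rows, while the anchored one keeps rows 0, 1, 4 and 5.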

Answered By: Corralien

Answer 2

Or slice the third and fourth characters and compare them directly:

df[df['Cuenta'].astype(str).str[2:4] == '05']

Output:

    Año  Periodo    Cuenta
0  2023        1  51050300
1  2023        2  51053900
4  2023        6  51050300
5  2023        7  51050600
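
If the position or the digits change often, the slicing approach can be wrapped in a small helper. This is only a sketch: filter_by_digits and its parameters are hypothetical names, and the frame is rebuilt from the question's sample with "Cuenta" assumed to be an integer column.

```python
import pandas as pd

def filter_by_digits(frame, column, start, stop, digits):
    """Keep rows whose `column`, read as text, has `digits` at positions [start, stop)."""
    as_text = frame[column].astype(str)
    return frame[as_text.str[start:stop] == digits]

# Sample data copied from the question.
df = pd.DataFrame({
    "Año": [2023] * 9,
    "Periodo": [1, 2, 1, 2, 6, 7, 7, 7, 7],
    "Cuenta": [51050300, 51053900, 74359570, 74452500, 51050300,
               51050600, 52351005, 52353505, 52159500],
})

# Positions are zero-based, so the third and fourth characters are the slice 2:4.
result = filter_by_digits(df, "Cuenta", 2, 4, "05")
print(result)
```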

Answered By: Scott Boston

Answer 3

For fun, assuming an integer column, an arithmetic solution would be:

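import numpy as np
# floor(log10(x)) - 3 is how many trailing digits to drop so that the first four remain
m = df['Cuenta'].floordiv(10**(np.floor(np.log10(df['Cuenta']))-3)).mod(100).eq(5)
out = df.loc[m]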

Or, if every value has the same number of digits (eight here):

m = df['Cuenta']//10000%100 == 5

How it works (shown on the sample data plus one extra 5-digit row):

df.assign(n_digits=np.floor(np.log10(df['Cuenta']))+1,
          first_4=lambda d: d['Cuenta'].floordiv(10**(d['n_digits']-4)),
          digits_3_4=lambda d: d['first_4'].mod(100)
         )

    Año  Periodo    Cuenta  n_digits  first_4  digits_3_4
0  2023        1  51050300       8.0   5105.0         5.0
1  2023        2  51053900       8.0   5105.0         5.0
2  2023        1  74359570       8.0   7435.0        35.0
3  2023        2  74452500       8.0   7445.0        45.0
4  2023        6  51050300       8.0   5105.0         5.0
5  2023        7  51050600       8.0   5105.0         5.0
6  2023        7  52351005       8.0   5235.0        35.0
7  2023        7  52353505       8.0   5235.0        35.0
8  2023        7  52159500       8.0   5215.0        15.0
9  2024        8     12051       5.0   1205.0         5.0
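
As a sanity check, the arithmetic mask can be compared with the string-slicing mask on the same values. The sketch below assumes positive integers with at least four digits and counts digits here as floor(log10(x)) + 1; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# The nine sample values plus the extra 5-digit row shown in the table above.
cuenta = pd.Series(
    [51050300, 51053900, 74359570, 74452500, 51050300,
     51050600, 52351005, 52353505, 52159500, 12051],
    name="Cuenta",
)

# Number of decimal digits of each value (assumes strictly positive integers).
n_digits = np.floor(np.log10(cuenta)) + 1

# Drop everything after the fourth digit, then keep the last two of those four.
first_4 = cuenta.floordiv(10 ** (n_digits - 4))
arithmetic_mask = first_4.mod(100).eq(5)

# Same test done on the text representation.
slicing_mask = cuenta.astype(str).str[2:4] == "05"

print((arithmetic_mask == slicing_mask).all())
```

Both masks agree on these values, including the 5-digit one, which the fixed-width shortcut //10000%100 would not handle.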

Answered By: mozway
