Case insensitive regular expression without re.compile?

Question:

In Python, I can compile a regular expression to be case-insensitive using re.compile:

>>> s = 'TeSt'
>>> casesensitive = re.compile('test')
>>> ignorecase = re.compile('test', re.IGNORECASE)
>>> 
>>> print casesensitive.match(s)
None
>>> print ignorecase.match(s)
<_sre.SRE_Match object at 0x02F0B608>

Is there a way to do the same without using re.compile? I can’t find anything like Perl’s i suffix (e.g. m/test/i) in the documentation.

Asked By: Mat

||

Answers:

Pass re.IGNORECASE to the flags param of search, match, or sub:

re.search('test', 'TeSt', re.IGNORECASE)
re.match('test', 'TeSt', re.IGNORECASE)
re.sub('test', 'xxxx', 'Testing', flags=re.IGNORECASE)
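
A minimal sketch of the three calls side by side, assuming a made-up sample string. Note that re.sub has the signature sub(pattern, repl, string, count=0, flags=0), so the flag should be passed by keyword there; passed as the fourth positional argument it would be read as count.

```python
import re

text = "Testing TEST test"  # sample input, chosen for illustration

# search finds the first match anywhere in the string.
found = re.search("test", text, re.IGNORECASE)
print(found.group())  # Test

# match only succeeds at the very start of the string.
starts_with_test = re.match("test", text, re.IGNORECASE) is not None
print(starts_with_test)  # True

# sub: pass flags by keyword, because the fourth positional slot is count.
replaced = re.sub("test", "xxxx", text, flags=re.IGNORECASE)
print(replaced)  # xxxxing xxxx xxxx
```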
Answered By: Michael Haren

You can also perform case-insensitive searches with search/match without the IGNORECASE flag, by putting the inline flag (?i) at the start of the pattern (tested in Python 2.7.3):

re.search(r'(?i)test', 'TeSt').group()    ## returns 'TeSt'
re.match(r'(?i)test', 'TeSt').group()     ## returns 'TeSt'
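
The inline form is mostly useful when only the pattern string can be supplied, for example when it comes from a configuration file. As a rough check that the two spellings are equivalent, the sketch below compiles both and inspects the flags attribute; the variable names are illustrative only.

```python
import re

# Inline flag: the pattern string itself carries the case-insensitive setting.
inline = re.compile(r"(?i)test")
# Explicit flag: the same setting passed as an argument.
explicit = re.compile(r"test", re.IGNORECASE)

# Both compiled patterns report the IGNORECASE bit in .flags.
inline_has_flag = bool(inline.flags & re.IGNORECASE)
explicit_has_flag = bool(explicit.flags & re.IGNORECASE)
print(inline_has_flag, explicit_has_flag)  # True True

# Both match the same mixed-case input.
print(inline.match("TeSt").group(), explicit.match("TeSt").group())  # TeSt TeSt
```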
Answered By: aem999

You can also request case-insensitive matching when compiling the pattern:

pattern = re.compile('FIle:/+(.*)', re.IGNORECASE)
Answered By: panofish

# re.IGNORECASE makes matching case-insensitive; its short form is re.I.
# re.match only matches at the start of the string.
# re.search scans the string and returns the first location where the pattern matches.
# re.compile creates a regex object that can be reused for multiple matches.

>>> s = r'TeSt'
>>> print(re.match(s, r'test123', re.I))
<_sre.SRE_Match object; span=(0, 4), match='test'>
# OR
>>> pattern = re.compile(s, re.I)
>>> print(pattern.match(r'test123'))
<_sre.SRE_Match object; span=(0, 4), match='test'>
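
To make the difference between match and search concrete, here is a small sketch with an input where the word does not sit at the start of the string. The sample text is invented for illustration.

```python
import re

text = "Unit TEST passed"

# match is anchored at position 0, so "test" is not found here.
at_start = re.match("test", text, re.I)
# search scans the whole string and finds "TEST" further in.
anywhere = re.search("test", text, re.I)

print(at_start)          # None
print(anywhere.span())   # (5, 9)
print(anywhere.group())  # TEST
```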
Answered By: jackotonye

The case-insensitive marker, (?i), can be incorporated directly into the regex pattern:

>>> import re
>>> s = 'This is one Test, another TEST, and another test.'
>>> re.findall('(?i)test', s)
['Test', 'TEST', 'test']
Answered By: Raymond Hettinger

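In imports:

import re

In run-time processing:

RE_TEST = r'test'
if re.match(RE_TEST, 'TeSt', re.IGNORECASE):
    ...  # handle the match

It should be mentioned that not using re.compile repeats work on every call. The re module caches recently compiled patterns, so the expression is not fully recompiled each time, but every call to the module-level match function still has to look the pattern up. The same advice applies in many other programming languages. For a pattern that is used repeatedly, the approach below is the better practice.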

In app initialization:

self.RE_TEST = re.compile('test', re.IGNORECASE)

In run time processing:

if self.RE_TEST.match('TeSt'):
    ...  # handle the match
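
For code that is not organized as a class, the same compile-once idea can be sketched at module level. The helper name count_matching and the sample lines are hypothetical; only RE_TEST mirrors the name used above.

```python
import re

# Compiled once, when the module is imported.
RE_TEST = re.compile(r"test", re.IGNORECASE)

def count_matching(lines):
    """Count the lines that start with 'test' in any letter case."""
    return sum(1 for line in lines if RE_TEST.match(line))

sample = ["TeSt one", "testing", "not a test", "TEST"]
print(count_matching(sample))  # 3
```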
Answered By: Douglas Daseeco

To perform case-insensitive operations, supply re.IGNORECASE:

>>> import re
>>> test = 'UPPER TEXT, lower text, Mixed Text'
>>> re.findall('text', test, flags=re.IGNORECASE)
['TEXT', 'text', 'Text']

And if we want to replace text while matching the case of each occurrence:

>>> def matchcase(word):
        def replace(m):
            text = m.group()
            if text.isupper():
                return word.upper()
            elif text.islower():
                return word.lower()
            elif text[0].isupper():
                return word.capitalize()
            else:
                return word
        return replace

>>> re.sub('text', matchcase('word'), test, flags=re.IGNORECASE)
'UPPER WORD, lower word, Mixed Word'
Answered By: this.srivastava

It is possible to replace text while keeping the casing of the original string.

For example, to highlight every occurrence of "test" in the string “test asdasd TEST asd tEst asdasd”:

import re

sentence = "test asdasd TEST asd tEst asdasd"
result = re.sub(
  '(test)', 
  r'<b>\1</b>',  # \1 here refers to the first matching group.
  sentence, 
  flags=re.IGNORECASE)

<b>test</b> asdasd <b>TEST</b> asd <b>tEst</b> asdasd
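
As a variation, the replacement string can refer to the whole match with \g<0>, in which case the pattern needs no capturing group. This is a sketch of an alternative, not the method used in the answer; the variable name highlighted is illustrative.

```python
import re

sentence = "test asdasd TEST asd tEst asdasd"

# \g<0> stands for the entire match, so no parentheses are needed in the pattern.
highlighted = re.sub(r"test", r"<b>\g<0></b>", sentence, flags=re.IGNORECASE)
print(highlighted)
# <b>test</b> asdasd <b>TEST</b> asd <b>tEst</b> asdasd
```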

Answered By: Dat

For a case-insensitive regular expression (regex), there are two options you can add to your code:

  1. flags=re.IGNORECASE

    Regx3GList = re.search(r"(WCDMA:)((\d*)(,?))*", txt, re.IGNORECASE)
    
  2. The case-insensitive marker (?i)

    Regx3GList = re.search(r"(?i)(WCDMA:)((\d*)(,?))*", txt)
    

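(?i) turns on the i (ignore case) modifier for the pattern, so letters match regardless of case. In Python, put it at the start of the pattern; since Python 3.11, placing it anywhere else is an error. It also works wherever a regex string is accepted, for example in pandas: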

>>> import pandas as pd
>>> s = pd.DataFrame({ 'a': ["TeSt"] })
>>> r = s.replace(to_replace=r'(?i)test', value=r'TEST', regex=True)
>>> print(r)
      a
0  TEST
Answered By: Ax_

I would recommend using (?i:string_region_to_ignore_case) rather than (?i). This form limits case insensitivity to selected parts of the pattern, which is more precise and still easy to read. It requires Python 3.6 or later. For instance:

rex = re.findall(r'J(?i:ohn) S(?i:mith)',
                 "John smith ; JOHN SMITH; john Smith; John Smith")
# Result:
['JOHN SMITH', 'John Smith']
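
The scoped syntax also works in the other direction: (?-i:...) switches case sensitivity back on for one region of an otherwise case-insensitive pattern. A minimal sketch, assuming Python 3.6 or later and an invented sample string:

```python
import re

text = "HELLO World; hello world; Hello World"

# (?i) makes the whole pattern case-insensitive;
# (?-i:...) makes "World" case-sensitive again.
matches = re.findall(r"(?i)hello (?-i:World)", text)
print(matches)  # ['HELLO World', 'Hello World']
```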
Answered By: AndreyS Scherbakov