Python regex search and identify the first occurrence

Question:

I have a SQL string, and I need to extract the first occurrence of the database name and table name from it.

sql = 'select col1, col2, "base" as db_name, "employee" as table_name from base.employee where id is not NULL union select col1, col2, "base" as db_name, "employee" as table_name from base.employee where ts is not NULL'

result_dbname = re.search(',?"(.*)as db_name', sql)

db_name = result_dbname.group(1).replace('"', "")

print(db_name)

Expected result
base
Actual Result
base as db_name, employee as table_name from base.employee where id is not NULL union select col1, col2, base

I would like to capture only the first occurrence

Asked By: Nats


Answers:

It’s not super pretty, but you could use:

re.findall(r',?"\w+" as db_name', sql)[0].split('"')[1]

This just splits the string into three strings:

>>> '"base" as db_name'.split('"')
['', 'base', ' as db_name']

So you take the element at index 1, which in this case is base.
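As a minimal sketch of the same idea with a guard added, the snippet below checks that findall returned something before indexing into it. The variable name matches and the None fallback are assumptions for illustration, not part of the original answer.

```python
import re

sql = ('select col1, col2, "base" as db_name, "employee" as table_name '
       'from base.employee where id is not NULL '
       'union select col1, col2, "base" as db_name, "employee" as table_name '
       'from base.employee where ts is not NULL')

# findall returns every non-overlapping match, scanning left to right,
# so the first list element is the first occurrence in the string.
matches = re.findall(r'"\w+" as db_name', sql)

# Indexing an empty list raises IndexError, so fall back to None
# (a hypothetical choice) when the pattern is not found.
db_name = matches[0].split('"')[1] if matches else None
print(db_name)
```

If only the first occurrence is ever needed, re.search would avoid building the full list of matches.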

Answered By: kykyi

You can try using capture groups:

m = re.match('.*(".*") as db_name, (".*") as table_name.*', sql)
m.groups()
# ('"base"', '"employee"')

Then you can strip the quotation marks.
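One possible way to strip the quotation marks is str.strip('"'), sketched here on a tuple copied from the m.groups() output shown above. The names groups, db_name, and table_name are only illustrative.

```python
# Tuple copied from the m.groups() output above; in real code it would
# come from the match object itself.
groups = ('"base"', '"employee"')

# str.strip('"') removes leading and trailing double quotes only,
# leaving any quote characters inside the value untouched.
db_name, table_name = (g.strip('"') for g in groups)
print(db_name, table_name)
```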

Answered By: TYZ

If you want the first occurrence, you can start with a non-greedy quantifier.

Then use two capture groups, each with a negated character class so the match cannot cross a double quote, and capture only what is between the double quotes.

^.*?"([^"]*)" as db_name, "([^"]*)" as table_name\b


import re

pattern = r'^.*?"([^"]*)" as db_name, "([^"]*)" as table_name\b'
s = 'select col1, col2, "base" as db_name, "employee" as table_name from base.employee where id is not NULL union select col1, col2, "base" as db_name, "employee" as table_name from base.employee where ts is not NULL'
m = re.match(pattern, s)
if m:
    print(m.groups())

Output

('base', 'employee')
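A possible variation, offered as a sketch rather than as part of the answer above: since re.search returns the leftmost match, the leading ^.*? can be dropped when switching from re.match to re.search. The second SELECT below uses made-up names (base2, employee2) purely to make it visible that the first occurrence is the one captured.

```python
import re

# The second SELECT uses hypothetical names so the two occurrences differ.
sql = ('select col1, col2, "base" as db_name, "employee" as table_name '
       'from base.employee where id is not NULL '
       'union select col1, col2, "base2" as db_name, "employee2" as table_name '
       'from base2.employee2 where ts is not NULL')

# [^"]* cannot cross a double quote, so each group stays inside one
# quoted value; \b stops table_name from matching a longer identifier.
pattern = re.compile(r'"([^"]*)" as db_name, "([^"]*)" as table_name\b')

# search scans left to right and returns the first (leftmost) match.
m = pattern.search(sql)
if m:
    db_name, table_name = m.groups()
    print(db_name, table_name)
```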
Answered By: The fourth bird