Parse URL with a regex in Python

Question:

I want to extract the query names and their values from a URL and display them.
For example, url='http://host:port_num/file/path/file1.html?query1=value1&query2=value2'

From this, I want to parse the query names and their values and print them.

Asked By: Myjab


Answers:

Don’t use a regex! Use urlparse.

>>> import urlparse  # Python 2; on Python 3 use: from urllib import parse as urlparse
>>> urlparse.parse_qs(urlparse.urlparse(url).query)
{'query2': ['value2'], 'query1': ['value1']}
Answered By: teukkam
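
To print each name and value as the question asks, a minimal sketch for Python 3 might look like the following. It assumes the Python 3 location of these functions (`urllib.parse`) and uses a made-up example URL with a numeric port, so adjust both to your setup.

```python
from urllib.parse import urlparse, parse_qs

# Example URL (an assumption for illustration; substitute your own).
url = 'http://www.example.com:8080/file/path/file1.html?query1=value1&query2=value2'

# parse_qs maps each query name to a list of values,
# because the same name may appear more than once in a query string.
params = parse_qs(urlparse(url).query)

for name, values in params.items():
    for value in values:
        print(name, value)
```

Each value is a list because a name can repeat (for example `?a=1&a=2`); if you prefer a flat list of pairs in their original order, `parse_qsl` returns one.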

I agree that it’s best not to use a regular expression and better to use urlparse, but here is my regular expression.

Modules like urlparse were developed specifically to handle URLs and are much more reliable than a regular expression, so use them if you can.

>>> import re
>>> x = 'http://www.example.com:8080/abcd/dir/file1.html?query1=value1&query2=value2'
>>> query_pattern = r'(query\d+)=(\w+)'
>>> # query_pattern = r'(\w+)=(\w+)'    a more general pattern
>>> re.findall(query_pattern, x)
[('query1', 'value1'), ('query2', 'value2')]
Answered By: jamylak
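
To illustrate why the regex is the less reliable option, here is a rough comparison on a URL whose values contain characters outside `\w`. The URL and variable names are invented for this sketch, and it assumes Python 3's `urllib.parse`.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Hypothetical URL with a percent-encoded space and a hyphen in the values.
url = 'http://www.example.com:8080/search?name=John%20Doe&tag=a-b'

# The general regex stops at the first character that is not a word character,
# so values are truncated at '%' and '-'.
regex_pairs = re.findall(r'(\w+)=(\w+)', url)

# parse_qsl splits on '&' and '=' and decodes percent-escapes.
parsed_pairs = parse_qsl(urlsplit(url).query)

print(regex_pairs)   # [('name', 'John'), ('tag', 'a')]
print(parsed_pairs)  # [('name', 'John Doe'), ('tag', 'a-b')]
```

The pattern could be widened to cover more characters, but it would still need separate handling for decoding, repeated names, and empty values, which the standard library already provides.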