Parse HTTP GET and POST parameters from BaseHTTPRequestHandler?

Question:

BaseHTTPRequestHandler from the BaseHTTPServer module doesn’t seem to provide any convenient way to access HTTP request parameters. What is the best way to parse the GET parameters from the path, and the POST parameters from the request body?

Right now, I’m using this for GET:

def do_GET(self):
    parsed_path = urlparse.urlparse(self.path)
    try:
        params = dict([p.split('=') for p in parsed_path[4].split('&')])
    except:
        params = {}

This works for most cases, but I’d like something more robust that handles encodings and cases like empty parameters properly. Ideally, I’d like something small and standalone, rather than a full web framework.

Asked By: ataylor


Answers:

You could try Werkzeug. The base library isn’t too large, and if needed you can simply extract the relevant bit of code and you’re done.

The url_decode method returns a MultiDict and has encoding support :)

As opposed to the urlparse.parse_qs method, the Werkzeug version takes care of:

  • encoding
  • multiple values
  • sort order

If you have no need for these (or, in the case of encoding, you use Python 3), then feel free to use the built-in solutions.
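
For comparison, here is a minimal sketch of the built-in route, assuming Python 3. parse_qsl returns the pairs as a list, so repeated keys and the original order are both kept, and names and values are percent-decoded as UTF-8 by default. The sample query string is made up for illustration.

```python
from urllib.parse import parse_qsl

# Illustrative query string: a repeated key, a non-ASCII name, and "+" for a space.
pairs = parse_qsl('b=2&a=1&b=3&caf%C3%A9=au+lait', keep_blank_values=True)

# Each pair appears in the order it was sent, duplicates included.
for name, value in pairs:
    print(name, '=', value)
```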

Answered By: Wolph

Have you investigated using libraries like CherryPy? They provide a much quicker path to handling these things than BaseHTTPServer.

Answered By: Benno

Basic support for HTTP request parameters is provided by the cgi module.
The recommended mechanism for handling form data is the cgi.FieldStorage class.

To get at submitted form data, it’s best to use the FieldStorage class. The other classes defined in this module are provided mostly for backward compatibility. Instantiate it exactly once, without arguments. This reads the form contents from standard input or the environment (depending on the value of various environment variables set according to the CGI standard). Since it may consume standard input, it should be instantiated only once.

The FieldStorage instance can be indexed like a Python dictionary. It allows membership testing with the in operator, and also supports the standard dictionary method keys() and the built-in function len(). Form fields containing empty strings are ignored and do not appear in the dictionary; to keep such values, provide a true value for the optional keep_blank_values keyword parameter when creating the FieldStorage instance.

For instance, the following code (which assumes that the Content-Type header and blank line have already been printed) checks that the fields name and addr are both set to a non-empty string:

import cgi

form = cgi.FieldStorage()
if "name" not in form or "addr" not in form:
    print "<H1>Error</H1>"
    print "Please fill in the name and addr fields."
else:
    print "<p>name:", form["name"].value
    print "<p>addr:", form["addr"].value
# ...further form processing here...

Answered By: gimel

You may want to use urllib.parse:

>>> from urllib.parse import urlparse, parse_qs
>>> url = 'http://example.com/?foo=bar&one=1'
>>> parse_qs(urlparse(url).query)
{'foo': ['bar'], 'one': ['1']}

For Python 2, the module is named urlparse instead of urllib.parse.
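
Applied to the GET case from the question, this might look like the following minimal sketch (Python 3). The helper name query_params and the sample path are hypothetical. Passing keep_blank_values=True keeps empty parameters, which parse_qs drops by default.

```python
from urllib.parse import urlparse, parse_qs


def query_params(path):
    """Return the query parameters of a request path as a dict of lists.

    query_params is a hypothetical helper, not part of the standard library.
    """
    query = urlparse(path).query
    # keep_blank_values=True keeps parameters such as "empty=" instead of
    # silently dropping them.
    return parse_qs(query, keep_blank_values=True)


# Illustrative path with a repeated key, an empty value, and an encoded name.
params = query_params('/search?foo=bar&foo=baz&empty=&name=J%C3%BCrgen+K')
print(params)
```

Inside do_GET, the same helper would be called as query_params(self.path).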

Answered By: zag

Better solution to an old question:

def do_POST(self):
    # Python 2: read exactly Content-Length bytes, then decode the urlencoded body
    length = int(self.headers.getheader('content-length', 0))
    field_data = self.rfile.read(length)
    fields = urlparse.parse_qs(field_data)

This pulls the urlencoded POST data out of the request body and parses it into a dict, with proper URL decoding.
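
A rough Python 3 equivalent might look like the sketch below. It assumes the body is application/x-www-form-urlencoded (it does not handle multipart/form-data), and the names FormHandler and parse_urlencoded are hypothetical. Echoing the parsed fields back is only there to make the result visible.

```python
from http.server import BaseHTTPRequestHandler
from urllib.parse import parse_qs


def parse_urlencoded(body):
    """Decode a urlencoded request body (bytes) into a dict of lists."""
    return parse_qs(body.decode('utf-8'), keep_blank_values=True)


class FormHandler(BaseHTTPRequestHandler):  # hypothetical handler name
    def do_POST(self):
        # A missing Content-Length header is treated as an empty body.
        length = int(self.headers.get('Content-Length', 0))
        fields = parse_urlencoded(self.rfile.read(length))
        # Echo the parsed fields back as plain text.
        reply = repr(fields).encode('utf-8')
        self.send_response(200)
        self.send_header('Content-Type', 'text/plain; charset=utf-8')
        self.send_header('Content-Length', str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)
```

To try it, the handler could be served with http.server.HTTPServer(('127.0.0.1', 8000), FormHandler).serve_forever(); the address and port are arbitrary.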

Answered By: Mike