Python: Argument Parsing Validation Best Practices
Question:
Is it possible when using the argparse module to add validation when parsing arguments?
from argparse import ArgumentParser
parser = ArgumentParser(description='Argument parser for PG restore')
parser.add_argument('--database', dest='database',
default=None, required=False, help='Database to restore')
parser.add_argument('--backup', dest='backup',
required=True, help='Location of the backup file')
parsed_args = parser.parse_args()
Would it be possible to add a validation check to this argument parser to make sure the backup file and database exist, rather than adding an extra check after parsing for every parameter, such as:
from os.path import exists
if not database_exists(parsed_args.database):
    raise DatabaseNotFoundError
if not exists(parsed_args.backup):
    raise FileNotFoundError
Answers:
Sure! You just have to specify a custom action as a class and override __call__(..). See the argparse documentation.
Something like:
import argparse

class FooAction(argparse.Action):
    def __call__(self, parser, namespace, values, option_string=None):
        if values != "bar":
            print("Got value:", values)
            raise ValueError("Not a bar!")
        setattr(namespace, self.dest, values)

parser = argparse.ArgumentParser()
parser.add_argument("--foo", action=FooAction)
parsed_args = parser.parse_args()
In your particular case, I imagine you'd have DatabaseAction and FileAction (or something like that).
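As a rough sketch of what such a FileAction might look like (the class name, the message text, and the build_parser helper are my own placeholders, not anything argparse provides), something along these lines should work. It raises argparse.ArgumentError rather than ValueError, so argparse prints usage and an error naming the option instead of a traceback:

```python
import argparse
import os


class FileAction(argparse.Action):
    """Hypothetical action that accepts only paths to existing files."""

    def __call__(self, parser, namespace, values, option_string=None):
        if not os.path.isfile(values):
            # ArgumentError is caught by the parser, which then prints
            # usage plus a message that names the offending option.
            raise argparse.ArgumentError(self, f"file not found: {values}")
        setattr(namespace, self.dest, values)


def build_parser():
    # Hypothetical helper so the parser can be built without parsing sys.argv.
    parser = argparse.ArgumentParser(description="Argument parser for PG restore")
    parser.add_argument("--backup", dest="backup", action=FileAction,
                        required=True, help="Location of the backup file")
    return parser
```

Calling build_parser().parse_args(["--backup", "some/missing/file"]) should then exit with status 2 after printing the usage line. A DatabaseAction would follow the same pattern, with your own database_exists check in place of os.path.isfile.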
argparse.FileType is a type factory class that can open a file and, in the process, raise an error if the file does not exist or cannot be created. You could look at its code to see how to create your own class (or function) to test your inputs.
The argument type parameter is a callable (function, etc.) that takes a string, tests it as needed, and converts it (as needed) into the kind of value you want to save to the args namespace. So it can do any kind of testing you want. If the type raises an error, then the parser creates an error message (and usage) and exits.
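To make the "tests it and converts it" part concrete, here is a minimal sketch of a type callable for the backup argument. The name existing_path and the message wording are assumptions of mine; only argparse.ArgumentTypeError and pathlib.Path come from the standard library:

```python
import argparse
from pathlib import Path


def existing_path(text):
    """Hypothetical type callable: test the string, then convert it to a Path."""
    path = Path(text)
    if not path.is_file():
        # The parser turns this into a usage message and exits.
        raise argparse.ArgumentTypeError(f"no such file: {text}")
    return path


parser = argparse.ArgumentParser(description="Argument parser for PG restore")
parser.add_argument("--backup", dest="backup", type=existing_path,
                    required=True, help="Location of the backup file")
```

After a successful parse, args.backup is a Path object rather than a plain string, so the rest of the program gets a value that has already been checked once.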
Now whether that's the right place to do the testing or not depends on your situation. Sometimes opening a file with FileType is fine, but then you have to close it yourself, or wait for the program to end. You can't use that open file in a with open(filename) as f: context. The same could apply to your database. In a complex program you may not want to open or create the file right away.
For a Python bug/issue, I wrote a variation on FileType that created a context, an object that could be used in a with statement. I also used os tests to check whether the file existed or could be created, without actually doing so. But it required further tricks if the file was stdin/stdout, which you don't want to close. Sometimes trying to do things like this in argparse is just more work than it's worth.
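This is not the code from that bug/issue, just a small sketch of the general idea, assuming plain os.path and os.access checks are good enough for your case. The names can_create and output_path are made up for illustration, and "-" is treated as a stand-in for stdout:

```python
import argparse
import os


def can_create(path):
    """Guess whether `path` could be written, without creating or opening it."""
    if os.path.exists(path):
        # An existing target must be a regular file we are allowed to write.
        return os.path.isfile(path) and os.access(path, os.W_OK)
    # Otherwise the parent directory must exist and allow adding entries.
    parent = os.path.dirname(os.path.abspath(path))
    return os.path.isdir(parent) and os.access(parent, os.W_OK | os.X_OK)


def output_path(text):
    """Hypothetical type callable for an output file argument."""
    if text == "-":
        # Conventional stand-in for stdout; pass it through untouched so the
        # caller can decide not to open (or close) anything.
        return text
    if not can_create(text):
        raise argparse.ArgumentTypeError(f"cannot create file: {text}")
    return text
```

Keep in mind that these checks are only advisory: the file or directory can change between parsing and the moment you actually open it, so the later open() still needs its own error handling.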
Anyway, if you have an easy testing method, you could wrap it in a simple type function like this:
def database(astring):
    # database_exists is your own check from the question; it is not defined here
    if not database_exists(astring):
        raise ValueError  # or TypeError, or argparse.ArgumentTypeError
    return astring

parser.add_argument('--database', dest='database',
                    type=database,
                    default=None, required=False, help='Database to restore')
I don't think it matters a whole lot whether you implement testing like this in the type or Action. I think the type is simpler and more in line with the developer's intentions.
This is a better version of https://stackoverflow.com/a/37471954/1338570
I could not explain the differences well in a one-line comment. Raising a ValueError will cause a traceback in the terminal.
Instead of raising a ValueError, you should call parser.error with a message, like so:
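from argparse import Action
from validators.url import url

class ValidateUrl(Action):
    def __call__(self, parser, namespace, values, option_string=None):
        # values is a list here because the argument is declared with nargs="+";
        # without nargs it would be a single string and the loop would walk its characters
        for value in values:
            if url(value) is not True:
                parser.error(f"Please enter a valid url. Got: {value}")
        setattr(namespace, self.dest, values)

# In your parser code:
parser.add_argument("-u", "--url", dest="url", nargs="+", action=ValidateUrl, help="A url to download")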
With this script I can test the proposed alternatives.
import argparse

class ValidateUrl(argparse.Action):
    def __call__(self, parser, namespace, values, option_string=None):
        if values != "bar":
            parser.error(f"Please enter a valid. Got: {values}")
        setattr(namespace, self.dest, values)

class FooAction(argparse.Action):
    def __call__(self, parser, namespace, values, option_string=None):
        if values != "bar":
            print("Got value:", values)
            #raise ValueError("Not a bar!") # shows a traceback, not usage
            raise argparse.ArgumentError(self, 'Not a bar')
        setattr(namespace, self.dest, values)

def database(astring):
    if astring != "bar":
        #raise argparse.ArgumentTypeError("not a bar") # custom message
        raise ValueError('not a bar') # standard error
        # error: argument --data: invalid database value: 'xxx'
    return astring

parser = argparse.ArgumentParser()
parser.add_argument("--url", action=ValidateUrl)
parser.add_argument("--foo", action=FooAction)
parser.add_argument('--data', type=database)

if __name__ == '__main__':
    args = parser.parse_args()
    print(args)
A working case:
1254:~/mypy$ python3 stack37471636.py --url bar --foo bar --data bar
Namespace(data='bar', foo='bar', url='bar')
Errors
Usage and exit for the parser.error case:
1255:~/mypy$ python3 stack37471636.py --url xxx
usage: stack37471636.py [-h] [--url URL] [--foo FOO] [--data DATA]
stack37471636.py: error: Please enter a valid. Got: xxx
The standardized message from a ValueError in the type function:
1256:~/mypy$ python3 stack37471636.py --data xxx
usage: stack37471636.py [-h] [--url URL] [--foo FOO] [--data DATA]
stack37471636.py: error: argument --data: invalid database value: 'xxx'
With ArgumentTypeError, the message is displayed as is:
1246:~/mypy$ python3 stack37471636.py --url bar --foo bar --data xxx
usage: stack37471636.py [-h] [--url URL] [--foo FOO] [--data DATA]
stack37471636.py: error: argument --data: not a bar
FooAction with ArgumentError:
1257:~/mypy$ python3 stack37471636.py --foo xxx
Got value: xxx
usage: stack37471636.py [-h] [--url URL] [--foo FOO] [--data DATA]
stack37471636.py: error: argument --foo: Not a bar
Errors in type get converted to an ArgumentError. Note that ArgumentError identifies the argument. Calling parser.error does not.
If FooAction raises a ValueError, a regular traceback is displayed, without usage.
1246:~/mypy$ python3 stack37471636.py --url bar --foo xxx --data bar
Got value: xxx
Traceback (most recent call last):
  File "stack37471636.py", line 27, in <module>
    args = parser.parse_args()
  File "/usr/lib/python3.8/argparse.py", line 1780, in parse_args
    args, argv = self.parse_known_args(args, namespace)
  File "/usr/lib/python3.8/argparse.py", line 1812, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
  File "/usr/lib/python3.8/argparse.py", line 2018, in _parse_known_args
    start_index = consume_optional(start_index)
  File "/usr/lib/python3.8/argparse.py", line 1958, in consume_optional
    take_action(action, args, option_string)
  File "/usr/lib/python3.8/argparse.py", line 1886, in take_action
    action(self, namespace, argument_values, option_string)
  File "stack37471636.py", line 13, in __call__
    raise ValueError("Not a bar!")
ValueError: Not a bar!
I believe ArgumentError and ArgumentTypeError are the preferred, or at least intended, choices. Auto-generated errors use these.
Usually parser.error is used after parsing, resulting, for example, in:
1301:~/mypy$ python3 stack37471636.py
Namespace(data=None, foo=None, url=None)
usage: stack37471636.py [-h] [--url URL] [--foo FOO] [--data DATA]
stack37471636.py: error: not a bar
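For reference, a post-parsing check of that kind might look roughly like the following. This is only a sketch: the parse wrapper is a hypothetical helper, and the "bar" test stands in for whatever real check you need:

```python
import argparse


def parse(argv=None):
    """Hypothetical wrapper: parse first, then validate the finished namespace."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--data")
    args = parser.parse_args(argv)
    if args.data != "bar":
        # Prints usage and the message, then exits with status 2.
        # The message does not name any particular argument.
        parser.error("not a bar")
    return args
```

This style is handy when the check depends on more than one argument at once, since the whole namespace is available by then.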