Why does argparse not accept "--" as argument?
Question:
My script takes -d, --delimiter as argument:
parser.add_argument('-d', '--delimiter')
but when I pass it -- as the delimiter, the value comes out empty:
script.py --delimiter='--'
I know -- is special in argument/parameter parsing, but I am using it in the form --option='--' and quoted.
Why does it not work?
I am using Python 3.7.3
Here is test code:
#!/bin/python3
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--delimiter')
parser.add_argument('pattern')
args = parser.parse_args()
print(args.delimiter)
When I run it as script --delimiter=-- AAA, it prints an empty args.delimiter.
Answers:
Existing bug report
Patches have been suggested, but none has been applied: Argparse incorrectly handles '--' as argument to option
Some simple examples:
In [1]: import argparse
In [2]: p = argparse.ArgumentParser()
In [3]: a = p.add_argument('--foo')
In [4]: p.parse_args(['--foo=123'])
Out[4]: Namespace(foo='123')
The unexpected case:
In [5]: p.parse_args(['--foo=--'])
Out[5]: Namespace(foo=[])
A fully quoted value passes through, but I won't get into how you might achieve this via a shell call:
In [6]: p.parse_args(['--foo="--"'])
Out[6]: Namespace(foo='"--"')
'--' as a separate string:
In [7]: p.parse_args(['--foo','--'])
usage: ipython3 [-h] [--foo FOO]
ipython3: error: argument --foo: expected one argument
...
Another example of the double quote:
In [8]: p.parse_args(['--foo','"--"'])
Out[8]: Namespace(foo='"--"')
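To check whether a given interpreter shows this behaviour, a small probe along these lines may help. It is only a sketch: the helper names parse_foo and delimiter_bug_present are made up for illustration, and it assumes that an unaffected argparse stores the string '--' for --foo=--.

```python
import argparse


def parse_foo(argv):
    """Parse argv with a single --foo option and return the stored value."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--foo')
    return parser.parse_args(argv).foo


def delimiter_bug_present():
    """Hypothetical helper: True if --foo=-- loses its value here."""
    # Affected versions store [] instead of the string '--'.
    return parse_foo(['--foo=--']) != '--'


if __name__ == '__main__':
    print('value for --foo=123:', repr(parse_foo(['--foo=123'])))
    print('value for --foo=--: ', repr(parse_foo(['--foo=--'])))
    print('affected by the bug:', delimiter_bug_present())
```

The probe deliberately avoids the ['--foo', '--'] form, since that one exits with a usage error rather than returning a value.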
In _parse_known_args, the input is scanned and classified as "O" or "A". The '--' is handled as
# all args after -- are non-options
if arg_string == '--':
    arg_string_pattern_parts.append('-')
    for arg_string in arg_strings_iter:
        arg_string_pattern_parts.append('A')
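As a rough illustration of that first scan, here is a standalone sketch that builds the same kind of pattern string. It is not the argparse code: classify is a made-up name, and the rule "anything else starting with '-' is an option" is a simplifying assumption (the real parser consults the registered option strings, handles negative numbers, and so on).

```python
def classify(arg_strings):
    """Simplified sketch of the first scan: build an 'O'/'A'/'-' pattern."""
    parts = []
    arg_iter = iter(arg_strings)
    for arg in arg_iter:
        if arg == '--':
            # end-of-options marker: record '-' and treat the rest as arguments
            parts.append('-')
            for _ in arg_iter:
                parts.append('A')
        elif arg.startswith('-') and len(arg) > 1:
            # assumption: any other string starting with '-' is an option
            parts.append('O')
        else:
            parts.append('A')
    return ''.join(parts)


if __name__ == '__main__':
    for argv in (['--foo', '123'], ['--foo=--'], ['--foo', '--'], ['--', '--', '--']):
        print(argv, '->', classify(argv))
```

Under this model ['--foo', '--'] becomes 'O-', so there is no 'A' left for --foo to consume, which fits the "expected one argument" error above, while ['--foo=--'] is a single 'O' and never reaches the marker branch at all.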
I think the '--' are stripped out after that, but I haven't found that part of the code yet. I'm also not finding where the '--foo=...' version is handled.
I vaguely recall some bug/issues over the handling of multiple occurrences of '--'. With the migration to GitHub, I'm not following argparse developments as much as I used to.
Edit
_get_values starts with:
def _get_values(self, action, arg_strings):
    # for everything but PARSER, REMAINDER args, strip out first '--'
    if action.nargs not in [PARSER, REMAINDER]:
        try:
            arg_strings.remove('--')
        except ValueError:
            pass
Why that results in an empty list will require more thought and testing.
The '=' is handled in _parse_optional, which is used during the first scan:
# if the option string before the "=" is present, return the action
if '=' in arg_string:
    option_string, explicit_arg = arg_string.split('=', 1)
    if option_string in self._option_string_actions:
        action = self._option_string_actions[option_string]
        return action, option_string, explicit_arg
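The effect of that split can be seen in isolation with a few lines of plain Python. This is a sketch only: split_explicit_arg and the known_options set are stand-ins invented here for the parser's internal lookup table.

```python
def split_explicit_arg(arg_string, known_options):
    """Sketch of the '=' handling: return (option_string, explicit_arg) or None."""
    if '=' in arg_string:
        # split on the first '=' only, so the value may itself contain '='
        option_string, explicit_arg = arg_string.split('=', 1)
        if option_string in known_options:
            return option_string, explicit_arg
    return None


if __name__ == '__main__':
    known = {'--foo'}  # hypothetical set of registered option strings
    print(split_explicit_arg('--foo=--', known))
    print(split_explicit_arg('--foo=a=b', known))
    print(split_explicit_arg('--bar=1', known))
```

So at this stage '--' is still intact as the explicit argument; the value is lost later.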
Old bug issues:
argparse handling multiple "--" in args improperly
argparse: Allow the use of -- to break out of nargs and into subparser
It calls parse_args, which calls parse_known_args, which calls _parse_known_args.
Then, on line 2078 (or something similar), it does this (inside a while loop going through the argument strings):
start_index = consume_optional(start_index)
which calls consume_optional (which makes sense, because it is parsing an optional argument right now), defined earlier in the method _parse_known_args. When given --delimiter='--', it will build this action_tuples:
# if the action expect exactly one argument, we've
# successfully matched the option; exit the loop
elif arg_count == 1:
    stop = start_index + 1
    args = [explicit_arg]
    action_tuples.append((action, args, option_string))
    break
##
## The above code gives you the following:
##
action_tuples=[(_StoreAction(option_strings=['-d', '--delimiter'], dest='delimiter', nargs=None, const=None, default=None, type=None, choices=None, help=None, metavar=None), ['--'], '--delimiter')]
That list is then iterated over, and each entry is fed to take_action on line 2009:
assert action_tuples
for action, args, option_string in action_tuples:
    take_action(action, args, option_string)
return stop
The take_action function will then call self._get_values(action, argument_strings) on line 1918, which, as mentioned in the answer by @hpaulj, removes the --. Then, you're left with the empty list.
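One way to watch this happen, without stepping through argparse.py in a debugger, is to record what reaches _get_values. The sketch below relies on a private method, so treat it as an assumption that the hook exists with this signature in your version; TracingParser and its calls attribute are names made up for illustration.

```python
import argparse


class TracingParser(argparse.ArgumentParser):
    """Hypothetical subclass that records what reaches _get_values."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.calls = []

    def _get_values(self, action, arg_strings):
        # Copy the list first: the quoted code mutates it in place.
        self.calls.append((action.dest, list(arg_strings)))
        return super()._get_values(action, arg_strings)


parser = TracingParser()
parser.add_argument('-d', '--delimiter')
args = parser.parse_args(['--delimiter=--'])
print('received by _get_values:', parser.calls)
print('stored in namespace:    ', repr(args.delimiter))
```

If the recorded input is ['--'] but the stored value is [], the value was dropped inside _get_values, as described above.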
This looks like a bug. You should report it.
This code in argparse.py is the start of _get_values, one of the primary helper functions for parsing values:
if action.nargs not in [PARSER, REMAINDER]:
    try:
        arg_strings.remove('--')
    except ValueError:
        pass
The code receives the -- argument as the single element of a list ['--']. It tries to remove '--' from the list, because when using -- as an end-of-options marker, the '--' string will end up in arg_strings for one of the _get_values calls. However, when '--' is the actual argument value, the code still removes it anyway, so arg_strings ends up being an empty list instead of a single-element list.
The code then goes through an else-if chain for handling different kinds of argument (branch bodies omitted to save space here):
# optional argument produces a default when not present
if not arg_strings and action.nargs == OPTIONAL:
    ...
# when nargs='*' on a positional, if there were no command-line
# args, use the default if it is anything other than None
elif (not arg_strings and action.nargs == ZERO_OR_MORE and
      not action.option_strings):
    ...
# single argument or optional argument produces a single value
elif len(arg_strings) == 1 and action.nargs in [None, OPTIONAL]:
    ...
# REMAINDER arguments convert all values, checking none
elif action.nargs == REMAINDER:
    ...
# PARSER arguments convert all values, but check only the first
elif action.nargs == PARSER:
    ...
# SUPPRESS argument does not put anything in the namespace
elif action.nargs == SUPPRESS:
    ...
# all other types of nargs produce a list
else:
    ...
This code should go through the 3rd branch,
# single argument or optional argument produces a single value
elif len(arg_strings) == 1 and action.nargs in [None, OPTIONAL]:
but because the argument is missing from arg_strings, len(arg_strings) is 0. It instead hits the final case, which is supposed to handle a completely different kind of argument. That branch ends up returning an empty list instead of the '--' string that should have been returned, which is why args.delimiter ends up being an empty list instead of a '--' string.
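The strip-then-branch behaviour can be modelled in a few lines that do not depend on argparse at all. This is a simplified sketch, not the library code: get_values_model is a made-up name, it only covers nargs=None and nargs='?', and it skips type conversion and the default/const handling.

```python
OPTIONAL = '?'  # same spelling argparse uses for nargs='?'


def get_values_model(arg_strings, nargs=None):
    """Simplified model of the quoted logic, for nargs=None or '?' only."""
    arg_strings = list(arg_strings)  # work on a copy
    # strip the first '--' unconditionally, as in the quoted snippet
    try:
        arg_strings.remove('--')
    except ValueError:
        pass
    if not arg_strings and nargs == OPTIONAL:
        return None  # stand-in for the real default/const handling
    elif len(arg_strings) == 1 and nargs in (None, OPTIONAL):
        return arg_strings[0]  # the branch that should have been taken
    else:
        return arg_strings  # fallback branch: always a list


if __name__ == '__main__':
    print(repr(get_values_model(['AAA'])))
    print(repr(get_values_model(['--'])))
    print(repr(get_values_model(['--', '--'])))
```

In this model ['AAA'] yields 'AAA', ['--'] yields [] because the only element is stripped before the length check, and ['--', '--'] (a marker followed by a literal value) yields '--' because list.remove only drops the first occurrence.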
This bug manifests with positional arguments too. For example,
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('a')
parser.add_argument('b')
args = parser.parse_args(["--", "--", "--"])
print(args)
prints
Namespace(a='--', b=[])
because when _get_values handles the b argument, it receives ['--'] as arg_strings and removes the '--'. When handling the a argument, it receives ['--', '--'], representing one end-of-options marker and one actual -- argument value, and it successfully removes the end-of-options marker, but when handling b, it removes the actual argument value.