Python: Typehints for argparse.Namespace objects

Question:

Is there a way to have Python static analyzers (e.g. in PyCharm, other IDEs) pick up on Typehints on argparse.Namespace objects? Example:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--somearg')
parsed = parser.parse_args(['--somearg','someval'])  # type: argparse.Namespace
the_arg = parsed.somearg  # <- Pycharm complains that parsed object has no attribute 'somearg'

If I remove the type declaration in the inline comment, PyCharm doesn’t complain, but it also doesn’t pick up on invalid attributes. For example:

parser = argparse.ArgumentParser()
parser.add_argument('--somearg')
parsed = parser.parse_args(['--somearg','someval'])  # no typehint
the_arg = parsed.somaerg   # <- typo in attribute, but no complaint in PyCharm.  Raises AttributeError when executed.

Any ideas?


Update

Inspired by Austin’s answer below, the simplest solution I could find is one using namedtuples:

import argparse
from collections import namedtuple
ArgNamespace = namedtuple('ArgNamespace', ['some_arg', 'another_arg'])

parser = argparse.ArgumentParser()
parser.add_argument('--some-arg')
parser.add_argument('--another-arg')
parsed = parser.parse_args(['--some-arg', 'val1', '--another-arg', 'val2'])  # type: ArgNamespace

x = parsed.some_arg  # good...
y = parsed.another_arg  # still good...
z = parsed.aint_no_arg  # Flagged by PyCharm!

While this is satisfactory, I still don’t like having to repeat the argument names. If the argument list grows considerably, it will be tedious updating both locations. What would be ideal is somehow extracting the arguments from the parser object like the following:

parser = argparse.ArgumentParser()
parser.add_argument('--some-arg')
parser.add_argument('--another-arg')
MagicNamespace = parser.magically_extract_namespace()
parsed = parser.parse_args(['--some-arg', 'val1', '--another-arg', 'val2'])  # type: MagicNamespace

I haven’t been able to find anything in the argparse module that could make this possible, and I’m still unsure if any static analysis tool could be clever enough to get those values and not bring the IDE to a grinding halt.

Still searching…


Update 2

Per hpaulj’s comment, the closest thing I could find to the method described above that would “magically” extract the attributes of the parsed object is to read the dest attribute from each of the parser’s _actions:

parser = argparse.ArgumentParser()
parser.add_argument('--some-arg')
parser.add_argument('--another-arg')
MagicNamespace = namedtuple('MagicNamespace', [act.dest for act in parser._actions])
parsed = parser.parse_args(['--some-arg', 'val1', '--another-arg', 'val2'])  # type: MagicNamespace

But this still does not cause attribute errors to get flagged in static analysis. The same is true if I pass namespace=MagicNamespace in the parser.parse_args call.
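As a rough sketch of what that list comprehension produces at runtime (assuming the default parser, which adds a -h/--help action automatically): the field list also picks up a 'help' entry, and the namedtuple only exists once the code has run, which is likely why a static analyzer has nothing to inspect.

```python
import argparse
from collections import namedtuple

parser = argparse.ArgumentParser()
parser.add_argument('--some-arg')
parser.add_argument('--another-arg')

# _actions is a private attribute; its first entry is the built-in help action.
dests = [act.dest for act in parser._actions]
MagicNamespace = namedtuple('MagicNamespace', dests)

print(dests)                   # ['help', 'some_arg', 'another_arg']
print(MagicNamespace._fields)  # ('help', 'some_arg', 'another_arg')
```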

Asked By: Billy


Answers:

Consider defining an extension class to argparse.Namespace that provides the type hints you want:

class MyProgramArgs(argparse.Namespace):
    def __init__(self):
        super().__init__()
        self.somearg = 'defaultval'  # type: str

Then use namespace= to pass that to parse_args:

def process_argv():
    parser = argparse.ArgumentParser()
    parser.add_argument('--somearg')
    nsp = MyProgramArgs()
    parsed = parser.parse_args(['--somearg','someval'], namespace=nsp)  # type: MyProgramArgs
    the_arg = parsed.somearg  # <- Pycharm should not complain
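
On Python 3.6+ the same idea can be written with class-level annotations instead of type comments. This is only a sketch: the count argument and the argv parameter are hypothetical additions for illustration. Because the attributes already exist on the namespace, argparse leaves them alone unless the option is actually given, so the class-level values act as the defaults.

```python
import argparse
from typing import List, Optional


class MyProgramArgs(argparse.Namespace):
    # Class-level annotations are what the IDE / type checker reads.
    somearg: str = 'defaultval'
    count: Optional[int] = None  # hypothetical second argument


def process_argv(argv: List[str]) -> MyProgramArgs:
    parser = argparse.ArgumentParser()
    parser.add_argument('--somearg')
    parser.add_argument('--count', type=int)
    # parse_args returns the very instance passed as namespace=.
    return parser.parse_args(argv, namespace=MyProgramArgs())


parsed = process_argv(['--somearg', 'someval', '--count', '3'])
print(parsed.somearg, parsed.count)
```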
Answered By: aghast

I don’t know anything about how PyCharm handles these typehints, but I understand the Namespace code.

argparse.Namespace is a simple class; essentially an object with a few methods that make it easier to view the attributes. For ease of unit testing it also has an __eq__ method. You can read the definition in the argparse.py file.

The parser interacts with the namespace in the most general way possible – with getattr, setattr, hasattr. So you can use almost any dest string, even ones you can’t access with the .dest syntax.
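
A small illustration of that point, using a made-up dest string that is not a valid Python identifier. The value is stored and can be read back, but only through getattr or vars, not with dot syntax:

```python
import argparse

parser = argparse.ArgumentParser()
# 'foo-bar' is a hypothetical dest chosen because args.foo-bar is not valid syntax.
parser.add_argument('--foo', dest='foo-bar')

args = parser.parse_args(['--foo', 'x'])
value = getattr(args, 'foo-bar')
print(value)
print(vars(args))
```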

Make sure you don’t confuse the add_argument type= parameter; that’s a function.

Using your own namespace class (from scratch or subclassed) as suggested in the other answer may be the best option. This is described briefly in the documentation, in the section on the Namespace object. I haven’t seen this done much, though I’ve suggested it a few times to handle special storage needs. So you’ll have to experiment.

If you are using subparsers, a custom Namespace class may break; see http://bugs.python.org/issue27859

Pay attention to handling of defaults. The default default for most argparse actions is None. It is handy to use this after parsing to do something special if the user did not provide this option.

 if args.foo is None:
     # user did not use this optional
     args.foo = 'some post parsing default'
 else:
     # user provided value
     pass

That could get in the way of type hints. Whatever solution you try, pay attention to the defaults.
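
One way this might look with type hints, as a sketch only (the Args class is a hypothetical name): annotate the attribute as Optional, since None is what argparse stores when the option is absent, and narrow it to a plain str once, right after parsing.

```python
import argparse
from typing import Optional


class Args(argparse.Namespace):
    # Optional[...] because the value stays None when --foo is not given.
    foo: Optional[str] = None


parser = argparse.ArgumentParser()
parser.add_argument('--foo')
args = parser.parse_args([], namespace=Args())

# Narrow Optional[str] to str in one place; later code can rely on a str.
foo: str = args.foo if args.foo is not None else 'some post parsing default'
print(foo)
```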


A namedtuple won’t work as a Namespace.

First, the proper use of a custom Namespace class is:

nm = MyClass(<default values>)
args = parser.parse_args(namespace=nm)

That is, you initialize an instance of that class, and pass it as the parameter. The returned args will be the same instance, with new attributes set by parsing.

Second, a namedtuple can only be created; it can’t be changed.

In [72]: MagicSpace=namedtuple('MagicSpace',['foo','bar'])
In [73]: nm = MagicSpace(1,2)
In [74]: nm
Out[74]: MagicSpace(foo=1, bar=2)
In [75]: nm.foo='one'
...
AttributeError: can't set attribute
In [76]: getattr(nm, 'foo')
Out[76]: 1
In [77]: setattr(nm, 'foo', 'one')    # not even with setattr
...
AttributeError: can't set attribute

A namespace has to work with getattr and setattr.
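
To make the consequence concrete, here is a sketch of what happens if a namedtuple instance is handed to parse_args anyway (the parser and its two options are assumed for illustration). The parser tries to setattr the parsed value and the AttributeError propagates:

```python
import argparse
from collections import namedtuple

MagicSpace = namedtuple('MagicSpace', ['foo', 'bar'])

parser = argparse.ArgumentParser()
parser.add_argument('--foo')
parser.add_argument('--bar')

try:
    # The instance is immutable, so storing the parsed value fails.
    parser.parse_args(['--foo', 'one'], namespace=MagicSpace(None, None))
    failed = False
except AttributeError as exc:
    failed = True
    print('parse_args raised AttributeError:', exc)
```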

Another problem with namedtuple is that it doesn’t set any kind of type information. It just defines field/attribute names. So there’s nothing for the static typing to check.

While it is easy to get expected attribute names from the parser, you can’t get any expected types.

For a simple parser:

In [82]: parser.print_usage()
usage: ipython3 [-h] [-foo FOO] bar
In [83]: [a.dest for a in parser._actions[1:]]
Out[83]: ['foo', 'bar']
In [84]: [a.type for a in parser._actions[1:]]
Out[84]: [None, None]

The Action’s dest is the normal attribute name. But type is not the expected static type of that attribute. It is a function that may or may not convert the input string. Here None means the input string is saved as is.
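
A sketch of the same inspection when type= is actually used. The percent function and the argument names are made up for illustration; the point is that type holds a converter callable, and the static type of the result is whatever that callable returns (float here), which the parser itself does not record.

```python
import argparse


def percent(text: str) -> float:
    """Hypothetical converter: '50%' -> 0.5."""
    return float(text.rstrip('%')) / 100


parser = argparse.ArgumentParser()
parser.add_argument('--count', type=int)
parser.add_argument('--ratio', type=percent)
parser.add_argument('--name')

# Skip the built-in help action at index 0, as in the session above.
types = {a.dest: a.type for a in parser._actions[1:]}
args = parser.parse_args(['--count', '3', '--ratio', '50%'])
print(types)
print(args)
```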

Because static typing and argparse require different information, there isn’t an easy way to generate one from the other.

I think the best you can do is create your own database of parameters, probably in a dictionary, and create both the Namespace class and the parser from that, with your own utility function(s).

Let’s say dd is a dictionary with the necessary keys. Then we can create an argument with:

parser.add_argument(dd['short'],dd['long'], dest=dd['dest'], type=dd['typefun'], default=dd['default'], help=dd['help'])

You or someone else will have to come up with a Namespace class definition that sets the default (easy), and static type (hard?) from such a dictionary.
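
One possible direction, sketched under clear assumptions: instead of a dictionary, this swaps in a dataclass as the single "database", because a static checker can read its annotations, and builds the parser from its fields. Args, build_parser and parse are hypothetical names, and the sketch only handles fields whose annotation is itself a usable converter (str, int, float).

```python
import argparse
from dataclasses import MISSING, dataclass, fields
from typing import List


@dataclass
class Args:
    name: str = 'world'
    count: int = 1


def build_parser(cls) -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    for f in fields(cls):
        required = f.default is MISSING
        parser.add_argument(
            '--' + f.name.replace('_', '-'),
            dest=f.name,
            type=f.type,  # assumes the annotation is a callable such as int
            required=required,
            default=None if required else f.default,
        )
    return parser


def parse(argv: List[str]) -> Args:
    # Copy the parsed attributes into the statically typed dataclass.
    return Args(**vars(build_parser(Args).parse_args(argv)))


print(parse(['--count', '3']))
```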

Answered By: hpaulj

Typed argument parser was made for exactly this purpose. It wraps argparse. Your example is implemented as:

from tap import Tap


class ArgumentParser(Tap):
    somearg: str


parsed = ArgumentParser().parse_args(['--somearg', 'someval'])
the_arg = parsed.somearg


It’s on PyPI and can be installed with: pip install typed-argument-parser

Full disclosure: I’m one of the creators of this library.

Answered By: Jesse

If you are in a situation where you can start from scratch there are interesting solutions like

However, in my case they weren’t an ideal solution because:

  1. I have many existing CLIs based on argparse, and I cannot afford to re-write them all using such args-inferred-from-types approaches.
  2. When inferring args from types it can be tricky to support all advanced CLI features that plain argparse supports.
  3. Re-using common arg definitions in multiple CLIs is often easier in plain imperative argparse compared to alternatives.

Therefore I worked on a tiny library, typed_argparse, that lets you introduce typed args without much refactoring. The idea is to add a type derived from a special TypedArgs class, which then simply wraps the plain argparse.Namespace object:

import argparse
import sys
from typing import List, Optional

from typed_argparse import TypedArgs

# Step 1: Add an argument type.
class MyArgs(TypedArgs):
    foo: str
    num: Optional[int]
    files: List[str]


def parse_args(args: List[str] = sys.argv[1:]) -> MyArgs:
    parser = argparse.ArgumentParser()
    parser.add_argument("--foo", type=str, required=True)
    parser.add_argument("--num", type=int)
    parser.add_argument("--files", type=str, nargs="*")
    # Step 2: Wrap the plain argparser result with your type.
    return MyArgs(parser.parse_args(args))


def main() -> None:
    args = parse_args(["--foo", "foo", "--num", "42", "--files", "a", "b", "c"])
    # Step 3: Done, enjoy IDE auto-completion and strong type safety
    assert args.foo == "foo"
    assert args.num == 42
    assert args.files == ["a", "b", "c"]

This approach slightly violates the single-source-of-truth principle, but the library performs a full runtime validation to ensure that the type annotations match the argparse types, and it is just a very simple option to migrate towards typed CLIs.

Answered By: bluenote10

Most of these answers involve using another package to handle the typing. This would be a good idea only if there wasn’t such a simple solution as the one I am about to propose.

Step 1. Type Declarations

First, define the types of each argument in a dataclass like so:

from dataclasses import dataclass

@dataclass
class MyProgramArgs:
    first_var: str
    second_var: int

Step 2. Argument Declarations

Then you can set up your parser however you like with matching arguments. For example:

import argparse

parser = argparse.ArgumentParser(description="This CLI program uses type hints!")
parser.add_argument("-a", "--first-var")
parser.add_argument("-b", "--another-var", type=int, dest="second_var")

Step 3. Parsing the Arguments

And finally, we parse the arguments in a way that the static type checker will know about the type of each argument:

my_args = MyProgramArgs(**vars(parser.parse_args()))

Now the type checker knows that my_args is of type MyProgramArgs so it knows exactly which fields are available and what their type is.
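
Putting the three steps together as one runnable sketch. The parse helper, its argv parameter and required=True are additions for illustration: without required=True an omitted option would arrive as None even though the field is annotated str or int, since dataclasses do not validate types at runtime.

```python
import argparse
from dataclasses import dataclass
from typing import List


@dataclass
class MyProgramArgs:
    first_var: str
    second_var: int


def parse(argv: List[str]) -> MyProgramArgs:
    parser = argparse.ArgumentParser(description="This CLI program uses type hints!")
    parser.add_argument("-a", "--first-var", required=True)
    parser.add_argument("-b", "--another-var", type=int, dest="second_var", required=True)
    # The dest names must match the dataclass field names exactly.
    return MyProgramArgs(**vars(parser.parse_args(argv)))


my_args = parse(["-a", "hello", "-b", "2"])
print(my_args)
```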

Another way to do it which could be ideal if you have few arguments is as follows.

First make a function that sets up the parser and returns the namespace. For example:

def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    parser.add_argument("-a")
    parser.add_argument("-b", type=int)
    return parser.parse_args()

Then you define a main function which takes the args you declared above individually, like so:

def main(a: str, b: int):
    print("hello world", a, b)

And when you call your main, you do it like this:

if __name__ == "__main__":
    main(**vars(parse_args()))

From your main onwards, you’ll have your variables a and b properly recognised by your static type checker, although you won’t have an object any more containing all your arguments, which may be a good or bad thing depending on your use case.

A simple solution is to just type hint the Namespace returned by the parse_args method.

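import argparse


class NameSpace(argparse.Namespace):
    name: str


class CustomParser(argparse.ArgumentParser):
    def parse_args(self, args=None, namespace=None) -> NameSpace:
        # Fill a NameSpace instance so the annotated return type also holds at runtime.
        if namespace is None:
            namespace = NameSpace()
        return super().parse_args(args, namespace)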


parser = CustomParser()

parser.add_argument("--name")

if __name__ == "__main__":
    args = parser.parse_args()
    print(args.name)


Answered By: Jason Leaver