Generating a text representation of Python's AST

Question:

With Clang we can do:

clang -cc1 -ast-dump j.c

TranslationUnitDecl 0x7fbcfc00f608 <<invalid sloc>> <invalid sloc>
|-TypedefDecl 0x7fbcfc00fea0 <<invalid sloc>> <invalid sloc> implicit __int128_t '__int128'
| `-BuiltinType 0x7fbcfc00fba0 '__int128'
|-TypedefDecl 0x7fbcfc00ff08 <<invalid sloc>> <invalid sloc> implicit __uint128_t 'unsigned __int128'
| `-BuiltinType 0x7fbcfc00fbc0 'unsigned __int128'
|-TypedefDecl 0x7fbcfc0101b8 <<invalid sloc>> <invalid sloc> implicit __NSConstantString 'struct __NSConstantString_tag'
| `-RecordType 0x7fbcfc00ffd0 'struct __NSConstantString_tag'
|   `-Record 0x7fbcfc00ff58 '__NSConstantString_tag'
|-TypedefDecl 0x7fbcfc010250 <<invalid sloc>> <invalid sloc> implicit __builtin_ms_va_list 'char *'
| `-PointerType 0x7fbcfc010210 'char *'
|   `-BuiltinType 0x7fbcfc00f6a0 'char'
|-TypedefDecl 0x7fbcfc0104f8 <<invalid sloc>> <invalid sloc> implicit __builtin_va_list 'struct __va_list_tag [1]'
| `-ConstantArrayType 0x7fbcfc0104a0 'struct __va_list_tag [1]' 1
|   `-RecordType 0x7fbcfc010320 'struct __va_list_tag'
|     `-Record 0x7fbcfc0102a0 '__va_list_tag'
|-FunctionDecl 0x7fbcfb844200 <j.c:3:1, line:12:1> line:3:5 main 'int ()'
| `-CompoundStmt 0x7fbcfb8447b8 <col:12, line:12:1>
|   |-DeclStmt 0x7fbcfb844350 <line:4:3, col:8>
|   | `-VarDecl 0x7fbcfb8442f0 <col:3, col:7> col:7 used e 'int'
....

Is there a way to do it with Python’s AST?

I found astdump: https://pypi.org/project/astdump/

But it doesn’t print the literal values stored in the nodes:

>>> import astdump
>>> astdump.indented('2+3')
Module
  Expr
    BinOp
      Num
      Add
      Num

I need to be able to reconstruct the entire code from the AST.

Asked By: Alex

||

Answers:

Update for Python 3.9+: The ast.dump function in the standard library now has an optional keyword argument indent for pretty-printing Python ASTs. Pass either an integer (the number of spaces per nesting level) or a string (the text used to indent each level).
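
As a minimal sketch of the two forms of indent (assuming Python 3.9 or later; the variable names are only illustrative), the '2+3' snippet from the question can be dumped with either an integer or a string. Unlike the astdump output, the literal values appear in the Constant nodes:

```python
import ast

tree = ast.parse("2+3")

# indent as an integer: that many spaces per nesting level
spaces = ast.dump(tree, indent=2)

# indent as a string: the string is repeated once per nesting level
pipes = ast.dump(tree, indent="| ")

print(spaces)
print(pipes)
```

The exact fields printed vary a little between Python versions, so the output is not reproduced here.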


The astpretty library seems to be suitable for your purpose. This library has a pretty-print function pprint which renders the tree structure of an AST including node types and contents in a readable format. You need to combine this with ast.parse from the Python standard library.

The default behaviour of pprint is more verbose, including the line number and column offset of each node, but this can be disabled with the argument show_offsets=False. The usage example below is from the astpretty library’s readme.

>>> astpretty.pprint(ast.parse('x += 5').body[0], show_offsets=False)
AugAssign(
    target=Name(id='x', ctx=Store()),
    op=Add(),
    value=Num(n=5),
)

Note that if you don’t need pretty-printing, the standard library’s ast.dump will work. Its default output is harder to read, since everything is printed on one line with no indentation to show the tree structure:

>>> print(ast.dump(ast.parse('x += 5').body[0]))
AugAssign(target=Name(id='x', ctx=Store()), op=Add(), value=Num(n=5))
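
If the line numbers and column offsets that astpretty shows by default are wanted without the extra dependency, ast.dump has an include_attributes flag that appears to serve the same purpose. A small sketch, where the variable names are only illustrative:

```python
import ast

node = ast.parse("x += 5").body[0]

# Default: node types and fields only
plain = ast.dump(node)

# include_attributes=True adds lineno, col_offset and related attributes
with_positions = ast.dump(node, include_attributes=True)

print(plain)
print(with_positions)
```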
Answered By: kaya3

A separate library for pretty-printing is no longer required, since ast.dump supports an indent argument as of Python 3.9. Here is an example:

>>> import ast
>>> print(ast.dump(ast.parse("print('Hello, world!')"), indent=4))
Module(
    body=[
        Expr(
            value=Call(
                func=Name(id='print', ctx=Load()),
                args=[
                    Constant(value='Hello, world!')],
                keywords=[]))],
    type_ignores=[])
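
The question also asks about reconstructing the code from the AST. One possible approach, assuming Python 3.9 or later, is ast.unparse, which generates source text from a tree. This is only a sketch: comments and the original formatting are not stored in the AST, so the regenerated text is equivalent code rather than a byte-for-byte copy of the input.

```python
import ast

source = "print('Hello, world!')"
tree = ast.parse(source)

# Turn the tree back into source text (Python 3.9+)
regenerated = ast.unparse(tree)

# Check the round trip by comparing dumps of the two trees
same_tree = ast.dump(ast.parse(regenerated)) == ast.dump(tree)

print(regenerated)
print(same_tree)
```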
Answered By: schlöpp