Find the number of lines in Python functions without running the code
Question:
I am trying to create a tool that identifies the number of lines of code in each function in a given Python file. However, I do not want to actually run the code, which is required by something like the inspect library suggested in other posts.
Is there a pre-existing tool I can use to get the source code for each function, or even a given function within the plain text of a .py file? Or will I need to create some custom regex/code for this?
Edit: I want to clarify, since people are suggesting a few things I have already tried. AST lets me find the function name, but not the number of lines in the function, nor the source of the function or anything else that would lead me to it (from what I can see). inspect can do it, but requires running the code (unless I am misunderstanding something).
Answers:
Using the ast module you can handle some cases, but there are many ways for a function to be defined in a module. This example covers the case where the function is defined at module level.
import ast

def get_function_code(source, function_name):
    """Return the source of a module-level function, or None if it is not found."""
    # iterate backwards, as we are interested in the last definition
    for stmt in ast.parse(source).body[::-1]:
        if isinstance(stmt, ast.FunctionDef) and stmt.name == function_name:
            # the function may have decorators, which start above the def line
            if stmt.decorator_list:
                start = min(d.lineno for d in stmt.decorator_list)
            else:
                start = stmt.lineno
            end = stmt.end_lineno  # available since Python 3.8
            return '\n'.join(source.splitlines()[start - 1:end])
    return None

Usage example

source = """import ast
from other_module import f4  # this will not be available

class Class(ast.NodeVisitor):
    def f1(self):
        '''
        This is not a module function
        '''

def f1():
    '''
    This is a module function
    '''

@decorator
def f2():
    '''
    this is a function with a decorator
    '''

f3 = f1  # this cannot be retrieved without running the code
"""
print(get_function_code(source, 'f1'))
print('-------------------------------')
print(get_function_code(source, 'f2'))
print('-------------------------------')
print(get_function_code(source, 'f3'))  # None: f3 is an assignment, not a def
print('-------------------------------')
print(get_function_code(source, 'f4'))  # None: f4 is imported, not defined here
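If you would rather not slice the lines yourself, the standard library's ast.get_source_segment (Python 3.8+) can return the text a node spans. The following is only a sketch: function_sources is a made-up helper name, it uses ast.walk so nested functions and methods are included, and a later function with the same name overwrites an earlier one. Note that the segment starts at the def keyword, so decorator lines are not part of it.

```python
import ast

def function_sources(source):
    """Map each function name to its source text, without executing anything.

    Hypothetical helper for illustration; later definitions with the same
    name overwrite earlier ones.
    """
    sources = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # get_source_segment returns the text covered by the node's
            # position info, which begins at `def`, not at the decorators.
            sources[node.name] = ast.get_source_segment(source, node)
    return sources

example = "@decorator\ndef f2():\n    return 1\n\nx = 2\n"
print(function_sources(example)["f2"])
```

The number of lines is then just len(text.splitlines()) for each returned string.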
A .py file is just a simple text file. You can read it as usual:

def count_lines(file_path):
    with open(file_path) as f:  # Open file
        lines = len(f.readlines())  # Read all lines, it's a list so you can use len()
    return lines  # Return the number of lines

count_lines("./filePath")

- Open the file as usual;
- Read all the lines with .readlines(), which returns a list;
- Use len() to get the length of the list, which corresponds to the total number of lines;
- Return the number of lines.
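The steps above count the lines of the whole file. To get a count per function from the same file text, one option is to hand that text to ast.parse, which does not run it. This is a rough sketch under a few assumptions: count_function_lines is a hypothetical name, the file is UTF-8, Python 3.8+ is used (for end_lineno), and decorator lines are counted as part of the function.

```python
import ast

def count_function_lines(file_path):
    """Return {(name, first_line): physical line count} for every def in the file."""
    with open(file_path, encoding="utf-8") as f:
        source = f.read()  # plain text; nothing is imported or executed
    counts = {}
    for node in ast.walk(ast.parse(source, filename=file_path)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Start at the first decorator, if any, so decorated functions
            # are counted from their topmost line.
            first = min([node.lineno] + [d.lineno for d in node.decorator_list])
            counts[(node.name, first)] = node.end_lineno - first + 1
    return counts
```

Keying on (name, first_line) keeps a method and a module-level function with the same name apart.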
You can use the ast module to walk through the parse tree of a Python program. This requires that the program not have any syntax errors, but it will not attempt to execute the program, so runtime errors won’t hinder the process. (In particular, since it doesn’t execute import statements, it won’t descend into imported modules.)
The following very simple sample assumes that:
- By "lines", you mean the physical lines, as opposed to logical lines. (This code prints the start and end line numbers, but you could change that to node.end_lineno - node.lineno + 1 if you just want the line count.)
- You want to list each function, including nested functions and class members. This code tracks (and prints) class blocks as well as def blocks, indenting nested objects by four spaces (or the value of the indent argument).
- You don’t care about async defs. The only reason I left them out was to save space; there’s really nothing special about them, and you would just have to add one more visit method to capture them.
import ast
class ListVisitor(ast.NodeVisitor):
    def __init__(self, indent=4):
        self.cur = 0
        self.ind = indent
    def visit_FunctionDef(self, node):
        print(f"{'':>{self.cur}}Function {node.name} {node.lineno}->{node.end_lineno}")
        self.cur += self.ind
        self.generic_visit(node)
        self.cur -= self.ind
    def visit_ClassDef(self, node):
        print(f"{'':>{self.cur}}Class {node.name} {node.lineno}->{node.end_lineno}")
        self.cur += self.ind
        self.generic_visit(node)
        self.cur -= self.ind
if __name__ == "__main__":
    from sys import argv
    v = ListVisitor()
    for fn in argv[1:]:
        try:
            with open(fn) as source:
                v.visit(ast.parse(source.read(), filename=fn, mode='exec'))
        except SyntaxError as e:
            print(e)
        except OSError as e:
            print(e)
Here’s what the output looks like when I call it on itself:
$ python3.9 -m listfuncs listfuncs.py
Class ListVisitor 2->15
    Function __init__ 3->5
    Function visit_FunctionDef 6->10
    Function visit_ClassDef 11->15
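If you do care about logical lines rather than physical ones, one possible approach is to combine ast with the tokenize module: tokenize emits a NEWLINE token at the end of each logical line, while blank and comment-only lines produce NL tokens instead. The sketch below is only an illustration; line_counts is a made-up name, it also picks up async defs, it measures from the def line (decorators excluded), and functions sharing a name overwrite each other in the result.

```python
import ast
import io
import tokenize

def line_counts(source):
    """Return {function name: (physical_lines, logical_lines)} without running source."""
    # Row numbers on which a logical line ends.
    newline_rows = [
        tok.start[0]
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type == tokenize.NEWLINE
    ]
    counts = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            physical = node.end_lineno - node.lineno + 1
            # Logical lines whose final row falls inside the function's span;
            # this includes the def line itself and any nested statements.
            logical = sum(node.lineno <= row <= node.end_lineno for row in newline_rows)
            counts[node.name] = (physical, logical)
    return counts

example = (
    "def f(a,\n"
    "      b):\n"
    "    x = (a +\n"
    "         b)\n"
    "    # comment\n"
    "\n"
    "    return x\n"
)
print(line_counts(example))
```

Here the function occupies seven physical lines, but only the def header, the assignment, and the return count as logical lines.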