Regular expression to find function calls in a Python file using the re module?
Question:
I would like a regular expression, which I will use with the Python re module, that finds Python function calls within a Python file. There are some constraints on the function calls I'm looking for:
- The function calls will have a single, specific name.
- The function calls may be chained, but will only have one chained call that will always have the same name.
- The first function will always take a single string argument.
- The chained function, however, might take arbitrary arguments (this is the one that worries me).
Here are example usages of the functions I want to look for within a file:
# Simple function call.
f("_key")
# The chained function call, in the simplest format (no args).
f("_key").g()
# The chained function call with simple arguments.
f("_key").g("hello", 1337)
# The chained function call with possible, more complex arguments
f("_key").g(obj.blah(), {"dog":"cat"})
# And then the possibility for long function calls to extend over more than one line
f("_key").g(
"dogs",
"cats",
{"living":"together"})
And the usual disclaimer: I did a search for this, and the questions were close to mine, but I’m wondering if my needs are constrained enough to get around the “regular vs. irregular” language problem. This is what I get for not being a computer science major and being afraid of regular expressions.
Answers:
This should do what you want:
[a-zA-Z]+\([^)]*\)(\.[^)]*\))?
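A quick way to check this pattern is to run it over the examples from the question. The sketch below assumes the pattern is meant to be read with the literal parentheses and the dot escaped (the backslashes are written out explicitly here); the names PATTERN and SAMPLE are only for illustration. It suggests the pattern copes with the simple cases but stops at the first closing parenthesis, so a nested call inside the arguments cuts the match short.

```python
import re

# Assumed reading of the pattern above, with literal parentheses and dot escaped.
PATTERN = re.compile(r"[a-zA-Z]+\([^)]*\)(\.[^)]*\))?")

# Examples taken from the question.
SAMPLE = '''
f("_key")
f("_key").g()
f("_key").g("hello", 1337)
f("_key").g(obj.blah(), {"dog":"cat"})
'''

matches = [m.group(0) for m in PATTERN.finditer(SAMPLE)]
for text in matches:
    print(text)

# The last line printed is  f("_key").g(obj.blah()
# because [^)]* cannot step over the ")" that closes obj.blah().
```

The first three calls are matched in full; the fourth is truncated, which is the "arbitrary arguments" case the question is worried about.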
FWIW, here is an excerpt from Grammar/Grammar:
decorator: '@' dotted_name [ '(' [arglist] ')' ] NEWLINE
trailer: '(' [arglist] ')' | '[' subscriptlist ']' | '.' NAME
power: atom trailer* ['**' factor]
atom: ('(' [yield_expr|testlist_comp] ')' |
       '[' [listmaker] ']' |
       '{' [dictorsetmaker] '}' |
       '`' testlist1 '`' |
       NAME | NUMBER | STRING+)
arglist: (argument ',')* (argument [',']
                         |'*' test (',' argument)* [',' '**' test]
                         |'**' test)
These are the cases that need to be handled by a regex to capture all function calls without any false positives.
Instead of regexes, perhaps it would be better to leverage one of the toolsets that come with the Python standard library:
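One such toolset is the ast module. The following is a minimal sketch, not a definitive implementation: it assumes Python 3.8 or later (for ast.Constant and ast.get_source_segment), and the helper name find_calls and its parameters are made up for this example. It parses the file and reports every f("...") call, with or without a chained .g(...), so nested and multi-line arguments need no special handling.

```python
import ast

def find_calls(source, first="f", chained="g"):
    """Return source snippets for first("...") and first("...").chained(...)."""
    tree = ast.parse(source)

    def is_first(node):
        # A call to a plain name with exactly one string-literal argument.
        return (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == first
            and len(node.args) == 1
            and not node.keywords
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
        )

    def is_chained(node):
        # A call whose target is the attribute `chained` of a first(...) call.
        return (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == chained
            and is_first(node.func.value)
        )

    found, inner = [], set()
    for node in ast.walk(tree):
        if is_chained(node):
            found.append(node)
            inner.add(id(node.func.value))  # do not report the inner f(...) twice
    for node in ast.walk(tree):
        if is_first(node) and id(node) not in inner:
            found.append(node)

    found.sort(key=lambda n: (n.lineno, n.col_offset))
    return [ast.get_source_segment(source, n) for n in found]

SAMPLE = '''
f("_key")
f("_key").g(obj.blah(), {"dog": "cat"})
f("_key").g(
    "dogs",
    "cats",
    {"living": "together"})
'''

results = find_calls(SAMPLE)
for snippet in results:
    print(snippet)
    print("---")
```

The trade-off is that the file has to be syntactically valid for the interpreter running the script, whereas a regex will happily scan broken or partial source.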
I think this one is better suited, particularly when one function call is nested inside another:
\w+\([^)]*\){1,}(\.[^)]*\))?
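To see what the {1,} buys, the two patterns can be compared on a nested call. This is only a sketch under the assumption that both patterns are meant to be read with escaped parentheses and dot; FIRST, SECOND and first_match are names invented for the comparison.

```python
import re

# Assumed readings of the two patterns from the answers, escapes written out.
FIRST = re.compile(r"[a-zA-Z]+\([^)]*\)(\.[^)]*\))?")
SECOND = re.compile(r"\w+\([^)]*\){1,}(\.[^)]*\))?")

def first_match(pattern, text):
    """Return the text of the first match, or None if nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None

nested = "outer(inner(1))"      # closing parentheses are adjacent
mixed = "outer(inner(1), 2)"    # more arguments follow the nested call

print(first_match(FIRST, nested))   # outer(inner(1)
print(first_match(SECOND, nested))  # outer(inner(1))
print(first_match(SECOND, mixed))   # outer(inner(1)
```

Allowing a run of closing parentheses helps only when they are adjacent; once another argument follows the nested call, the match is still cut at the first closing parenthesis.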