Is it good to define a function in a function python

Question:

Here is the code I wrote in a project.

Which one is a better one to write python?

def get_list_of_university_towns():
    ....
    def parse_state(item):
        return re.sub('[edit]', '', item)
    uni_towns['State'] = uni_towns['State'].apply(parse_state)
    return uni_towns

Or:

def parse_state(item):
    return re.sub('[edit]', '', item)
def get_list_of_university_towns():
    ....
    uni_towns['State'] = uni_towns['State'].apply(parse_state)
    return uni_towns

This “parse_state(item)” function is only called once in “get_list_of_university_towns()” and will be never used again. Personally I think define it inside a function will be easier to understand. However, I barely see this kind of codes in other people’s project.

So, how should I write this piece of code?

Asked By: Phil

||

Answers:

Is it Pythonic to write functions inside functions

Yes, it is. Actually, it is more Pythonic to do it inside than outside in order not to pollute the module namespace.

How should I write this piece of code?

The option with the function definition inside the other function works. Another Pythonic way would be to use an anonymous lambda function:

def get_list_of_university_towns():
    ....
    uni_towns['State'] = uni_towns['State'].apply(lambda item: re.sub('[edit]', '', item))
    return uni_towns

Optimizing

As it has been suggested, and now that you said it is a panda’s dataframe which means that the function will be called more than once, you should either compile the expresion or use str.replace() instead of re.sub():

def get_list_of_university_towns():
    ....
    uni_towns['State'] = uni_towns['State'].apply(lambda item: item.replace('[edit]', ''))
    return uni_towns
Answered By: Adirio

Both are fine. The first one is cleaner as you do not pollute module namespace with a name that is not used elsewhere.

And the first form should be slightly faster as parse_state is a local variable when it is called, especially important if it is used within a loop. There is no runtime cost to defining the function inside another function. At runtime, it is a simple assignment.

However, if it is used inside a loop, you should also compile the regular expression in the module scope:

_state_re = re.compile('[edit]')
def get_list_of_university_towns():
    ...
    def parse_state(item):
        return _state_re.sub('', item)
    uni_towns['State'] = uni_towns['State'].apply(parse_state)
    return uni_towns
Answered By: Jonathan Giddy

I agree with Adirio’s answer if a function is needed.

Alternatively, you could consider whether a function is really necessary if it’s only used once.
If it is possible to iterate over uni_towns[‘State’], this would achieve the same and is in my opinion more readable:

def get_list_of_university_towns():
    ....
    for state in uni_towns['State']:
        state = re.sub('[edit]', '', state)
    return uni_towns
Answered By: Lukas Ansteeg

You don’t need to create another function for what you are trying to achieve. Creating a function outside is better so that you can call it in other functions as well rather than limiting the use of it to only to where it is written.

Answered By: python_user
Categories: questions Tags:
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.