Use variable in Pandas query

Question:

I’m trying to query a Pandas dataframe like this:

        inv = pd.read_csv(infile)
        inv.columns = ['County','Site','Role','Hostname'] 
        clist = inv.County.unique() # Get list of counties
        for county in clist: # for each county
            csub=inv.query('County == county') # create a county subset
            # ... do stuff on subset

But I get an error:

pandas.core.computation.ops.UndefinedVariableError: name 'county' is not defined

I’m sure it’s a trivial error, but I can’t figure it out. How do I pass a variable to the query method?

Asked By: Ron Trunk


Answers:

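Inside a query string, a bare name such as county is looked up as a column (or index) name, which is why the error says it is not defined. According to the documentation, you can reference a Python variable by prefixing it with @: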

csub = inv.query('County == @county')
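
As a minimal sketch of how this fits into the loop from the question: the data below is a made-up stand-in for the CSV, and split_by_county is a hypothetical helper name, not something from pandas.

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV file in the question
inv = pd.DataFrame({
    "County": ["Adams", "Adams", "Baker"],
    "Site": ["S1", "S2", "S3"],
    "Role": ["core", "edge", "core"],
    "Hostname": ["h1", "h2", "h3"],
})

def split_by_county(frame):
    """Return a dict mapping each county to its subset of rows."""
    subsets = {}
    for county in frame.County.unique():
        # @county refers to the local Python variable, not to a column
        subsets[county] = frame.query("County == @county")
    return subsets

subsets = split_by_county(inv)
for name, sub in subsets.items():
    print(name, len(sub))
```

Because the value is passed as a variable, no quoting is needed even though the county names are strings.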
Answered By: Philip Ciunkiewicz

Format String Function

I found another (more generic) solution that might be interesting: Python's str.format method (for examples, see 6.1.3.2. Format examples).

xyz = df.query('ColumnName >= {}'.format(VariableName))

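The {} is replaced by the value of VariableName. This works as written for numeric values; a string value must be wrapped in quotes inside the query, for example 'ColumnName == "{}"'.format(VariableName).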

f-Strings

In addition, user pciunkiewicz mentioned in a comment another solution using so-called f-strings, which were introduced in Python 3.6 (released December 2016):

xyz = df.query(f'ColumnName >= {VariableName}')
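
One caveat with string interpolation is that the value is pasted into the query text, so a string value needs quotes around it. Below is a small sketch with made-up data (the column names Hosts and County and the variable names are assumptions for illustration); the !r conversion inserts repr(value), which adds the quotes.

```python
import pandas as pd

# Hypothetical sample data for illustration only
df = pd.DataFrame({"County": ["Adams", "Baker"], "Hosts": [12, 7]})

min_hosts = 10
county = "Baker"

# Numeric values can be interpolated directly: "Hosts >= 10"
by_number = df.query(f"Hosts >= {min_hosts}")

# String values need quotes; !r inserts repr(county), giving "County == 'Baker'"
by_fstring = df.query(f"County == {county!r}")
by_format = df.query("County == {!r}".format(county))

print(by_number)
print(by_fstring)
```

Without the quotes, the query text would read County == Baker, and Baker would be looked up as a name rather than compared as a string. The @ syntax avoids the issue, since the variable is passed by reference instead of being pasted into the text.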

A more general f-string example:

>>> name = "Eric"
>>> age = 74
>>> f"Hello, {name}. You are {age}."
'Hello, Eric. You are 74.'

PS: I am new to Python.

Answered By: Dr. Manuel Kuehner