Use variable in Pandas query
Question:
I’m trying to query a Pandas dataframe like this:
import pandas as pd

inv = pd.read_csv(infile)
inv.columns = ['County', 'Site', 'Role', 'Hostname']
clist = inv.County.unique()  # Get list of counties
for county in clist:  # for each county
    csub = inv.query('County == county')  # create a county subset
    # ... do stuff on subset
But I get an error:
pandas.core.computation.ops.UndefinedVariableError: name 'county' is not defined
I’m sure it’s a trivial error, but I can’t figure it out. How do I pass a variable to the query method?
Answers:
According to the documentation, you can reference variables from the surrounding Python scope by prefixing them with @:
csub = inv.query('County == @county')
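To make this concrete, here is a minimal sketch of the loop from the question with the @ prefix applied. The small in-memory DataFrame and its values are made up for illustration and stand in for the CSV file:

```python
import pandas as pd

# Hypothetical inventory data standing in for the CSV from the question.
inv = pd.DataFrame({
    'County': ['Kent', 'Kent', 'Essex'],
    'Site': ['A', 'B', 'C'],
    'Role': ['web', 'db', 'web'],
    'Hostname': ['k1', 'k2', 'e1'],
})

subsets = {}
for county in inv['County'].unique():
    # @county makes query() look up the Python variable `county`
    # instead of searching for a column with that name.
    subsets[county] = inv.query('County == @county')

# Number of rows in each county subset
print({name: len(sub) for name, sub in subsets.items()})
```

Without the @, query() treats county as a column name, which is what triggers the UndefinedVariableError.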
Format String Function
I found another (more generic) solution that might be interesting: the str.format() string function (for examples, see 6.1.3.2. Format examples).
xyz = df.query('ColumnName >= {}'.format(VariableName))
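The {} is replaced by the value of VariableName. This works as written for numeric values; a string value has to end up quoted inside the expression, for example by using {!r} instead of {}.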
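A short sketch of the difference between numeric and string values when the expression is built with format. The DataFrame, column names and variable names here are invented for the example:

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({'County': ['Kent', 'Essex'], 'Hosts': [12, 7]})

# Numeric value: plain {} is enough.
min_hosts = 10
numeric_expr = 'Hosts >= {}'.format(min_hosts)
busy = df.query(numeric_expr)

# String value: {!r} inserts repr(county), i.e. the value with quotes,
# so query() sees a string literal rather than a bare name.
county = 'Essex'
string_expr = 'County == {!r}'.format(county)
essex = df.query(string_expr)

print(numeric_expr)
print(string_expr)
```

Note that this approach pastes the value into the expression text, so it is probably best kept for values you control; the @ form passes the variable itself.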
f-Strings
In addition, user pciunkiewicz mentioned in a comment another solution using so-called f-strings, which were introduced in Python 3.6 (December 2016):
xyz = df.query(f'ColumnName >= {VariableName}')
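As with format, a string value needs quotes in the resulting expression. The sketch below compares the two expression strings an f-string can produce; the data and names are assumptions made up for the example:

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({'County': ['Kent', 'Essex'], 'Hosts': [12, 7]})
county = 'Kent'

# Without quoting, Kent would be read by query() as a name, not a string.
unquoted = f'County == {county}'

# {county!r} inserts repr(county), which includes the quotes.
quoted = f'County == {county!r}'

kent = df.query(quoted)

print(unquoted)
print(quoted)
```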
A more general f-string example, taken from here:
>>> name = "Eric"
>>> age = 74
>>> f"Hello, {name}. You are {age}."
'Hello, Eric. You are 74.'
PS: I am new to Python.