Formatting Multiple Columns in a Pandas Dataframe
Question:
I have a dataframe I’m working with that has a large number of columns, and I’m trying to format them as efficiently as possible. I have a bunch of columns that all end in .pct that need to be formatted as percentages, some that end in .cost that need to be formatted as currency, etc.
I know I can do something like this:
cost_calc.style.format({'c.somecolumn.cost' : "${:,.2f}",
'c.somecolumn.cost' : "${:,.2f}",
'e.somecolumn.cost' : "${:,.2f}",
'e.somecolumn.cost' : "${:,.2f}",...
and format each column individually, but I was hoping there was a way to do something similar to this:
cost_calc.style.format({'*.cost' : "${:,.2f}",
'*.pct' : "{:,.2%}",...
Any ideas? Thanks!
Answers:
The first way doesn’t seem bad if you can automatically build that dictionary… you can generate a list of all columns fitting the *.cost description with something like
costcols = [x for x in df.columns.values if x[-5:] == '.cost']
then build your dict like:
formatdict = {}
for costcol in costcols: formatdict[costcol] = "${:,.2f}"
then as you suggested:
cost_calc.style.format(formatdict)
You can easily add the .pct cases similarly. Hope this helps!
I would use regEx with dict generators:
import re
mylist = cost_calc.columns
r = re.compile(r'.*cost')
cost_cols = {key: "${:,.2f}" for key in mylist if r.match(key)}
r = re.compile(r'.*pct')
pct_cols = {key: "${:,.2f}" for key in mylist if r.match(key)}
cost_calc.style.format({**cost_cols, **pct_cols})
note: code for Python 2.7 and 3 onwards
import re
mylist = cost_calc.columns
r = re.compile(r'.*cost')
cost_cols = {key: (lambda x: f'{locale.format_string("%.2f", x, True)} €') for key in mylist if r.match(key)}
r = re.compile(r'.*pct')
pct_cols = {key: "{:.2%}" for key in mylist if r.match(key)}
note: version for euro
I have a dataframe I’m working with that has a large number of columns, and I’m trying to format them as efficiently as possible. I have a bunch of columns that all end in .pct that need to be formatted as percentages, some that end in .cost that need to be formatted as currency, etc.
I know I can do something like this:
cost_calc.style.format({'c.somecolumn.cost' : "${:,.2f}",
'c.somecolumn.cost' : "${:,.2f}",
'e.somecolumn.cost' : "${:,.2f}",
'e.somecolumn.cost' : "${:,.2f}",...
and format each column individually, but I was hoping there was a way to do something similar to this:
cost_calc.style.format({'*.cost' : "${:,.2f}",
'*.pct' : "{:,.2%}",...
Any ideas? Thanks!
The first way doesn’t seem bad if you can automatically build that dictionary… you can generate a list of all columns fitting the *.cost description with something like
costcols = [x for x in df.columns.values if x[-5:] == '.cost']
then build your dict like:
formatdict = {}
for costcol in costcols: formatdict[costcol] = "${:,.2f}"
then as you suggested:
cost_calc.style.format(formatdict)
You can easily add the .pct cases similarly. Hope this helps!
I would use regEx with dict generators:
import re
mylist = cost_calc.columns
r = re.compile(r'.*cost')
cost_cols = {key: "${:,.2f}" for key in mylist if r.match(key)}
r = re.compile(r'.*pct')
pct_cols = {key: "${:,.2f}" for key in mylist if r.match(key)}
cost_calc.style.format({**cost_cols, **pct_cols})
note: code for Python 2.7 and 3 onwards
import re
mylist = cost_calc.columns
r = re.compile(r'.*cost')
cost_cols = {key: (lambda x: f'{locale.format_string("%.2f", x, True)} €') for key in mylist if r.match(key)}
r = re.compile(r'.*pct')
pct_cols = {key: "{:.2%}" for key in mylist if r.match(key)}
note: version for euro