Is it possible to round while subtracting 2 columns of a dataframe?
Question:
I am wondering whether it is possible to apply round(2)
while subtracting two DataFrame columns:
DF['D'] = DF['A'] - DF['B']
What is the best way to round this result?
Answers:
DF['D'] = (DF['A'] - DF['B']).round(2) #ThePyGuy solution
#or
df['D'] = (df['A'].to_numpy() - df['B'].to_numpy()).round(2) #very fast
#or
DF['D'] = DF['A'] - DF['B']
DF['D'] = DF['D'].round(2)
#or
DF['D'] = [round(a - b, 2) for a, b in zip(DF['A'], DF['B'])]
#or
df['D'] = [round(a - b, 2) for a, b in zip(df['A'].to_numpy(), df['B'].to_numpy())]
#or
df = df.assign(D = lambda x: round(x.A-x.B,2)) #assign returns a new DataFrame, so keep the result
#or
df.eval("D = A - B", inplace=True)
df['D'] = df["D"].round(2)
#or
df['D'] = df.apply(lambda x: round(x.A - x.B, 2), axis=1)
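To make the first two options concrete, here is a minimal sketch on a tiny frame. The sample values are made up for illustration; only the column names A, B and D come from the question. It checks that rounding the Series and rounding the underlying NumPy arrays give the same numbers.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
df = pd.DataFrame({"A": [1.23456, 5.0, 2.71828],
                   "B": [0.23411, 1.33333, 3.14159]})

# Subtract the two columns, then round the resulting Series to 2 decimals.
df["D"] = (df["A"] - df["B"]).round(2)

# Same computation on the raw NumPy arrays (skips index alignment).
d_numpy = (df["A"].to_numpy() - df["B"].to_numpy()).round(2)

# Both routes should agree element by element.
same = np.allclose(df["D"].to_numpy(), d_numpy)
print(df)
print("identical results:", same)
```

One thing to keep in mind: the to_numpy() variant ignores the index, so it is only equivalent when both columns come from the same frame (or are otherwise aligned row by row).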
Benchmarking:
import numpy as np
import pandas as pd
df = pd.DataFrame({'A':np.random.random(100000),'B':np.random.random(100000)})
%timeit df['D'] = (df['A'].to_numpy() - df['B'].to_numpy()).round(2) #1 / the fastest
# 1.05 ms ± 59.6 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
%timeit df['D'] = (df['A'] - df['B']).round(2) #2 / the second fastest
# 1.49 ms ± 634 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
%timeit df['D'] = df['A'] - df['B']; df['D'] = df['D'].round(2)
# 2.02 ms ± 843 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit df.eval("D = A - B", inplace=True); df['D'] = df["D"].round(2) #3
# 3.7 ms ± 184 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit df.assign(D = lambda x: round(x.A-x.B,2)) #4
# 7.43 ms ± 727 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit df['D'] = [round(a - b, 2) for a, b in zip(df['A'], df['B'])] #5
# 145 ms ± 59.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit df['D'] = [round(a - b, 2) for a, b in zip(df['A'].to_numpy(), df['B'].to_numpy())] #6
# 611 ms ± 44 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit df['D'] = df.apply(lambda x: round(x.A - x.B, 2), axis=1)
# 3.01 s ± 238 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
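The %timeit lines above only work in IPython/Jupyter. If you want to repeat the comparison in a plain script, something like the following sketch should do, using the standard-library timeit module. The helper name best_ms and the repeat/number settings are my own choices, not part of the original benchmark, and the timings will differ from machine to machine.

```python
import timeit

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 100_000  # same row count as the benchmark above
df = pd.DataFrame({"A": rng.random(n), "B": rng.random(n)})


def series_round(frame):
    # Subtract the Series, then round the result.
    return (frame["A"] - frame["B"]).round(2)


def numpy_round(frame):
    # Subtract the underlying NumPy arrays, then round.
    return (frame["A"].to_numpy() - frame["B"].to_numpy()).round(2)


def best_ms(func, repeat=5, number=20):
    """Best average time per call in milliseconds (hypothetical helper)."""
    times = timeit.repeat(lambda: func(df), repeat=repeat, number=number)
    return min(times) / number * 1000


results = {f.__name__: best_ms(f) for f in (series_round, numpy_round)}
for name, ms in results.items():
    print(f"{name}: {ms:.3f} ms per call")
```

Only the two fastest variants are timed here; the list-comprehension and apply versions can be added the same way by wrapping them in a function.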