What is the inverse operation of np.log() and np.diff()?
Question:
I have used the statement dataTrain = np.log(mdataTrain).diff()
in my program. I want to reverse the effects of the statement. How can it be done in Python?
Answers:
The reverse will involve taking the cumulative sum and then the exponential. Since pd.Series.diff
loses information, namely the first value in a series, you will need to store and reuse this data:
import numpy as np
import pandas as pd

np.random.seed(0)
s = pd.Series(np.random.random(10))
print(s.values)
# [ 0.5488135 0.71518937 0.60276338 0.54488318 0.4236548 0.64589411
# 0.43758721 0.891773 0.96366276 0.38344152]
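t = np.log(s).diff()           # the first element becomes NaN
t.iat[0] = np.log(s.iat[0])    # restore the stored first log value
res = np.exp(t.cumsum())       # cumsum undoes diff, exp undoes log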
print(res.values)
# [ 0.5488135 0.71518937 0.60276338 0.54488318 0.4236548 0.64589411
# 0.43758721 0.891773 0.96366276 0.38344152]
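The same round trip can be wrapped in a pair of helpers so the stored first value travels with the differenced data. This is only a sketch for a pandas Series: the names log_diff and invert_log_diff and the sample prices are made up for illustration and are not part of pandas or NumPy.

```python
import numpy as np
import pandas as pd


def log_diff(s):
    """Return (first log value, log-differenced series)."""
    logged = np.log(s)
    return logged.iloc[0], logged.diff()


def invert_log_diff(first_log, d):
    """Undo log_diff: restore the first value, cumulative-sum, exponentiate."""
    restored = d.copy()
    restored.iloc[0] = first_log  # put back the value that diff() replaced with NaN
    return np.exp(restored.cumsum())


prices = pd.Series([100.0, 101.5, 99.8, 102.3])  # made-up sample data
first_log, d = log_diff(prices)
recovered = invert_log_diff(first_log, d)
roundtrip_ok = np.allclose(recovered, prices)
print(roundtrip_ok)
```

Keeping the first log value outside the differenced series means the NaN stays in place for any code that expects it, and the inverse only needs those two pieces.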
Pandas .diff() and .cumsum() are convenient for finite-difference calculations. By default, .diff() is equivalent to .diff(1), which leaves NaN in the first element of a Series or DataFrame, whereas .diff(-1) leaves NaN in the last element.
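A small illustration of where the NaN lands in each case, using a made-up four-element Series:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 4.0, 9.0, 16.0])

forward = s.diff()      # same as s.diff(1): s[i] - s[i-1]
backward = s.diff(-1)   # s[i] - s[i+1]

print(forward.tolist())   # [nan, 3.0, 5.0, 7.0]
print(backward.tolist())  # [-3.0, -5.0, -7.0, nan]
```

With .diff(1) it is the first value that has to be stored for the inversion; with .diff(-1) it would be the last one.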
import numpy as np
import pandas as pd

x = pd.Series(np.linspace(.1, 2, 100))  # uniformly spaced x, standing in for mdataTrain
y = np.log(x)                   # the logarithm
dx = x.diff()                   # finite differences of x (constant spacing)
dy = y.diff()                   # finite differences of y
dy_dx_apprx = dy / dx           # approximate derivative of log(x)
dy_dx = 1 / x                   # exact derivative of log(x)
cs_dy = dy.cumsum() + y.iat[0]  # cumulative sum plus the stored first value rebuilds y (first entry stays NaN)
x_invrtd = np.exp(cs_dy)        # invert the log with exp
rx = x - x_invrtd               # residuals from floating-point round-off
abs(rx).sum()                   # sum() skips the NaN in the first entry
The summed absolute residual, sum(|x - x'|), comes out at about 2.0e-14, roughly two orders of magnitude above Python's float epsilon (on the order of 1e-16). It is the accumulated round-off of the inversion process:
x -> exp(cumsum(diff(log(x)))) -> x'
The finite difference of log(x) can also be compared with its exact derivative, 1/x.
That comparison shows a noticeable error, because the discretization of x is coarse: only 100 points between 0.1 and 2.
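To see how the grid resolution affects that comparison, here is a rough sketch that measures the largest gap between the difference quotient and 1/x for two grid sizes. The helper name max_derivative_error and the choice of 10,000 points for the finer grid are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def max_derivative_error(n_points):
    """Largest gap between the difference quotient of log(x) and 1/x."""
    x = pd.Series(np.linspace(0.1, 2, n_points))
    approx = np.log(x).diff() / x.diff()  # first entry is NaN
    exact = 1 / x
    return (approx - exact).abs().max()   # max() skips the NaN


err_coarse = max_derivative_error(100)
err_fine = max_derivative_error(10_000)
print(err_coarse, err_fine)
```

The error shrinks as the grid gets finer. This discretization error is separate from the round-off residual of the log/diff/cumsum/exp round trip, which stays tiny regardless of the spacing.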