What is the inverse operation of np.log() and np.diff()?

Question:

I have used the statement dataTrain = np.log(mdataTrain).diff() in my program. I want to reverse the effects of the statement. How can it be done in Python?

Asked By: Sushodhan


Answers:

The reverse will involve taking the cumulative sum and then the exponential. Since pd.Series.diff loses information, namely the first value in the series, you will need to store and reuse that value:

import numpy as np
import pandas as pd

np.random.seed(0)

s = pd.Series(np.random.random(10))

print(s.values)

# [ 0.5488135   0.71518937  0.60276338  0.54488318  0.4236548   0.64589411
#   0.43758721  0.891773    0.96366276  0.38344152]

t = np.log(s).diff()
t.iat[0] = np.log(s.iat[0])
res = np.exp(t.cumsum())

print(res.values)

# [ 0.5488135   0.71518937  0.60276338  0.54488318  0.4236548   0.64589411
#   0.43758721  0.891773    0.96366276  0.38344152]
Answered By: jpp

Pandas .diff() and .cumsum() are easy ways to perform finite-difference calculations. Note that .diff() defaults to .diff(1), so the first element of the resulting Series or DataFrame is NaN; with .diff(-1) it is the last element that becomes NaN.
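As a minimal sketch of where each variant places its NaN (the sample values here are arbitrary):

import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, 4.0, 8.0])

fwd = s.diff()    # same as s.diff(1): s[i] - s[i-1], NaN at the start
bwd = s.diff(-1)  # s[i] - s[i+1], NaN at the end

print(fwd.values)  # [nan  1.  2.  4.]
print(bwd.values)  # [-1. -2. -4. nan]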

import numpy as np
import pandas as pd

x = pd.Series(np.linspace(.1, 2, 100)) # uniformly spaced x = mdataTrain
y = np.log(x)              # the logarithm function
dx = x.diff()              # x finite differences - constant spacing
dy = y.diff()              # y finite differences
dy_dx_apprx = dy/dx        # approximate derivative of the logarithm function
dy_dx = 1/x                # exact derivative of the logarithm function
cs_dy = dy.cumsum() + y[0] # approximate "integral" of the approximate "derivative" of y, adding back the constant y[0] to reconstruct y
x_invrtd = np.exp(cs_dy)   # inverting the log function with exp
rx = x - x_invrtd          # residuals due to the computation process
abs(rx).sum()

The sum of absolute residuals, abs(x - x').sum(), comes to about 2.0e-14, roughly two orders of magnitude above the double-precision machine epsilon (about 2.2e-16), for the inversion process described below:

x -> exp(cumsum(diff(log(x))) + log(x[0])) -> x'

The finite difference of log(x) can also be compared with its exact derivative, 1/x.
There will be a significant error when the discretization of x is coarse: only 100 points between 0.1 and 2.
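That error shrinks as the grid is refined. A short sketch (the grid sizes 100 and 10000 are chosen just for illustration) comparing the worst-case deviation of the finite-difference slope from 1/x at two resolutions:

import numpy as np
import pandas as pd

for n in (100, 10000):
    x = pd.Series(np.linspace(0.1, 2, n))
    dy_dx_apprx = np.log(x).diff() / x.diff()  # finite-difference slope of log(x)
    err = (dy_dx_apprx - 1 / x).abs().max()    # worst-case deviation from the exact derivative
    print(n, err)

With 100 points the error near x = 0.1 is of order 1; with 10000 points it drops by roughly two orders of magnitude, as expected for a first-order forward difference.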

Answered By: ePuntel