Python: How to winsorize the Mean and Standard Deviation?
Question:
I have code that I use to compute several metrics (e.g. Value at Risk, Omega, Sortino). The formulas I use for the mean and standard deviation are:
Mean (average):
e = numpy.mean(r)
return numpy.mean(diff) / vol(diff)
Standard deviation:
return numpy.std(returns)
I would like to winsorize the means (and standard deviations) used in my calculations. Can anyone advise how to do it? I found the following function but am not sure how to apply it (or whether it is the right tool):
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mstats.winsorize.html
Thanks
Answers:
For a precise answer, an MCVE is needed.
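Presuming 'e' is a NumPy array:
import numpy as np
from scipy.stats.mstats import winsorize
e = np.random.rand(100)  # 100 uniform random values as a 1-D array
print("{}".format(e))
# Replace the lowest 25% and highest 25% of values with the nearest
# remaining value; inplace=True modifies e directly.
winsorize(e, limits=(0.25, 0.25), inplace=True)
print("{}".format(e))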
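With random input it is hard to see what changed, so here is a small sketch on made-up numbers (the `returns` values are invented for illustration, not taken from the question). It shows that `limits` are fractions of the data clipped at each end, and that without `inplace=True` the input is left untouched:

```python
import numpy as np
from scipy.stats.mstats import winsorize

# Made-up returns with one extreme loss and one extreme gain.
returns = np.array([-0.50, -0.02, -0.01, 0.00, 0.01,
                    0.01, 0.02, 0.02, 0.03, 0.60])

# limits=(0.1, 0.1): with 10 values, the single lowest and the single
# highest value are each replaced by the nearest remaining value.
clipped = winsorize(returns, limits=(0.1, 0.1))

print(np.mean(returns), np.std(returns))   # raw statistics
print(np.mean(clipped), np.std(clipped))   # winsorized statistics
```

Here -0.50 becomes -0.02 and 0.60 becomes 0.03, while `returns` itself keeps its original values. The result is a masked array, which `np.mean` and `np.std` accept directly.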
Try this one:
import os
import numpy as np
from scipy.stats.mstats import winsorize

file_location = input("path to file: ")
dirname = os.path.dirname(file_location)
filename = os.path.basename(file_location)

# Each line of the input file is one comma-separated series of numbers.
# 'win_<name>' receives the adjusted statistics, 'mod_<name>' the winsorized series.
with open(file_location, 'r') as readfile, \
        open(os.path.join(dirname, 'win_' + filename), 'w') as writefile1, \
        open(os.path.join(dirname, 'mod_' + filename), 'w') as writefile2:
    writefile1.write('adj_mean,adj_std\n')
    for idx, line in enumerate(readfile):
        print("Reading line# {}...".format(idx))
        series = np.array([float(x) for x in line.split(',')])
        print("Read {} values...".format(len(series)))
        # Clip the lowest 10% and highest 10% of the series.
        winsorized_series = winsorize(series, limits=[0.10, 0.10])
        print("Writing modified series to file...")
        writefile2.write(','.join(map(str, winsorized_series)) + '\n')
        adj_mean = np.mean(winsorized_series)
        adj_std = np.std(winsorized_series)
        print("adj mean and std dev...")
        writefile1.write("{},{}\n".format(adj_mean, adj_std))
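If every series has the same length and the data fits in memory, the per-line loop can probably be replaced by a single call using the `axis` argument of `winsorize`. This is only a sketch under that assumption; the array `data` below is invented for illustration:

```python
import numpy as np
from scipy.stats.mstats import winsorize

# Two made-up series of equal length, one per row.
data = np.array([[1.0, 2.0, 3.0, 4.0, 100.0],
                 [-50.0, 0.0, 1.0, 2.0, 3.0]])

# axis=1 winsorizes each row independently; with 5 values per row and
# limits of 20%, one value is clipped at each end of every row.
clipped = winsorize(data, limits=(0.2, 0.2), axis=1)

adj_mean = np.mean(clipped, axis=1)  # one winsorized mean per series
adj_std = np.std(clipped, axis=1)    # one winsorized std dev per series
print(adj_mean, adj_std)
```

Rows of unequal length would still need the line-by-line approach.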
import numpy as np
from scipy.stats.mstats import winsorize

def winsorized_mean(x, ql, qr):
    # ql, qr: fractions of the data to clip at the lower and upper ends
    return np.mean(winsorize(x, limits=(ql, qr)))

def winsorized_std(x, ql, qr):
    return np.std(winsorize(x, limits=(ql, qr)))
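To tie this back to the ratio in the question (`numpy.mean(diff) / vol(diff)`), one possible approach is to winsorize the series once and compute both statistics from the clipped values. The function name `winsorized_ratio`, the `benchmark` parameter, and the default limits are hypothetical; the question does not show how `diff` or `vol` are defined, so this assumes `diff` is returns minus a benchmark and `vol` is the standard deviation:

```python
import numpy as np
from scipy.stats.mstats import winsorize

def winsorized_ratio(returns, benchmark=0.0, limits=(0.05, 0.05)):
    """Mean over std dev of winsorized excess returns (hypothetical helper)."""
    # Assumption: diff is the return series minus a constant benchmark.
    diff = np.asarray(returns, dtype=float) - benchmark
    # Clip once, then reuse the same clipped series for both statistics.
    clipped = winsorize(diff, limits=limits)
    return float(np.mean(clipped) / np.std(clipped))

# Made-up returns for illustration only.
sample = [-0.50, -0.02, -0.01, 0.00, 0.01, 0.01, 0.02, 0.02, 0.03, 0.60]
print(winsorized_ratio(sample, limits=(0.1, 0.1)))
```

Note that `np.std` uses the population formula (`ddof=0`) by default; pass `ddof=1` if a sample standard deviation is wanted. Whether to winsorize the raw returns or the differences is a modelling choice the question leaves open.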