# Where can I find mad (mean absolute deviation) in scipy?

## Question:

It seems scipy once provided a function `mad` to calculate the mean absolute deviation for a set of numbers:

http://projects.scipy.org/scipy/browser/trunk/scipy/stats/models/utils.py?rev=3473

However, I cannot find it anywhere in current versions of scipy. Of course it is possible to just copy the old code from the repository, but I prefer to use scipy’s version. Where can I find it, or has it been replaced or removed?

It looks like `scipy.stats.models` was removed in August 2008 due to insufficient baking. Development has migrated to `statsmodels`.

It’s not the scipy version, but here’s an implementation of the MAD using masked arrays to ignore bad values:

Edit 2: There’s also a version in astropy here.

[EDIT] Since this keeps on getting downvoted: I know that median absolute deviation is a more commonly-used statistic, but the questioner asked for mean absolute deviation, and here’s how to do it:

``````
from numpy import mean, absolute

def mad(data, axis=None):
    # Mean absolute deviation around the mean
    return mean(absolute(data - mean(data, axis)), axis)
``````
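The expression above subtracts `mean(data, axis)` directly, which only broadcasts cleanly for 1-D input, `axis=None`, or `axis=0`. Here is a minimal sketch that should work along any axis by keeping the reduced dimension; the name `mean_abs_dev` is made up for illustration and is not a scipy or numpy function:

```python
import numpy as np

def mean_abs_dev(data, axis=None):
    """Mean absolute deviation around the mean along `axis`."""
    data = np.asarray(data, dtype=float)
    # keepdims=True leaves the reduced axis with length 1, so the
    # subtraction broadcasts correctly whichever axis is chosen.
    center = np.mean(data, axis=axis, keepdims=True)
    return np.mean(np.abs(data - center), axis=axis)

x = np.array([[1.0, 2.0, 6.0],
              [2.0, 4.0, 6.0]])
row_mad = mean_abs_dev(x, axis=1)  # one value per row
col_mad = mean_abs_dev(x, axis=0)  # one value per column
all_mad = mean_abs_dev(x)          # single value over all elements
```

For the 2x3 array here, `row_mad` has two entries and `col_mad` has three.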

I’m using:

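``````
from math import fabs

a = [1, 1, 2, 2, 4, 6, 9]

# Middle element of the sorted list (the median when len(a) is odd)
median = sorted(a)[len(a)//2]

# The loop body is missing from the post; a plausible completion
# collects each absolute deviation and takes the middle one.
deviations = []
for b in a:
    deviations.append(fabs(b - median))
mad = sorted(deviations)[len(deviations)//2]
``````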
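Since “MAD” gets used for both the mean and the median absolute deviation in this thread, a quick standard-library comparison on the same list `a` may help keep them apart. This is only a sketch using the `statistics` module; the variable names are my own:

```python
from statistics import mean, median

a = [1, 1, 2, 2, 4, 6, 9]

# Mean absolute deviation: average distance from the mean
m = mean(a)
mean_ad = mean(abs(x - m) for x in a)        # about 2.367

# Median absolute deviation: median distance from the median
med = median(a)
median_ad = median(abs(x - med) for x in a)  # 1
```

The two numbers differ quite a bit here because the median-based version is far less sensitive to the large value 9.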

For what it’s worth, I use this for MAD:

``````
import numpy as np

def mad(arr):
    """ Median Absolute Deviation: a "Robust" version of standard deviation.
        Indicates variability of the sample.
        https://en.wikipedia.org/wiki/Median_absolute_deviation
    """
    arr = np.ma.array(arr).compressed()  # drops masked values; should be faster to not use masked arrays.
    med = np.median(arr)
    return np.median(np.abs(arr - med))
``````
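For the median variant specifically, newer SciPy releases (1.5.0 and later, as far as I know) ship `scipy.stats.median_abs_deviation`, so a hand-rolled version may not be needed. A small sketch, assuming a recent SciPy is installed:

```python
import numpy as np
from scipy import stats

a = np.array([1, 1, 2, 2, 4, 6, 9], dtype=float)

# Plain median absolute deviation (the default scale is 1.0)
raw = stats.median_abs_deviation(a)

# scale="normal" divides by roughly 0.6745 so the result estimates
# the standard deviation of normally distributed data
scaled = stats.median_abs_deviation(a, scale="normal")
```

Note that this is still the median-based statistic; I am not aware of a mean absolute deviation function in SciPy.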

I’m just learning Python and Numpy, but here is the code I wrote to check my 7th grader’s math homework which wanted the M(ean)AD of 2 sets of numbers:

Data in Numpy matrix rows:

``````
>>> import numpy as np
>>> a = np.matrix( [ [ 80, 76, 77, 78, 79, 81, 76, 77, 79, 84, 75, 79, 76, 78 ],
...                  [ 66, 69, 76, 72, 79, 77, 74, 77, 71, 79, 74, 66, 67, 73 ] ], dtype=float )
>>> matMad = np.mean( np.abs( np.tile( np.mean( a, axis=1 ), ( 1, a.shape[1] ) ) - a ), axis=1 )
>>> matMad
matrix([[ 1.81632653],
        [ 3.73469388]])
``````

Data in Numpy 1D arrays:

``````
>>> a1 = np.array( [ 80, 76, 77, 78, 79, 81, 76, 77, 79, 84, 75, 79, 76, 78 ], dtype=float )
>>> a2 = np.array( [ 66, 69, 76, 72, 79, 77, 74, 77, 71, 79, 74, 66, 67, 73 ], dtype=float )
>>> madA1 = np.mean( np.abs( np.tile( np.mean( a1 ), ( 1, len( a1 ) ) ) - a1 ) )
>>> madA2 = np.mean( np.abs( np.tile( np.mean( a2 ), ( 1, len( a2 ) ) ) - a2 ) )
>>> madA1, madA2
(1.816326530612244, 3.7346938775510199)
``````

The current version of statsmodels has `mad` in `statsmodels.robust`:

``````
>>> import numpy as np
>>> from statsmodels import robust
>>> a = np.matrix( [
...     [ 80, 76, 77, 78, 79, 81, 76, 77, 79, 84, 75, 79, 76, 78 ],
...     [ 66, 69, 76, 72, 79, 77, 74, 77, 71, 79, 74, 66, 67, 73 ]
...  ], dtype=float )
>>> robust.mad(a, axis=1)
array([ 2.22390333,  5.18910776])
``````

Note that by default this computes the robust estimate of the standard deviation assuming a normal distribution, by dividing the median absolute deviation by a scaling factor; from `help`:

``````
Signature: robust.mad(a,
                      c=0.67448975019608171,
                      axis=0,
                      center=<function median at 0x10ba6e5f0>)
``````

The version in `R` makes a similar normalization. If you don’t want this, obviously just set `c=1`.
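The default `c` is the 0.75 quantile of the standard normal distribution (about 0.6745). As a rough check, the statsmodels numbers above can be reproduced with plain NumPy, assuming `robust.mad` divides the median absolute deviation by `c` as its signature suggests; `raw_mad` and `scaled_mad` are just names picked for this sketch:

```python
import numpy as np
from statistics import NormalDist

a = np.array([
    [80, 76, 77, 78, 79, 81, 76, 77, 79, 84, 75, 79, 76, 78],
    [66, 69, 76, 72, 79, 77, 74, 77, 71, 79, 74, 66, 67, 73],
], dtype=float)

# 0.75 quantile of the standard normal, about 0.6745
c = NormalDist().inv_cdf(0.75)

# Row-wise median absolute deviation; keepdims lets the medians broadcast
med = np.median(a, axis=1, keepdims=True)
raw_mad = np.median(np.abs(a - med), axis=1)  # what c=1 would give
scaled_mad = raw_mad / c                      # normal-consistent estimate
```

With this data `raw_mad` comes out as 1.5 and 3.5, and dividing by `c` gives the 2.2239 and 5.1891 shown earlier.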

(An earlier comment mentioned this being in `statsmodels.robust.scale`. The implementation is in `statsmodels/robust/scale.py` (see github) but the `robust` package does not export `scale`, rather it exports the public functions in `scale.py` explicitly.)

If you enjoy working in Pandas (like I do), it has a useful function for the mean absolute deviation:

``````
import pandas as pd
df = pd.DataFrame()
df['a'] = [1, 1, 2, 2, 4, 6, 9]
# Series.mad() was deprecated in pandas 1.5 and removed in 2.0
df['a'].mad()
``````

Output: 2.3673469387755106
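One caveat: `Series.mad()` was deprecated in pandas 1.5 and removed in pandas 2.0, so on a current install the same number has to be computed by hand. A minimal sketch of the equivalent expression:

```python
import pandas as pd

s = pd.Series([1, 1, 2, 2, 4, 6, 9])

# Mean absolute deviation around the mean, written out explicitly
mean_ad = (s - s.mean()).abs().mean()
```

This should give the same value as the output quoted above.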

Using `numpy` only:

``````
import numpy as np

def meanDeviation(numpyArray):
    mean = np.mean(numpyArray)
    f = lambda x: abs(x - mean)
    vf = np.vectorize(f)