# How do I get a lognormal distribution in Python with Mu and Sigma?

## Question:

I have been trying to get the result of a lognormal distribution using SciPy. I already have the Mu and Sigma, so I don’t need to do any other prep work. If I need to be more specific (and I am trying to be, with my limited knowledge of stats), I would say that I am looking for the cumulative distribution function (cdf under SciPy). The problem is that I can’t figure out how to do this with just the mean and standard deviation on a scale of 0-1 (i.e. the answer returned should be something from 0-1). I’m also not sure which method from **dist** I should be using to get the answer. I’ve tried reading the documentation and looking through SO, but the relevant questions (like this and this) didn’t seem to provide the answers I was looking for.

Here is a code sample of what I am working with. Thanks.

```
from scipy.stats import lognorm
stddev = 0.859455801705594
mean = 0.418749176686875
total = 37
dist = lognorm.cdf(total,mean,stddev)
```

**UPDATE:**

So after a bit of work and a little research, I got a little further. But I still am getting the wrong answer. The new code is below. According to R and Excel, the result should be *.7434*, but that’s clearly not what is happening. Is there a logic flaw I am missing?

```
dist = lognorm([1.744],loc=2.0785)
dist.cdf(25) # yields=0.96374596, expected=0.7434
```

**UPDATE 2:**

Working lognorm implementation which yields the correct **0.7434** result.

```
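import math

def lognorm(x, mu=0, sigma=1):
    # CDF of a log-normal variable whose log has mean mu and std dev sigma
    a = (math.log(x) - mu) / math.sqrt(2 * sigma**2)
    p = 0.5 + 0.5 * math.erf(a)
    return p

lognorm(25, 2.0785, 1.744)  # mu=2.0785, sigma=1.744
# > 0.7434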
```

## Answers:

It sounds like you want to instantiate a “frozen” distribution from known parameters. In your example, you could do something like:

```
from scipy.stats import lognorm
stddev = 0.859455801705594
mean = 0.418749176686875
dist=lognorm([stddev],loc=mean)
```

which will give you a lognorm distribution object with the mean and standard deviation you specify. You can then get the pdf or cdf like this:

```
import numpy as np
import pylab as pl
x=np.linspace(0,6,200)
pl.plot(x,dist.pdf(x))
pl.plot(x,dist.cdf(x))
```

Is this what you had in mind?

I know this is a bit late (almost one year!), but I’ve been doing some research on the lognorm function in scipy.stats. A lot of folks seem confused about the input parameters, so I hope to help them out. The example above is almost correct, but I found it strange to set the mean to the location (“loc”) parameter: this signals that the cdf or pdf doesn’t ‘take off’ until the value is greater than the mean. Also, the scale and shape arguments should be exp(μ) and σ, respectively, where μ and σ are the mean and standard deviation of the log of the variate.

Simply put, the arguments are (x, shape, loc, scale), with the parameter definitions below:

- loc: no equivalent; this gets subtracted from your data so that 0 becomes the infimum of the range of the data.
- scale: exp(μ), where μ is the mean of the log of the variate. (When fitting, you’d typically use the sample mean of the log of the data.)
- shape: the standard deviation of the log of the variate.
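As a quick sanity check on this mapping, the sketch below compares SciPy’s `lognorm.cdf` (with shape = σ, loc = 0, scale = exp(μ)) against the normal CDF of (ln x − μ)/σ. The values of `mu`, `sigma` and `x` are borrowed from the question’s second update; the variable names are only for illustration.

```python
import math
from scipy import stats

mu = 2.0785      # mean of ln(X), taken from the question
sigma = 1.744    # standard deviation of ln(X), taken from the question
x = 25

# SciPy parameterization: shape = sigma, loc = 0, scale = exp(mu)
p_scipy = stats.lognorm.cdf(x, sigma, loc=0, scale=math.exp(mu))

# Same probability via the underlying normal distribution of ln(X)
p_normal = stats.norm.cdf((math.log(x) - mu) / sigma)

print(p_scipy, p_normal)  # both should be close to 0.7434
```

If the two numbers agree, the shape/scale mapping above is being applied the way the question expects.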

I went through the same frustration as most people with this function, so I’m sharing my solution. Just be careful because the explanations aren’t very clear without a compendium of resources.

For more information, I found these sources helpful:

- http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.lognorm.html#scipy.stats.lognorm
- https://stats.stackexchange.com/questions/33036/fitting-log-normal-distribution-in-r-vs-scipy

And here is an example, taken from @serv-inc’s answer on this page:

```
import math
from scipy import stats
# standard deviation of normal distribution
sigma = 0.859455801705594
# mean of normal distribution
mu = 0.418749176686875
# hopefully, total is the value where you need the cdf
total = 37
frozen_lognorm = stats.lognorm(s=sigma, scale=math.exp(mu))
frozen_lognorm.cdf(total) # use whatever function and value you need here
```

Even later, but in case it’s helpful to anyone else: I found that Excel’s

```
LOGNORM.DIST(x,Ln(mean),standard_dev,TRUE)
```

provides the same results as Python’s

```
from scipy.stats import lognorm
lognorm.cdf(x,sigma,0,mean)
```

Likewise, Excel’s

```
LOGNORM.DIST(x,Ln(mean),standard_dev,FALSE)
```

seems equivalent to Python’s

```
from scipy.stats import lognorm
lognorm.pdf(x,sigma,0,mean)
```
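To make the correspondence concrete, here is a small sketch that spells the positional arguments out as keywords. It assumes that `mean` in the snippets above is the value whose logarithm Excel receives (so SciPy’s `scale` is `mean` itself), and it reuses the numbers from the question; the closed-form density is included only as a cross-check.

```python
import math
from scipy.stats import lognorm

x = 25
mean = math.exp(2.0785)   # Excel would receive Ln(mean) = 2.0785
sigma = 1.744             # Excel's standard_dev

# Keyword form of lognorm.cdf(x, sigma, 0, mean) and lognorm.pdf(x, sigma, 0, mean)
cdf_value = lognorm.cdf(x, s=sigma, loc=0, scale=mean)
pdf_value = lognorm.pdf(x, s=sigma, loc=0, scale=mean)

# Closed-form log-normal density, for comparison
z = (math.log(x) - math.log(mean)) / sigma
pdf_closed_form = math.exp(-0.5 * z**2) / (x * sigma * math.sqrt(2 * math.pi))

print(cdf_value)                   # close to 0.7434
print(pdf_value, pdf_closed_form)  # the two should agree
```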

@lucas’ answer has the usage down pat. As a code example, you could use

```
import math
from scipy import stats
# standard deviation of normal distribution
sigma = 0.859455801705594
# mean of normal distribution
mu = 0.418749176686875
# hopefully, total is the value where you need the cdf
total = 37
frozen_lognorm = stats.lognorm(s=sigma, scale=math.exp(mu))
frozen_lognorm.cdf(total) # use whatever function and value you need here
```
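Once the distribution is frozen, the other methods work the same way as `cdf`. The sketch below is only an illustration of a few of them (`median`, `ppf`, `mean`), using the same `mu` and `sigma` as above; the checks in the comments follow from the standard log-normal formulas.

```python
import math
from scipy import stats

sigma = 0.859455801705594   # standard deviation of ln(X)
mu = 0.418749176686875      # mean of ln(X)

frozen_lognorm = stats.lognorm(s=sigma, scale=math.exp(mu))

median = frozen_lognorm.median()   # equals exp(mu) for a log-normal
p = frozen_lognorm.cdf(37)         # P(X <= 37)
x_back = frozen_lognorm.ppf(p)     # the inverse CDF recovers 37
mean = frozen_lognorm.mean()       # equals exp(mu + sigma**2 / 2)

print(median, p, x_back, mean)
```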

If you read this and just want a function with behaviour similar to `lnorm` in R, then relieve yourself from violent anger and use NumPy’s `numpy.random.lognormal`.
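Note that `numpy.random.lognormal` draws random samples rather than evaluating a CDF, and it takes the mean and standard deviation of the underlying normal directly. A minimal sketch, assuming the newer `Generator` interface (`numpy.random.default_rng`) and the numbers from the question:

```python
import numpy as np

mu = 2.0785      # mean of ln(X)
sigma = 1.744    # standard deviation of ln(X)

rng = np.random.default_rng(seed=0)  # seed fixed only for reproducibility
samples = rng.lognormal(mean=mu, sigma=sigma, size=200_000)

log_samples = np.log(samples)
log_mean = log_samples.mean()           # should be near mu
log_std = log_samples.std()             # should be near sigma
empirical_cdf = (samples <= 25).mean()  # should be near 0.7434

print(log_mean, log_std, empirical_cdf)
```

The empirical fraction of samples below 25 is only an approximation of the CDF; for exact values, `scipy.stats.lognorm.cdf` is the better tool.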

```
from math import exp
from scipy import stats
def lognorm_cdf(x, mu, sigma):
    shape = sigma
    loc = 0
    scale = exp(mu)
    return stats.lognorm.cdf(x, shape, loc, scale)

x = 25
mu = 2.0785
sigma = 1.744
p = lognorm_cdf(x, mu, sigma)  # yields the expected 0.74341
```

Similar to Excel and R, the **lognorm_cdf** function above parameterizes the CDF for the log-normal distribution using *mu* and *sigma*.

Although SciPy uses *shape*, *loc* and *scale* parameters to characterize its probability distributions, for the log-normal distribution I find it slightly easier to think of these parameters at the variable level rather than at the distribution level. Here’s what I mean…

A log-normal variable *X* is related to a normal variable *Z* as follows:

```
X = exp(mu + sigma * Z) #Equation 1
```

which is the same as:

```
X = exp(mu) * exp(Z)**sigma #Equation 2
```

This can be sneakily re-written as follows:

```
X = exp(mu) * exp(Z-Z0)**sigma #Equation 3
```

where *Z0* = 0. This equation is of the form:

```
f(x) = a * ( (x-x0) ** b ) #Equation 4
```

If you can visualize equations in your head, it should be clear that the scale, shape and location parameters in Equation 4 are *a*, *b* and *x0*, respectively. This means that in Equation 3 the scale, shape and location parameters are *exp(mu)*, *sigma* and zero, respectively.
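For readers who prefer numbers to algebra, the following sketch checks that Equations 1 and 2 give the same *X*, and that mapping *Z* through Equation 1 reproduces SciPy’s log-normal CDF when shape = *sigma*, loc = 0 and scale = *exp(mu)*. The grid of *Z* values is arbitrary and chosen only for illustration.

```python
import numpy as np
from scipy import stats

mu, sigma = 2.0785, 1.744
z = np.linspace(-3, 3, 13)               # a few values of the normal variable Z

x_eq1 = np.exp(mu + sigma * z)            # Equation 1
x_eq2 = np.exp(mu) * np.exp(z) ** sigma   # Equation 2

# X is an increasing function of Z, so P(X <= x(z)) = P(Z <= z)
p_from_z = stats.norm.cdf(z)
p_from_x = stats.lognorm.cdf(x_eq1, sigma, loc=0, scale=np.exp(mu))

print(np.allclose(x_eq1, x_eq2), np.allclose(p_from_z, p_from_x))
```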

If you can’t visualize that very clearly, let’s rewrite Equation 2 as a function:

```
f(Z) = exp(mu) * exp(Z)**sigma #(same as Equation 2)
```

and then look at the effects of *mu* and *sigma* on *f(Z)*. The figure below holds *sigma* constant and varies *mu*. You should see that *mu* vertically scales *f(Z)*. However, it does so in a nonlinear manner; the effect of changing *mu* from 0 to 1 is smaller than the effect of changing *mu* from 1 to 2. From Equation 2 we see that *exp(mu)* is actually the linear scaling factor. Hence SciPy’s “scale” is *exp(mu)*.

The next figure holds *mu* constant and varies *sigma*. You should see that the shape of *f(Z)* changes. That is, *f(Z)* has a constant value when *Z*=0 and *sigma* affects how quickly *f(Z)* curves away from the horizontal axis. Hence SciPy’s “shape” is *sigma*.
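The two effects described above can also be checked numerically. This is a small sketch, not the code behind the original figures; the function name `f` mirrors Equation 2 and the parameter values are arbitrary.

```python
import numpy as np

def f(z, mu, sigma):
    """Equation 2 as a function of the normal variable Z."""
    return np.exp(mu) * np.exp(z) ** sigma

z = np.linspace(-2, 2, 9)

# Varying mu at fixed sigma multiplies the whole curve by a constant factor
ratio = f(z, mu=1.0, sigma=0.5) / f(z, mu=0.0, sigma=0.5)
print(ratio)            # every entry equals exp(1)

# Varying sigma at fixed mu leaves f(0) = exp(mu) untouched
values_at_zero = [f(0.0, mu=1.0, sigma=s) for s in (0.25, 0.5, 1.0, 2.0)]
print(values_at_zero)   # every entry equals exp(1)
```

The constant ratio is the linear scaling by *exp(mu)*, and the fixed value at *Z* = 0 is why *sigma* only changes how the curve bends.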

#### Known mean and stddev of the lognormal distribution

In case someone is looking for it, here is a solution for getting the `scipy.stats.lognorm` distribution if the mean `mu` and standard deviation `sigma` **of the lognormal distribution** are known. In this case we have to calculate the `stats.lognorm` parameters from the known `mu` and `sigma` like so:

```
import numpy as np
from scipy import stats
mu = 10
sigma = 3
a = 1 + (sigma / mu) ** 2
s = np.sqrt(np.log(a))
scale = mu / np.sqrt(a)
```

This was obtained by looking into the implementation of the variance and mean calculations in the `stats.lognorm.stats` method and essentially reversing it (solving for the input).
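For reference, the inversion can be written out as follows. This is a sketch that assumes `loc = 0` and the standard log-normal moment formulas, with μ and σ denoting the mean and standard deviation of the lognormal variable itself (as in the code above) and `s`, `scale`, `a` matching the code’s variable names:

```latex
\begin{aligned}
\mu &= \mathrm{scale}\cdot e^{s^2/2}, \qquad
\sigma^2 = \mathrm{scale}^2 \cdot e^{s^2}\left(e^{s^2}-1\right) \\
\frac{\sigma^2}{\mu^2} &= e^{s^2}-1
\;\Longrightarrow\; a = 1+\left(\frac{\sigma}{\mu}\right)^2 = e^{s^2} \\
s &= \sqrt{\ln a}, \qquad
\mathrm{scale} = \frac{\mu}{e^{s^2/2}} = \frac{\mu}{\sqrt{a}}
\end{aligned}
```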

Then we can initialize the frozen distribution instance

```
distr = stats.lognorm(s, 0, scale)
# generate some randomvals
randomvals = distr.rvs(1_000_000)
# calculate mean and variance using the dedicated method
mu_stats, var_stats = distr.stats("mv")
```

Compare the means and standard deviations from the input, from `randomvals`, and from the analytical solution in `distr.stats`:

```
print(f"""
                 Mean    Std
----------------------------
Input:         {mu:6.2f} {sigma:6.2f}
Randomvals:    {randomvals.mean():6.2f} {randomvals.std():6.2f}
lognorm.stats: {mu_stats:6.2f} {np.sqrt(var_stats):6.2f}
""")
# Output:
#                  Mean    Std
# ----------------------------
# Input:          10.00   3.00
# Randomvals:     10.00   3.00
# lognorm.stats:  10.00   3.00
```

Plot the PDF from `stats.lognorm` and a histogram of the random values:

```
import holoviews as hv
hv.extension('bokeh')
x = np.linspace(0, 30, 301)
counts, _ = np.histogram(randomvals, bins=x)
counts = counts / counts.sum() / (x[1] - x[0])
(hv.Histogram((counts, x))
* hv.Curve((x, distr.pdf(x))).opts(color="r").opts(width=900))
```