QQ-Plot in Python using Plotnine

Question:

I want to plot an array of values against a theoretical distribution using a QQ-Plot in Python. Ideally, I want to create the plot using the library Plotnine.

But when I try to create the plot, I get error messages. Here is my code with example data:

from scipy.stats import beta
from plotnine import *
import statsmodels.api as sm
import numpy as np

n = 207
values = -1 + np.random.beta(n/2-1, n/2-1, 100) * 2 # my data
dist = beta(n/2-1, n/2-1, loc = -1, scale = 2) # theoretical distribution

# Attempt 1:
ggplot(aes(sample = values)) + stat_qq(distribution = dist)
# gives ValueError: Unknown continuous distribution '<scipy.stats._distn_infrastructure.rv_frozen object at 0x0000029755C5C070>'

# Attempt 2:
params = {'a':n/2-1, 'b':n/2-1, 'loc':-1, 'scale':2}
ggplot(aes(sample = values)) + stat_qq(distribution = 'beta', dparams = params)
# gives TypeError: '>' not supported between instances of 'numpy.ndarray' and 'int'

Does anyone know what I’m doing wrong?

When I try to plot using statsmodels, it seems to work fine:

sm.qqplot(values, dist, line = '45')

As always, any help is highly appreciated!

Asked By: RamsesII

Answers:

This is a bug in plotnine. Until it is fixed, you can pass the distribution parameters as a tuple instead of a dict. Be careful, though: a tuple is matched by position, so the values must be given in the order (a, b, loc, scale).
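
To illustrate the positional matching, here is a minimal sketch. It assumes that an affected plotnine version forwards the tuple positionally to scipy's beta.ppf, which is why the order has to be (a, b, loc, scale). The helper name qq_plot_with_tuple is made up for this example, and the tuple form may no longer be accepted once the bug is fixed.

```python
import numpy as np
from scipy.stats import beta

n = 207
a = b = n / 2 - 1

# Positional order used by scipy: shape parameters first, then loc, then scale.
dparams_tuple = (a, b, -1, 2)

# Check the ordering without plotnine: positional arguments should give the
# same quantiles as the frozen distribution from the question.
probs = np.linspace(0.05, 0.95, 19)
positional = beta.ppf(probs, *dparams_tuple)
frozen = beta(a, b, loc=-1, scale=2).ppf(probs)
order_is_correct = bool(np.allclose(positional, frozen))


def qq_plot_with_tuple(values, dparams):
    """Hypothetical helper: QQ plot with tuple dparams (workaround sketch)."""
    # Imported lazily so the ordering check above runs without plotnine.
    import pandas as pd
    from plotnine import ggplot, aes, stat_qq

    df = pd.DataFrame({"values": values})
    return ggplot(df, aes(sample="values")) + stat_qq(
        distribution="beta", dparams=dparams
    )
```

With the data from the question, the call would be qq_plot_with_tuple(values, dparams_tuple). If order_is_correct is False, the tuple entries are in the wrong order.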

Edit

The bug is fixed in the current development version of plotnine and you can use a dict to pass the arguments.
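
With a plotnine version that includes the fix, the dict from the question should work as written. The following is only a sketch: it assumes the dict is forwarded to scipy as keyword arguments (an assumption about how the fix behaves, not something confirmed here), so the keys need to match scipy's parameter names for the beta distribution. qq_plot_with_dict is a hypothetical helper name.

```python
from scipy.stats import beta

n = 207
params = {"a": n / 2 - 1, "b": n / 2 - 1, "loc": -1, "scale": 2}

# Check the keys without plotnine: scipy accepts them as keyword arguments.
# The distribution is a symmetric beta rescaled to [-1, 1], so its median is 0.
median = float(beta.ppf(0.5, **params))


def qq_plot_with_dict(values, dparams):
    """Hypothetical helper: QQ plot with dict dparams (fixed plotnine assumed)."""
    # Imported lazily so the key check above runs without plotnine.
    import pandas as pd
    from plotnine import ggplot, aes, stat_qq

    df = pd.DataFrame({"values": values})
    return ggplot(df, aes(sample="values")) + stat_qq(
        distribution="beta", dparams=dparams
    )
```

With the data from the question, the call would be qq_plot_with_dict(values, params).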

Answered By: starja