Calculating probability distribution of an image?

Question:

I want to find the probability distributions of two images so I can calculate the KL divergence between them.

I’m trying to figure out what "probability distribution" means in this sense. I’ve converted my images to grayscale, flattened them to 1-D arrays, and plotted them as histograms with bins=256:

import matplotlib.pyplot as plt

# Flatten the 2-D grayscale arrays into 1-D arrays of pixel values
imageone = imgGray.flatten()   # array([0.64991451, 0.65775765, 0.66560078, ...,
imagetwo = imgGray2.flatten()

plt.hist(imageone, bins=256, label='image one')
plt.hist(imagetwo, bins=256, alpha=0.5, label='image two')
plt.legend(loc='upper left')
plt.show()

My next step is to call the ks_2samp function from scipy.stats to calculate the divergence, but I’m unclear what arguments to use.

A previous answer explained that we should "take the histogram of the image (in grayscale) and then divide the histogram values by the total number of pixels in the image. This will result in the probability to find a gray value in the image."

Ref: Can Kullback-Leibler be applied to compare two images?

But what is meant by "take the histogram values"? How do I "take" these values?

I might be overcomplicating things, but I’m confused by this.

Answers:

The hist function returns 3 values, the first of which is an array of the values (i.e., the counts) in each histogram bin. If you pass the density=True argument to hist, these values will instead be the probability density in each bin, i.e.:

prob1, _, _ = plt.hist(imageone, bins=256, density=True, label='image one')
prob2, _, _ = plt.hist(imagetwo, bins=256, density=True, alpha=0.5, label='image two')
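One caveat: for the two probability vectors to be comparable bin by bin, both histograms must share the same bin edges, and plt.hist computes edges from each sample's own range. Below is a minimal sketch using numpy.histogram with explicit shared edges (assuming the grayscale values lie in [0, 1], as the arrays in the question suggest); it also shows how to "take the histogram values" and divide by the total pixel count, as the quoted answer describes:

import numpy as np

# Shared bin edges: 256 bins spanning [0, 1]. Assumption: pixel values
# are floats in [0, 1]; use e.g. np.linspace(0, 255, 257) for 8-bit images.
edges = np.linspace(0.0, 1.0, 257)

counts1, _ = np.histogram(imageone, bins=edges)
counts2, _ = np.histogram(imagetwo, bins=edges)

# Dividing the bin counts by the total number of pixels turns them
# into the probability of observing each gray level.
prob1 = counts1 / counts1.sum()
prob2 = counts2 / counts2.sum()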

You can then calculate the KL divergence using the scipy entropy function:

from scipy.stats import entropy

entropy(prob1, prob2)
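A couple of practical notes: entropy(pk, qk) normalizes both inputs to sum to 1 and returns the KL divergence D(P || Q), and the result is inf whenever a bin of prob2 is empty while the matching bin of prob1 is not. A common workaround, sketched below, is to add a small epsilon before the call (the 1e-10 here is an arbitrary choice). Also note that ks_2samp, mentioned in the question, performs a Kolmogorov-Smirnov test on the raw samples, which is a different comparison from the KL divergence computed here.

from scipy.stats import entropy

# Empty bins in prob2 make the KL divergence infinite, so pad both
# distributions with a tiny epsilon; entropy() renormalizes internally.
eps = 1e-10
print(entropy(prob1 + eps, prob2 + eps))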
Answered By: Matt Pitkin