cv2 and Pillow give different image shapes

Question:

I have a black-and-white image that I am loading into Python. Depending on whether I use Pillow or cv2, I get different dimensions for the resulting NumPy array. I understand that channel ordering differs between OpenCV (BGR) and Pillow (RGB), but I don't think that's what's going on here.

Is my image really two-dimensional, as Pillow represents it? If so, does OpenCV duplicate the values into a 3D array?

import cv2
from PIL import Image
import numpy as np

path = 'path/to/file.png'

#using pillow
img = Image.open(path)
img.size #(500,500)
img.mode # L
arr = np.array(img)
arr.shape #(500,500)

#using cv2
image = cv2.imread(path)
image.shape #(500,500,3)

Running file on the image in a bash terminal gives:

$ file 'path/to/file.png'
path/to/file.png: PNG image data, 500 x 500, 8-bit grayscale, non-interlaced
Asked By: theastronomist


Answers:

The cv2.imread function takes two arguments: the filename and a flag. The flag defaults to cv2.IMREAD_COLOR (or 1), which returns a 3-channel BGR array even when the file itself is grayscale. Pillow, on the other hand, keeps the mode stored in the file, which for your image is grayscale ('L'). If you want cv2 to load the image the same way, pass the grayscale flag:

image = cv2.imread(path, cv2.IMREAD_GRAYSCALE)

You can also convert your grayscale PIL image to 3 channels:

img = Image.open(path).convert('RGB')
Answered By: Mercury

cv2 needs grayscale to be requested explicitly; by default it decodes every image as 3-channel color (BGR).

>>> import cv2
>>> x = cv2.imread('lena.jpeg', cv2.IMREAD_GRAYSCALE)
>>> x.shape
(512, 512)

Without the flag, the result has three channels, which is a bit odd when the file itself is stored as grayscale:

>>> cv2.imwrite('lena_gray.jpeg', x)  # save as grayscale JPEG
True
>>> x2 = cv2.imread('lena_gray.jpeg')
>>> x2.shape
(512, 512, 3)

cv2 converts grayscale to color by copying each value into all three channels. For example, white (255) becomes (255, 255, 255), black (0) becomes (0, 0, 0), and so on.

>>> cv2.imread('lena_gray.jpeg')
array([[[162, 162, 162],
        [161, 161, 161],
        [160, 160, 160],
        ...
        [102, 102, 102],
        [104, 104, 104],
        [107, 107, 107]]], dtype=uint8)
>>> cv2.imread('lena_gray.jpeg', cv2.IMREAD_GRAYSCALE)
array([[ 162, 161, 160, ... ],
       ...
       [ ... , 102, 104, 107]], dtype=uint8)

When working with grayscale, pay attention to the number of dimensions. Both PIL (via np.array) and cv2 with cv2.IMREAD_GRAYSCALE return a 2D array of shape (height, width); neither adds a third dimension of size one. If your code needs a channel axis so that RGB and grayscale can be handled uniformly, add it yourself, for example with x[..., np.newaxis].

>>> import cv2
>>> import numpy as np
>>> from PIL import Image
>>> y = np.array(Image.open('lena_gray.jpeg'))
>>> y.shape
(512, 512)
>>> x = cv2.imread('lena.jpeg', cv2.IMREAD_GRAYSCALE)
>>> x.shape
(512, 512)
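
For code that wants every image to carry an explicit channel axis, a small helper along these lines is one option. The function name ensure_channel_axis is hypothetical and not part of cv2, Pillow, or NumPy; this is only a sketch using zero-filled arrays in place of real images.

```python
import numpy as np


def ensure_channel_axis(img: np.ndarray) -> np.ndarray:
    """Return img with a trailing channel axis: (H, W) becomes (H, W, 1)."""
    if img.ndim == 2:
        # np.newaxis adds a length-1 axis without copying the pixel data.
        return img[..., np.newaxis]
    return img


gray = np.zeros((512, 512), dtype=np.uint8)      # 2D grayscale array
color = np.zeros((512, 512, 3), dtype=np.uint8)  # 3-channel array

print(ensure_channel_axis(gray).shape)   # (512, 512, 1)
print(ensure_channel_axis(color).shape)  # (512, 512, 3)
```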
Answered By: Martin Benes