What is the difference between opening a mask image with PIL and with cv2?
Question:
Say I am opening an image file that is the mask for a specific 3-channel RGB image.
Above is the mask image (msk.png) I’m trying to use in my segmentation model.
When I open this file with the PIL library in Python:
img1 = Image.open('msk.png')
and then convert img1 to a numpy array and print it, I get this array:
[[2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 4 4 4 4 4 4 4 4
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 0
0 0 0 0 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
... (truncated)
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]]
which is what I want for my multiclass segmentation task (it is a single-channel image, and the number of classes to be classified is 24).
I expected to get the same array when the mask is read by cv2.imread. However, for img2, read as:
img2 = cv2.imread('msk.png')
it shows a 3-channel output, like below:
[[[ 0 252 124]
[ 0 252 124]
[ 0 252 124]
[ 0 252 124]
[ 0 252 124]
[ 0 252 124]
[ 0 252 124]
[ 0 252 124]
[ 0 252 124]
[ 0 252 124]
[ 0 252 124]
[ 0 252 124]
[ 0 252 124]
[ 0 252 124]
[ 0 252 124]
[ 0 252 124]
[ 0 252 124]
[ 0 252 124]
[ 0 252 124]
[ 0 252 124]
[ 0 252 124]
[ 0 252 124]
[ 0 252 124]
[ 0 252 124]
...(truncated)
[182 38 155]
[182 38 155]
[182 38 155]
[182 38 155]]]
Why are the two outputs different from each other? Furthermore, how can I load the image so that it looks exactly like img1, with an individual class label for each pixel of the grayscale image?
Answers:
Your image is (sensibly and understandably) a palette image because it has a limited number of colours (classes) and as such a palette/indexed image is an efficient way of storing it.
You can see that with exiftool:
exiftool U3nGE.png
ExifTool Version Number : 12.50
File Name : U3nGE.png
Directory : .
File Size : 12 kB
File Modification Date/Time : 2023:03:22 13:47:33+00:00
File Access Date/Time : 2023:03:22 13:47:34+00:00
...
...
Image Width : 960
Image Height : 736
Bit Depth : 4
Color Type : Palette <--- HERE
Compression : Deflate/Inflate
...
...
Image Size : 960x736
Megapixels : 0.707
OpenCV is for computer vision, and no cameras produce palette images, so it doesn’t give you access to the indices and the palette – it just looks up the RGB values through the palette. I think you are more or less obliged to use PIL/Pillow or another library.
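If all you need is the label array, Pillow gives it to you directly, because mode "P" (palette) images keep their indices when converted to NumPy. A minimal sketch, using a small hypothetical palette image in place of msk.png:

```python
from PIL import Image
import numpy as np

# A hypothetical 4x4 "P" mode mask standing in for msk.png
img = Image.new('P', (4, 4))
img.putpalette([155, 38, 182,   # palette entry 0
                14, 135, 204,   # palette entry 1
                124, 252, 0])   # palette entry 2
img.putpixel((0, 0), 2)         # write class index 2 at the top-left

labels = np.array(img)          # indices, not RGB triplets
print(labels.shape)             # (4, 4) - single channel
print(labels[0, 0])             # 2
```

For the real mask, `np.array(Image.open('msk.png'))` is all that is needed.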
Note that you can find the 4 unique colours in your image like this:
import numpy as np
import cv2
im = cv2.imread('U3nGE.png')
colours, counts = np.unique(im.reshape(-1,3), axis=0, return_counts=True)
print(colours, counts)
which yields these 4 BGR colours:
array([[ 0, 252, 124],
[147, 20, 255],
[169, 169, 169],
[182, 38, 155]], dtype=uint8)
and their corresponding counts (or frequency of occurrence):
array([362286, 33747, 236590, 73937])
but you have lost the correlation between the colours and the class indices. If you know the number of pixels of each class I guess you could re-associate the colours with the classes.
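If you do have the palette (recoverable with Pillow, as shown further down), one way to re-associate colours with classes is to match each BGR pixel against the palette rows. A sketch with hypothetical toy data, using the palette entries from this image:

```python
import numpy as np

# Palette in RGB order, as PIL's getpalette() would give it
palette_rgb = np.array([[155, 38, 182],
                        [14, 135, 204],
                        [124, 252, 0],
                        [255, 20, 147]], dtype=np.uint8)
palette_bgr = palette_rgb[:, ::-1]          # flip to OpenCV's channel order

# A tiny 1x2 BGR "image" such as cv2.imread would produce
bgr = np.array([[[0, 252, 124], [182, 38, 155]]], dtype=np.uint8)

# For each pixel, find the palette row it matches
matches = (bgr[:, :, None, :] == palette_bgr[None, None, :, :]).all(axis=-1)
labels = matches.argmax(axis=-1)
print(labels)                               # [[2 0]]
```

This assumes every pixel matches a palette entry exactly, which holds for a losslessly stored PNG mask.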
If you use PIL/Pillow you can get the palette like this:
from PIL import Image
import numpy as np
im = Image.open('U3nGE.png')
palette = np.array(im.getpalette(),dtype=np.uint8).reshape((-1,3))
print(palette)
which yields the same colours we saw with OpenCV above – except that the palette is padded with greys and is in RGB order, unlike OpenCV’s BGR order:
array([[155, 38, 182], # palette entry 0
[ 14, 135, 204], # palette entry 1
[124, 252, 0], # palette entry 2
[255, 20, 147], # palette entry 3
[169, 169, 169], # palette entry 4
[ 5, 5, 5], # grey padding
[ 6, 6, 6], # grey padding
[ 7, 7, 7],
[ 8, 8, 8],
[ 9, 9, 9],
[ 10, 10, 10],
[ 11, 11, 11],
[ 12, 12, 12],
[ 13, 13, 13],
[ 14, 14, 14],
[ 15, 15, 15]], dtype=uint8)
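As a sanity check, indexing the palette with the label array and flipping the channel order should reproduce exactly what cv2.imread showed. A sketch with a tiny hypothetical "P" image in place of msk.png:

```python
from PIL import Image
import numpy as np

# Hypothetical 2x1 "P" mode mask standing in for msk.png
img = Image.new('P', (2, 1))
img.putpalette([155, 38, 182,   # entry 0
                124, 252, 0])   # entry 1
img.putpixel((0, 0), 1)         # class 1 -> RGB (124, 252, 0)

labels = np.array(img)
palette = np.array(img.getpalette(), dtype=np.uint8).reshape(-1, 3)

rgb = palette[labels]           # what the mask looks like in colour
bgr = rgb[:, :, ::-1]           # channel order cv2.imread would use
print(bgr[0, 0])                # [  0 252 124]
```

The top-left pixel comes out as BGR [0, 252, 124], matching the first rows of the cv2.imread output above.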