Why does Python Pillow swap x and y axis

Question:

I tried this example from GeeksForGeeks where you create a blue picture, and I wanted to take it a step further by changing a single pixel from blue to red. I did that successfully, but I noticed that the position of the red pixel is reversed: instead of [1,4], the pixel at [4,1] turned red. I noticed the same x/y swap with the function Image.frombytes. I tried reading the PixelAccess class documentation but haven't found anything. I am using Python 3.10.6 and PIL 9.2.0, the latest version, which makes this post not relevant.

The easiest solution is to swap x and y in the code, but I can't find a reason why they are swapped.

from PIL import Image

input = Image.new(mode="RGB", size=(10, 10), color="blue")
input.save("input", format="png")

pixel_map = input.load()
pixel_map[1, 4] = (255, 0, 0)
input.save(r"path\example.png", format="png")  # raw string so the backslash is not treated as an escape

edit:
I have added a thick red line in the middle. Given this code, the line should be vertical, but it comes out horizontal:

# this code goes in place of the line:  pixel_map[1,4] = (255,0,0)

for i in range(10):
    for j in range(10):
        if j == 4 or j == 5:
            pixel_map[i, j] = (255, 0, 0)
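The swap can be seen directly in this loop: conditioning on the second index paints rows, while conditioning on the first index paints columns. A minimal, self-contained sketch (a fresh 10×10 blue image, not the exact variables above) that produces the vertical line:

```python
from PIL import Image

img = Image.new(mode="RGB", size=(10, 10), color="blue")
pixel_map = img.load()

for x in range(10):
    for y in range(10):
        if x == 4 or x == 5:              # condition on the first index -> vertical stripe
            pixel_map[x, y] = (255, 0, 0)

# columns 4 and 5 are now red from top to bottom
assert img.getpixel((4, 0)) == (255, 0, 0)
assert img.getpixel((5, 9)) == (255, 0, 0)
assert img.getpixel((0, 4)) == (0, 0, 255)  # row 4 elsewhere is still blue
```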
Asked By: ForthRider


Answers:

Summary of my comments:

It is pretty standard to access digital images via [x, y] coordinates, as opposed to [y, x] or [y][x]. In mathematics, arrays are usually indexed by row and then column, but with images the width component conventionally comes first – hence resolutions like "1920×1080", which is the X value and then the Y value. Just as on a Cartesian coordinate plane, X comes first and refers to the horizontal component, while Y comes second and refers to the vertical component. So images tend to be treated more like a coordinate system than a matrix, at least once a layer of abstraction like PIL's is added. This can be confusing for those who are used to how 2D arrays are typically indexed.
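A quick demonstration of that (x, y) convention in Pillow itself – setting a pixel through the access object and reading it back with `getpixel`, both of which take (x, y):

```python
from PIL import Image

# 10x10 blue image, as in the question
img = Image.new(mode="RGB", size=(10, 10), color="blue")
pixels = img.load()

# first index is x (horizontal), second is y (vertical)
pixels[1, 4] = (255, 0, 0)

# getpixel uses the same (x, y) order, so the red pixel reads back at (1, 4)
assert img.getpixel((1, 4)) == (255, 0, 0)
```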

Here is a post which does a great job explaining why there’s different coordinate systems. It’s far more detailed and well-researched than what I’m capable of coming up with right now.

Like I said, I think there's just some understandable confusion in transitioning from thinking of the first index as the row and the second as the column to digital images, where it's usually the other way around. In the end, the order is determined by the tool you are using: some tools use pixel coordinates (x, y) while others use matrix coordinates (y, x). Matrix coordinates are indeed how images are usually stored internally, but the (x, y) order is a layer of "convenience" that is sometimes added on top. This thread has some related discussion: why should I use (y,x) instead of (x,y) to access a pixel in opencv?
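The two conventions can be compared side by side by viewing the same Pillow image as a NumPy array (a sketch, assuming NumPy is installed – it is not required by the question's code). NumPy uses matrix coordinates, so the indices come out swapped:

```python
import numpy as np
from PIL import Image

img = Image.new(mode="RGB", size=(10, 10), color="blue")
img.load()[1, 4] = (255, 0, 0)    # Pillow pixel coordinates: (x, y)

arr = np.asarray(img)             # NumPy view uses matrix coordinates: [row, col] = [y, x]
assert arr.shape == (10, 10, 3)   # (height, width, channels) -> rows come first
assert arr[4, 1].tolist() == [255, 0, 0]  # same pixel, indices swapped
```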

If you look at the Pillow source code that actually gets the pixel when you index the loaded image data, you'll see that it indexes self.pixels[y][x]. So internally the pixels are stored as [y][x], the way you expect; the PixelAccess object simply chose to expose them in (x, y) order and swaps the indices for you. As far as I know you do not have access to this internal representation – it's just an implementation detail.
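That row-major internal layout also explains the Image.frombytes behavior the question mentions: raw bytes are consumed one full row at a time (y outer, x inner), so the byte at offset y*width + x becomes the pixel at (x, y). A small sketch using single-channel "L" mode so each byte is one pixel:

```python
from PIL import Image

w, h = 4, 2
data = bytes(range(w * h))          # bytes 0..7, one byte per pixel in "L" mode
img = Image.frombytes("L", (w, h), data)

# the byte at offset y*w + x lands at pixel (x, y)
assert img.getpixel((3, 0)) == 3    # x=3, y=0 -> offset 0*4 + 3 = 3
assert img.getpixel((0, 1)) == 4    # x=0, y=1 -> offset 1*4 + 0 = 4
```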

Answered By: Random Davis