How to write a PIL image filter for the plain PGM format?

Question:

How can I write a filter for the Python Imaging Library (PIL) for the plain ASCII PGM format (P2)? The problem is that the basic PIL filter assumes a constant number of bytes per pixel.

My goal is to open feep.pgm with Image.open(). See http://netpbm.sourceforge.net/doc/pgm.html or below.

An alternative would be some other well-documented ASCII grayscale format that is supported by PIL and all major graphics programs. Any suggestions?

feep.pgm:

P2
# feep.pgm
24 7
15
0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
0  3  3  3  3  0  0  7  7  7  7  0  0 11 11 11 11  0  0 15 15 15 15  0
0  3  0  0  0  0  0  7  0  0  0  0  0 11  0  0  0  0  0 15  0  0 15  0
0  3  3  3  0  0  0  7  7  7  0  0  0 11 11 11  0  0  0 15 15 15 15  0
0  3  0  0  0  0  0  7  0  0  0  0  0 11  0  0  0  0  0 15  0  0  0  0
0  3  0  0  0  0  0  7  7  7  7  0  0 11 11 11 11  0  0 15  0  0  0  0
0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

Edit: Thanks for the answer, it works, but I need a solution that uses Image.open(). Most Python programs out there use PIL for graphics manipulation (google: python image open), so I need to be able to register a filter with PIL. Then I can use any software that uses PIL. I am mostly thinking of programs that depend on scipy, pylab, etc.

Edit: OK, I think I got it now. Below is the wrapper, pgm2pil.py:

from PIL import Image  # Pillow does not support the bare "import Image"
import numpy

def pgm2pil(filename):

    try:
        inFile = open(filename)

        header = None
        size = None
        maxGray = None
        data = []

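        for line in inFile:
            stripped = line.strip()

            # Skip blank lines and comments; indexing stripped[0] on an
            # empty line would raise IndexError.
            if not stripped or stripped.startswith('#'):
                continue
            elif header is None:
                if stripped != 'P2': return None
                header = stripped
            elif size is None:
                # list() keeps size indexable on Python 3, where map() is lazy
                size = list(map(int, stripped.split()))
            elif maxGray is None:
                maxGray = int(stripped)
            else:
                for item in stripped.split():
                    data.append(int(item))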

        data = numpy.reshape(data, (size[1],size[0]))/float(maxGray)*255
        return numpy.flipud(data)

    except Exception:
        # Any read or parse failure means "not a plain PGM"; fall through to None.
        pass

    return None

def imageOpenWrapper(fname):
    pgm = pgm2pil(fname)
    if pgm is not None:
        return Image.fromarray(pgm)
    return origImageOpen(fname)

origImageOpen = Image.open
Image.open = imageOpenWrapper

This is a slight upgrade to misha’s answer. The original Image.open has to be saved in order to prevent infinite recursion: if pgm2pil returns None, the wrapper calls the patched Image.open, which is the wrapper itself, which calls pgm2pil again, which returns None, and so on.

Below is the test script (feep_false.pgm is a malformed PGM, e.g. “P2” replaced with “FOO”, and lena.png is an ordinary image file):

import pgm2pil
import pylab

try:
    pylab.imread('feep_false.pgm')
except IOError:
    pass
else:
    raise ValueError("feep_false should fail")

pylab.subplot(2,1,1)
a = pylab.imread('feep.pgm')
pylab.imshow(a)

pylab.subplot(2,1,2)
b = pylab.imread('lena.png')
pylab.imshow(b)

pylab.show()
Asked By: Juha

||

Answers:

The way I currently deal with this is through numpy:

  1. Read the image into a 2D numpy array. You don’t need to use numpy, but I’ve found it easier to use than nested Python lists.
  2. Convert the 2D numpy array into a PIL.Image object using PIL.Image.fromarray.
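
As a rough sketch of step 2, assume the samples have already been read into rows of integers. The helper names `scale_rows` and `rows_to_image` are made up for illustration, and numpy and Pillow are imported lazily so that the scaling part can be tried on its own. A 2D `uint8` array passed to `Image.fromarray` gives an 8-bit grayscale (mode "L") image.

```python
def scale_rows(rows, maxval):
    """Rescale samples from 0..maxval to 0..255 (hypothetical helper)."""
    return [[int(round(v * 255.0 / maxval)) for v in row] for row in rows]


def rows_to_image(rows, maxval):
    """Build a PIL image from nested lists of samples (hypothetical helper)."""
    # Imported here so scale_rows stays usable without numpy or Pillow.
    import numpy
    from PIL import Image

    pixels = numpy.array(scale_rows(rows, maxval), dtype=numpy.uint8)
    return Image.fromarray(pixels)  # a 2D uint8 array becomes a mode "L" image


# The first four samples of the second row of feep.pgm, with maxval 15.
print(scale_rows([[0, 3, 3, 3]], 15))
```

Scaling to 0..255 matters because feep.pgm only uses values up to 15, which would look almost black in an 8-bit image.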

If you insist on using PIL.Image.open, you could write a wrapper that attempts to load a PGM file first (by looking at the header). If it’s a PGM, load the image using the steps above; otherwise, just hand off responsibility to PIL.Image.open.
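
One cheap way to do the header check is to read only the first few bytes of the file. The sketch below is an assumption about how such a check could look, not part of the code in this answer, and `looks_like_plain_pgm` is a hypothetical name. It relies on the netpbm rule that the magic number "P2" is followed by whitespace.

```python
def looks_like_plain_pgm(filename):
    """Return True if the file starts with the plain PGM magic number."""
    try:
        with open(filename, 'rb') as f:
            magic = f.read(3)
    except IOError:
        # Unreadable or missing file: let the caller fall back to PIL.
        return False
    # "P2" must be followed by whitespace; P5 (raw PGM) is left to PIL.
    return magic[:2] == b'P2' and magic[2:3] in (b' ', b'\t', b'\r', b'\n')
```

A wrapper could call this first and only run the full text parser when it returns True, so that ordinary image files are never read line by line.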

Here’s some code that I use to get a PBM image into a numpy array.

import re
import numpy

def pbm2numpy(filename):
    """
    Read a PBM into a numpy array.  Only supports ASCII PBM for now.
    """
    fin = None
    debug = True

    try:
        fin = open(filename, 'r')

        while True:
            header = fin.readline().strip()
            if header.startswith('#'):
                continue
            elif header == 'P1':
                break
            elif header == 'P4':
                assert False, 'Raw PBM reading not implemented yet'
            else:
                #
                # Unexpected header.
                #
                if debug:
                    print('Bad mode: %s' % header)
                return None

        rows, cols = 0, 0
        while True:
            header = fin.readline().strip()
            if header.startswith('#'):
                continue

            match = re.match(r'^(\d+) (\d+)$', header)
            if match is None:
                if debug:
                    print('Bad size: %r' % header)
                return None

            cols, rows = match.groups()
            break

        rows = int(rows)
        cols = int(cols)

        assert (rows, cols) != (0, 0)

        if debug:
            print('Rows: %d, cols: %d' % (rows, cols))

        #
        # Initialise a 2D numpy array 
        #
        result = numpy.zeros((rows, cols), numpy.int8)

        pxs = []

        # 
        # Read to EOF.
        # 
        while True:
            line = fin.readline().strip()
            if line == '':
                break

            for c in line:
                if c == ' ':
                    continue

                pxs.append(int(c))

        if len(pxs) != rows*cols:
            if debug:
                print('Insufficient image data: %d' % len(pxs))
            return None

        for r in range(rows):
            for c in range(cols):
                #
                # Index into the numpy array and set the pixel value.
                #
                result[r, c] = pxs[r*cols + c]

        return result

    finally:
        if fin != None:
            fin.close()
        fin = None

    return None

You will have to modify it slightly to fit your purposes, namely:

  • Deal with P2 (ASCII, greyscale) instead of P1 (ASCII, bilevel).
  • Use a different container if you’re not using numpy. Nested Python lists will work just fine.
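
A minimal sketch of both changes, under stated assumptions: it reads P2 instead of P1 and stores the pixels in nested lists instead of a numpy array. The name `parse_plain_pgm` is hypothetical. The key difference from P1 is that P2 samples are whitespace-separated decimal numbers that can have several digits, so the text is split into tokens rather than read character by character. Comments are treated as running from `#` to the end of the line.

```python
def parse_plain_pgm(text):
    """Parse plain (P2) PGM text into (width, height, maxval, rows)."""
    tokens = []
    for line in text.splitlines():
        # Drop everything from '#' to the end of the line.
        tokens.extend(line.split('#', 1)[0].split())

    if len(tokens) < 4 or tokens[0] != 'P2':
        raise ValueError('not a plain PGM file')

    width, height, maxval = (int(t) for t in tokens[1:4])
    samples = [int(t) for t in tokens[4:]]
    if len(samples) != width * height:
        raise ValueError('expected %d samples, got %d'
                         % (width * height, len(samples)))

    # Slice the flat sample list into one list per image row.
    rows = [samples[r * width:(r + 1) * width] for r in range(height)]
    return width, height, maxval, rows


example = """P2
# tiny example
3 2
15
0  3 15
7 11  0
"""
width, height, maxval, rows = parse_plain_pgm(example)
print(rows)
```

Because the header and the samples go through the same tokenizer, this sketch also accepts files that put the magic number, size and maximum value on a single line.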

EDIT

Here is how I would handle a wrapper:

def pgm2pil(fname):
    #
    # This method returns a PIL.Image.  Use pbm2numpy function above as a
    # guide.  If it can't load the image, it returns None.
    #
    pass
    
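import PIL.Image

# Keep a reference to the real loader before it is replaced. Calling
# PIL.Image.open from inside the wrapper after patching would recurse forever.
_orig_open = PIL.Image.open

def wrapper(fname):
    pgm = pgm2pil(fname)

    if pgm is not None:
        return pgm
    return _orig_open(fname)

#
# This is the line that "adds" the wrapper
#
PIL.Image.open = wrapper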

I didn’t write pgm2pil because it’s going to be very similar to pbm2numpy. The only difference is that it stores the result in a PIL.Image instead of a numpy array. I also didn’t test the wrapper code (sorry, a bit short on time at the moment), but it’s a fairly common approach, so I expect it to work.

Now, it sounds like you want other applications that use PIL for image loading to be able to handle PGMs. It’s possible using the above approach, but you need to be sure that the above wrapper code gets added before the first call to PIL.Image.open. You can make sure that happens by adding the wrapper source code to the PIL source code (if you have access).
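
Why the order matters can be illustrated without PIL. In the Python 3 sketch below, `fake_image` is a stand-in namespace invented for this example, not the real PIL.Image module. Code that looks up `open` on the module at call time sees the wrapper, but a reference copied out before the patch (as `from PIL.Image import open` would do) keeps pointing at the original function.

```python
import types

# Stand-in for a module such as PIL.Image (hypothetical, for illustration only).
fake_image = types.SimpleNamespace(open=lambda fname: 'original:' + fname)

# A client that copied the function out before the patch was installed.
early_reference = fake_image.open

# Install the wrapper, keeping the original for the fallback path.
original_open = fake_image.open

def wrapper(fname):
    if fname.endswith('.pgm'):
        return 'wrapped:' + fname
    return original_open(fname)

fake_image.open = wrapper

print(fake_image.open('feep.pgm'))   # attribute lookup at call time: wrapper
print(early_reference('feep.pgm'))   # stale reference: original function
```

In practice this means the module that installs the wrapper should be imported before any other code binds its own reference to the loader.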

Answered By: mpenkov