Python + Image Processing: Efficiently Assign Pixel Values to Nearest Predefined Value

Question:

I implemented an algorithm that uses OpenCV's kmeans to cluster the unique brightness values present in a greyscale image. Clustering only the unique values helps avoid bias towards image backgrounds, which are typically all the same value.

However, I struggled to find a way to use the resulting centers to quantize a given input image.

I implemented a very naive solution, but it is unusably slow for the required input sizes (4000×4000):

for x in range(W):
    for y in range(H):
        center_id = np.argmin([(arr[y,x]-center)**2 for center in centers])
        ret_labels2D[y,x] = sortorder.index(center_id)
        ret_qimg[y,x] = centers[center_id]

In other words, each pixel is set to the predefined level with the minimum squared error.

Is there a faster way to do this? On a 4000×4000 image, this implementation was completely unusable.

Full code:

import cv2
import numpy as np

def unique_quantize(arr, K, eps = 0.05, max_iter = 100, max_tries = 20):

    """@param arr: 2D numpy array of floats"""

    H, W = arr.shape

    unique_values = np.squeeze(np.unique(arr.copy()))

    unique_values = np.array(unique_values, float)

    if unique_values.ndim == 0:
        unique_values = np.array([unique_values],float)

    unique_values = np.ravel(unique_values)

    unique_values = np.expand_dims(unique_values,1)

    Z = unique_values.astype(np.float32)

    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER,max_iter,eps)

    compactness, labels, centers = cv2.kmeans(Z,K,None,criteria,max_tries,cv2.KMEANS_RANDOM_CENTERS)

    labels = np.ravel(np.squeeze(labels))
    centers = np.ravel(np.squeeze(centers))

    sortorder = list(np.argsort(centers)) # old index --> index to sortorder

    ret_center = centers[sortorder]
    ret_labels2D = np.zeros((H,W),int)
    ret_qimg = np.zeros((H,W),float)

    for x in range(W):
        for y in range(H):
            center_id = np.argmin([(arr[y,x]-center)**2 for center in centers])
            ret_labels2D[y,x] = sortorder.index(center_id)
            ret_qimg[y,x] = centers[center_id]

    return ret_center, ret_labels2D, ret_qimg

EDIT: I looked at the input file again. The size was actually 12000×12000.

Asked By: Michael Sohnen


Answers:

As your image is grayscale (presumably 8 bits), a lookup table is an efficient solution. It suffices to map all 256 gray levels to their nearest center once and for all, then use the result as a conversion table. Even a 16-bit range (65536 entries) would be significantly accelerated.
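
A minimal sketch of the lookup-table idea, not part of the original answer. It assumes the image is stored as a `uint8` array (the question's function takes floats, so this only applies if the data really is 8-bit) and that `centers` is the 1D array of k-means centers. The function name `quantize_with_lut` is hypothetical.

```python
import numpy as np

def quantize_with_lut(img_u8, centers):
    """Map each 8-bit pixel to its nearest center via a 256-entry table.

    img_u8  : 2D uint8 array (assumption: the image is 8-bit)
    centers : 1D sequence of cluster centers, in any order
    Returns (labels, qimg); labels index the centers in ascending order.
    """
    centers = np.asarray(centers, dtype=float).ravel()
    levels = np.arange(256, dtype=float)  # every possible gray level

    # (256, K) squared-error table; its size does not depend on the image.
    nearest = np.argmin((levels[:, None] - centers[None, :]) ** 2, axis=1)

    # rank[i] = position of center i once the centers are sorted ascending.
    rank = np.argsort(np.argsort(centers))

    label_lut = rank[nearest]     # gray level -> sorted label
    value_lut = centers[nearest]  # gray level -> quantized value

    # Indexing the tables with the image applies them to every pixel at once.
    return label_lut[img_u8], value_lut[img_u8]
```

The nearest-center search runs over 256 values only; the per-pixel work is two table lookups, whatever the image size.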

Answered By: Yves Daoust

I recently thought of a much better answer. This code is not extensively tested, but it worked for the use case in my project.

It relies on NumPy fancy indexing (indexing an array with an array of integers) to keep the entire algorithm contained within NumPy functions.
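
To illustrate the indexing trick on its own, here is a small sketch with made-up values. The three centers and the 2×3 `classification` array are hypothetical, and `np.argsort(sortorder)` is used as one way to build the inverse permutation.

```python
import numpy as np

# Hypothetical unsorted k-means centers and per-pixel nearest-center indices.
centers = np.array([0.8, 0.1, 0.5])
classification = np.array([[1, 1, 2],
                           [0, 2, 1]])

# sortorder[j] is the original index of the j-th smallest center.
sortorder = np.argsort(centers)

# Inverse permutation: inverse_sortorder[i] is the rank of original center i.
inverse_sortorder = np.argsort(sortorder)

# Indexing a 1D array with a 2D integer array returns a 2D result,
# so each pixel's index is replaced by its sorted label or its center value.
labels = inverse_sortorder[classification]
qimg = centers[classification]
```

Here `labels` has the same shape as `classification`, with 0 for the darkest center and 2 for the brightest.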

import cv2
import numpy as np

def unique_quantize(arr, K, eps = 0.05, max_iter = 100, max_tries = 20):

    """@param arr: 2D numpy array of floats"""

    H, W = arr.shape

    unique_values = np.squeeze(np.unique(arr.copy()))

    unique_values = np.array(unique_values, float)

    if unique_values.ndim == 0:
        unique_values = np.array([unique_values],float)

    unique_values = np.ravel(unique_values)

    unique_values = np.expand_dims(unique_values,1)

    Z = unique_values.astype(np.float32)

    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER,max_iter,eps)

    compactness, labels, centers = cv2.kmeans(Z,K,None,criteria,max_tries,cv2.KMEANS_RANDOM_CENTERS)

    labels = np.ravel(np.squeeze(labels))
    centers = np.ravel(np.squeeze(centers))

    sortorder = np.argsort(centers) # sortorder[j] = original index of the j-th smallest center

    # Inverse permutation: inverse_sortorder[i] = rank of original center i
    inverse_sortorder = np.argsort(sortorder)

    ret_center = centers[sortorder]

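    # Squared error of every pixel against every center; shape (K, H, W),
    # so K float64 copies of the image are held in memory at once.
    errors = np.array([np.power((arr-center),2) for center in centers],float)

    # Index (into the unsorted centers) of the nearest center for each pixel.
    # No squeeze here, so images with H == 1 or W == 1 keep their 2D shape.
    classification = np.argmin(errors,axis=0)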

    ret_labels2D = inverse_sortorder[classification]

    ret_qimg = centers[classification]

    return np.array(ret_center,float), np.array(ret_labels2D,int), np.array(ret_qimg,float)
Answered By: Michael Sohnen