Image-processing convolution kernels are calculated dynamically
Question:
Using standard numpy and cv2.filter2D solutions, I can apply static convolutions to an image:
import numpy as np
import cv2

convolution_kernel = np.array([[-2, -1, 0],
                               [-1, 1, 1],
                               [0, 1, 2]])
image = cv2.imread('1.png')
result = cv2.filter2D(image, -1, convolution_kernel)
(example from https://stackoverflow.com/a/58383803/3310334)
Every pixel at [i, j] in the output image has a value calculated by centering a 3×3 "window" onto [i, j] in the input image, multiplying each value in the window by the corresponding value in the convolution kernel (Hadamard product), and finally summing the 9 products to get the value for [i, j] in the output image (for each color channel).
[Image: convolution illustration, from https://github.com/ashushekar/image-convolution-from-scratch#convolution]
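The per-pixel rule described above can be sketched in plain numpy. This is only an illustration, not what cv2.filter2D does internally: the function name correlate_valid is made up for this example, the input is a small synthetic array rather than a real image, and border pixels are simply skipped so that every window fits entirely inside the input.

```python
import numpy as np

def correlate_valid(image, kernel):
    """Slide `kernel` over `image` and, for every window that fits entirely
    inside the image, sum the element-wise (Hadamard) products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(window * kernel)  # Hadamard product, then sum
    return out

# Synthetic single-channel "image" (an assumption for the demo)
image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[-2, -1, 0],
                   [-1, 1, 1],
                   [0, 1, 2]])
result = correlate_valid(image, kernel)
print(result)
```

Because the border is skipped, a 4×4 input with a 3×3 kernel yields a 2×2 output here; library routines usually pad the input instead so the output keeps the input's size.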
In my case, the function that computes each output pixel is not as simple as the sum of a Hadamard product. Instead, each output pixel is calculated from operations performed on known-size windows into two input matrices, centered on that pixel.
I have two input matrices ("images"), like
A = np.array([[179,  97,  77, 118, 144, 105],
              [ 68,  56, 184, 210, 141, 230],
              [178, 166, 218,  47, 106, 172],
              [ 38, 183,  50, 185,  48,  87],
              [ 60, 200, 228, 232,   6, 190],
              [253,  75, 231, 166, 117, 134]])
B = np.array([[116,  95,  94, 220,  80, 223],
              [135,   9, 166,  78,   5, 129],
              [102, 167, 120,  81, 141,  29],
              [ 83, 117,  81, 129, 255,  48],
              [130, 231, 165,   7, 187, 169],
              [ 44, 137,  16,  50, 229, 202]])
And in the output matrix, each [i, j] pixel should be calculated as the sum of all A[u, v] ** 2 - B[u, v] ** 2 values for [u, v] coordinates within 3×3 "windows" onto the two (same-sized) input matrices.
How can I calculate this output matrix quickly in Python?
Using numpy, it seems to be the 3×3 sums of A * A - B * B, but how do I compute those sums? Or is there another "2d map" process I could be using?
I’ve written a loop-based solution to calculate the expected output for these two examples:
W = 3  # size of kernel is WxW
out = np.zeros(A.shape)
difference_of_squares = A * A - B * B
for i, j in np.ndindex(out.shape):
    # Use smaller kernels at the input's boundaries, so the output has the same
    # dimensions as the input. I'm not worried at this point about what happens
    # at the boundaries; standard convolution solutions often just reduce the
    # output size or pad the input with zeroes.
    starti = max(i - W//2, 0)
    stopi = min(i - W//2 + W, np.shape(out)[0])
    startj = max(j - W//2, 0)
    stopj = min(j - W//2 + W, np.shape(out)[1])
    out[i, j] = np.sum(difference_of_squares[starti:stopi, startj:stopj])
print(out)
[[ 8423. 11816. 10372. 41125. 35287. 31747.]
[ 29370. 65887. 38811. 61252. 51033. 51845.]
[ 24756. 60119. 109133. 35101. 70005. 18757.]
[ 8641. 62463. 126935. 14530. 2255. -64752.]
[ 36623. 110426. 163513. 33812. -50035. -146450.]
[ 22268. 100132. 130190. 83010. -10163. -88994.]]
Answers:
You can use scipy.signal.convolve2d:
from scipy.signal import convolve2d

# Same shape as the input (6x6); the border is zero-padded by default
>>> convolve2d(A**2 - B**2, np.ones((3, 3)), mode='same')
array([[   8423.,   11816.,   10372.,   41125.,   35287.,   31747.],
       [  29370.,   65887.,   38811.,   61252.,   51033.,   51845.],
       [  24756.,   60119.,  109133.,   35101.,   70005.,   18757.],
       [   8641.,   62463.,  126935.,   14530.,    2255.,  -64752.],
       [  36623.,  110426.,  163513.,   33812.,  -50035., -146450.],
       [  22268.,  100132.,  130190.,   83010.,  -10163.,  -88994.]])

# Shape reduced by 2 in each dimension (4x4); only full 3x3 windows are used
>>> convolve2d(A**2 - B**2, np.ones((3, 3)), mode='valid')
array([[ 65887.,  38811.,  61252.,  51033.],
       [ 60119., 109133.,  35101.,  70005.],
       [ 62463., 126935.,  14530.,   2255.],
       [110426., 163513.,  33812., -50035.]])
Note: You may have to experiment with the "mode", "boundary" and "fillvalue" parameters until you get the border behavior you want.
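As a sanity check on the border behavior, the sketch below compares convolve2d against the loop from the question, using the question's own A and B. The boundary='fill' and fillvalue=0 arguments are convolve2d's defaults and are spelled out here only for clarity. The variable names diff, same and expected are chosen for this example.

```python
import numpy as np
from scipy.signal import convolve2d

A = np.array([[179,  97,  77, 118, 144, 105],
              [ 68,  56, 184, 210, 141, 230],
              [178, 166, 218,  47, 106, 172],
              [ 38, 183,  50, 185,  48,  87],
              [ 60, 200, 228, 232,   6, 190],
              [253,  75, 231, 166, 117, 134]])
B = np.array([[116,  95,  94, 220,  80, 223],
              [135,   9, 166,  78,   5, 129],
              [102, 167, 120,  81, 141,  29],
              [ 83, 117,  81, 129, 255,  48],
              [130, 231, 165,   7, 187, 169],
              [ 44, 137,  16,  50, 229, 202]])

diff = A**2 - B**2

# Zero-padded 3x3 window sums, same shape as the input
same = convolve2d(diff, np.ones((3, 3)), mode='same',
                  boundary='fill', fillvalue=0)

# Reference: the question's loop, with windows clipped at the border
W = 3
expected = np.zeros(diff.shape)
for i, j in np.ndindex(diff.shape):
    starti, stopi = max(i - W//2, 0), min(i - W//2 + W, diff.shape[0])
    startj, stopj = max(j - W//2, 0), min(j - W//2 + W, diff.shape[1])
    expected[i, j] = diff[starti:stopi, startj:stopj].sum()

print(np.array_equal(same, expected))
```

The two agree because summing over a window clipped at the border is the same as summing over a full window whose out-of-range cells are zero. That equivalence holds for plain sums but would not hold for, say, window means.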
Update
If the border is not a problem at this point, you can use sliding_window_view (NumPy 1.20 or later):
from numpy.lib.stride_tricks import sliding_window_view
>>> np.sum(sliding_window_view(A**2-B**2, (3, 3)), axis=(2, 3))
array([[ 65887, 38811, 61252, 51033],
[ 60119, 109133, 35101, 70005],
[ 62463, 126935, 14530, 2255],
[110426, 163513, 33812, -50035]])
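If the output does need to keep the input's shape, one possible variation is to zero-pad the input before taking the window view. The helper name window_sums_same and the small demo array are made up for this sketch, and it assumes an odd window size so that the padding is symmetric.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def window_sums_same(values, size=3):
    """Sum over size x size windows centered on each element, treating
    everything outside the array as zero (assumes `size` is odd)."""
    pad = size // 2
    padded = np.pad(values, pad, mode='constant', constant_values=0)
    windows = sliding_window_view(padded, (size, size))
    return windows.sum(axis=(2, 3))

# Small demo array (an assumption for this example)
demo = np.arange(1, 10).reshape(3, 3)
sums = window_sums_same(demo)
print(sums)
```

Applied to the question's data, this would be called as window_sums_same(A**2 - B**2). The padding copies the array once, but the window view itself does not copy any data.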