OpenCV functions specifically built for uint8 datatype

Question:

I am using the cv2.calcOpticalFlowFarneback() function in Python to calculate the optical flow between two images and find the translation vector between the two displaced images. The images which I’m using have been generated by an atomic force microscope, which gives resolutions for each pixel value significantly greater than that which can be represented by the numbers from 0->255.

In fact, the pixel data at each point of the image is a 32-bit float between 0 and 1. I’d like to use this data with the optical flow function, rather than multiplying it all by 255 and rounding to the nearest integer, as this leads to a loss of information.

As far as I can tell, the OpenCV functions don’t like data types that aren’t uint8 (is this simply a limitation?). I’m wondering if there’s some way around this? Perhaps there’s a specifier which I’m unaware of, for example.

Thanks in advance.

Asked By: I hate coding


Answers:

Many functions in OpenCV have such limitations. OpenCV is written under the assumption that you’ll use it with a simple video camera that produces 8-bit images. For more precise work you’re better off switching to a different library that is happy to work with floating-point images. In Python you have scikit-image, which is quite good, and DIPlib, which is way better (I’m an author, so quite biased!).

In scikit-image you have two optical flow functions: skimage.registration.optical_flow_ilk and skimage.registration.optical_flow_tvl1.
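As a rough illustration of the scikit-image route, here is a minimal sketch that runs optical_flow_tvl1 directly on 32-bit float images in the 0 to 1 range, with no conversion to uint8. The synthetic frames, the (2, 3) pixel shift, and taking the median of the flow field as a single translation estimate are all assumptions made for the example, not something from the question. Check the scikit-image documentation for the sign convention of the returned flow before relying on it.

```python
import numpy as np
from skimage.filters import gaussian
from skimage.registration import optical_flow_tvl1

# Synthetic stand-in for the microscope data: a smooth float32 image in [0, 1].
rng = np.random.default_rng(0)
frame0 = gaussian(rng.random((64, 64)), sigma=3).astype(np.float32)
frame0 = (frame0 - frame0.min()) / (frame0.max() - frame0.min())

# Second frame: the first one displaced by 2 rows and 3 columns (assumed shift).
frame1 = np.roll(frame0, shift=(2, 3), axis=(0, 1))

# The flow has shape (2, rows, cols): row displacements, then column displacements.
flow = optical_flow_tvl1(frame0, frame1)
v, u = flow

# One simple way to reduce the dense flow field to a single translation vector.
translation = (float(np.median(v)), float(np.median(u)))
print("estimated (row, col) translation:", translation)
```

optical_flow_ilk takes the same two images as its first arguments, so it can be swapped in to compare the results.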

DIPlib doesn’t have any built-in optical flow functions, but it’s quite easy to construct them, because it makes it easy to do linear algebra on pixels. The classical Lucas-Kanade optical flow would be:

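import diplib as dip

img = ...  # your gray-scale floating-point image, with time as the 3rd dimension

gradient_sigma = 1.0  # scale of the Gaussian derivatives
window_sigma = 5.0    # scale of the local averaging window
A = dip.Gradient(img, gradient_sigma, process=[True, True, False])  # spatial gradient (x and y only)
b = -dip.Derivative(img, [0, 0, 1], gradient_sigma)  # negated derivative along time
ATA = dip.Gauss(A @ dip.Transpose(A), window_sigma)  # locally averaged 2x2 matrix per pixel
ATb = dip.Gauss(A @ b, window_sigma)  # locally averaged right-hand side
v = dip.Inverse(ATA) @ ATb  # flow vector at each pixel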

In this code, img, viewed as a NumPy array, would have time as the first index, y as the second index, and x as the third index. You can construct it by putting the series of images into a list and converting that list to an array.
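Building that array could look something like the sketch below. The frame size and the random placeholder data are assumptions for illustration; in practice the list would hold your own float images, all with the same shape.

```python
import numpy as np

# Placeholder frames standing in for the real images (assumed: 3 frames of 4 x 5 pixels).
rng = np.random.default_rng(0)
frames = [rng.random((4, 5), dtype=np.float32) for _ in range(3)]

# Converting the list to an array puts time first: the shape is (time, y, x).
img = np.asarray(frames)
print(img.shape, img.dtype)
```

Indexing img[t] then gives back frame t, and the float32 values are kept as they are.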

Answered By: Cris Luengo