Extracting separate images from YOLO bounding box coordinates
Question:
I have a set of images and their corresponding YOLO coordinates. Now I want to extract the objects that these YOLO coordinates denote into separate images.
But these coordinates are floating-point values, so I am not able to use them directly for slicing.
This is a sample image, and the corresponding YOLO coordinates are
labels = [0.536328, 0.5, 0.349219, 0.611111]
I read my image as follows :
image = cv2.imread('frame0.jpg')
Then I wanted to use something like image[y:y+h,x:x+w], as I had seen in a similar question. But the variables are floats, so I tried to convert them into integers using the dimensions of the image (1280 x 720), like this:
object = [int(label[0]*720), int(label[1]*720), int(label[2]*1280), int(label[3]*1280)]
x,y,w,h = object
But it doesn't extract the correct part of the image, as you can see in the extracted image.
This is part of my training dataset, and I cropped these parts earlier using some tools, so there should not be any errors in my labels. All the images are cropped incorrectly this way; I have shown the output for one of them.
Thanks a lot in advance. Any suggestions would be really helpful !
Answers:
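The labels need to be scaled differently. In YOLO format the four values are x_center, y_center, width and height, each normalized to the image size: x_center and width are fractions of the image width W, and y_center and height are fractions of the image height H. Your code multiplies x and y by 720 and both width and height by 1280. Also, because x and y mark the center of the box rather than its top-left corner, you have to subtract half the box width and height before slicing. Here's how I solved it:
import cv2
import matplotlib.pyplot as plt
label = [0.536328, 0.5, 0.349219, 0.611111]
img = cv2.imread('P6A4J.jpg')
# OpenCV loads images as BGR; convert to RGB so matplotlib shows the right colors
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
H, W, _ = img.shape
# Scale the normalized size to pixels: width by W, height by H
w, h = int(label[2]*W), int(label[3]*H)
# YOLO stores the box center, so shift by half the size to get the top-left corner
x, y = int(label[0]*W - w/2), int(label[1]*H - h/2)
plt.subplot(1,2,1)
plt.imshow(img)
plt.subplot(1,2,2)
plt.imshow(img[y:y+h, x:x+w])
plt.show()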
Hope this helps!
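For reuse across a whole dataset, the conversion can be wrapped in a small helper. This is only a sketch: the function name yolo_to_pixel_box is made up for illustration, and it assumes each label holds the four values (x_center, y_center, width, height) normalized to the range 0 to 1, as in the question. It also clamps the corners to the image bounds, so that a box touching the edge does not yield a negative index.

```python
def yolo_to_pixel_box(label, img_w, img_h):
    """Convert a normalized YOLO box (x_center, y_center, width, height)
    into integer pixel corners (x1, y1, x2, y2), clamped to the image.

    Hypothetical helper for illustration; not part of OpenCV or any YOLO library.
    """
    cx, cy, bw, bh = label
    # Horizontal values scale with the image width, vertical ones with the height.
    cx, bw = cx * img_w, bw * img_w
    cy, bh = cy * img_h, bh * img_h
    # Move from the box center to its top-left and bottom-right corners.
    x1, y1 = round(cx - bw / 2), round(cy - bh / 2)
    x2, y2 = round(cx + bw / 2), round(cy + bh / 2)
    # Clamp so that slicing never sees a negative or out-of-range index.
    x1, x2 = max(0, x1), min(img_w, x2)
    y1, y2 = max(0, y1), min(img_h, y2)
    return x1, y1, x2, y2


# The label from the question on a 1280 x 720 image.
box = yolo_to_pixel_box([0.536328, 0.5, 0.349219, 0.611111], 1280, 720)
print(box)  # (463, 140, 910, 580)
```

With an image loaded through cv2.imread, the crop would then be img[y1:y2, x1:x2], which can be written out with cv2.imwrite.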
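If the boxes come from running detect.py (the --save-crop flag is the one provided by the YOLOv5 detect.py script), it can save the crops for you:
python detect.py --save-crop
Crops will be saved under runs/detect/exp/crops, with a directory for each class detected.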