Train TensorFlow Object Detection without bounding box annotations

Question:

I managed to retrain the object detection module on my own dataset by formatting it according to the PASCAL VOC format shown below.

This format is bounding-box oriented, and a look at the TFRecord creation scripts shows that they expect a number of these ground-truth values in order to generate the corresponding TFRecords.

The problem with bounding boxes is that they only give approximations, and annotating rotated objects can be rather challenging.

After looking around, I came across labelme, which supports shape (point-to-point) annotations in addition to plain bounding boxes. Below is a shortened version of the produced annotation, along with the resulting image showing the annotated shape.

My questions are:

  1. Concentrating on the contents of <polygon></polygon>, does the Object Detection API support point-to-point annotations?

  2. If yes to 1, how do I go about creating the TFRecords for it? What other changes need to be made to accommodate this?

Pascal VOC Format

<annotation verified="no">
  <folder>VOC2012</folder>
  <filename>pic.jpg</filename>
  <source>
    <database>Unknown</database>
  </source>
  <size>
    <width>214</width>
    <height>300</height>
    <depth>3</depth>
  </size>
  <segmented>0</segmented>
  <object>
    <name>sample</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>32</xmin>
      <ymin>37</ymin>
      <xmax>180</xmax>
      <ymax>268</ymax>
    </bndbox>
  </object>
</annotation>
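
The fields above map directly onto the ground-truth values the TFRecord creation scripts read. A minimal sketch of pulling them out with Python's standard library (field names taken from the annotation shown, trimmed to the parts the scripts actually use):

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the PASCAL VOC annotation shown above.
VOC_XML = """<annotation verified="no">
  <size><width>214</width><height>300</height><depth>3</depth></size>
  <object>
    <name>sample</name>
    <bndbox>
      <xmin>32</xmin><ymin>37</ymin><xmax>180</xmax><ymax>268</ymax>
    </bndbox>
  </object>
</annotation>"""

def parse_voc(xml_string):
    """Extract image size and per-object bounding boxes from a VOC file."""
    root = ET.fromstring(xml_string)
    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)
    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        objects.append({
            "name": obj.find("name").text,
            "xmin": int(box.find("xmin").text),
            "ymin": int(box.find("ymin").text),
            "xmax": int(box.find("xmax").text),
            "ymax": int(box.find("ymax").text),
        })
    return width, height, objects
```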

Snapshot of point-to-point annotation

Here’s the full annotation file and the corresponding image

<annotation>
    <filename>ipad.jpg</filename>
    <folder>sample</folder>
    <source>
    <submittedBy>username</submittedBy>
    </source>
    <imagesize>
        <nrows>450</nrows>
        <ncols>800</ncols>
    </imagesize>
    <object>
        <name>ipad</name>
        <deleted>0</deleted><verified>0</verified><occluded>no</occluded>
        <attributes></attributes>
        <parts>
            <hasparts></hasparts>
            <ispartof></ispartof>
        </parts>
        <date>12-Jul-2017 19:20:22</date><id>0</id>
        <polygon>
            <username>anonymous</username>
            <pt><x>40</x><y>76</y></pt>
            <pt><x>435</x><y>11</y></pt>
            <pt><x>472</x><y>311</y></pt>
            <pt><x>94</x><y>418</y></pt>
        </polygon>
    </object>
    <object>
        <name>screen</name>
        <deleted>0</deleted>
        <verified>0</verified>
        <occluded>no</occluded>
        <attributes></attributes>
        <parts>
            <hasparts></hasparts>
            <ispartof></ispartof>
        </parts>
        <date>12-Jul-2017 19:20:48</date><id>1</id>
        <polygon>
            <username>anonymous</username>
            <pt><x>75</x><y>89</y></pt>
            <pt><x>118</x><y>397</y></pt>
            <pt><x>447</x><y>308</y></pt>
            <pt><x>421</x><y>30</y></pt>
        </polygon>
    </object>
</annotation>
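
For reference, the polygon points in a LabelMe file like the one above can be read the same way; a small sketch, assuming the structure shown (one `<polygon>` per `<object>`):

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the LabelMe annotation shown above.
LABELME_XML = """<annotation>
    <imagesize><nrows>450</nrows><ncols>800</ncols></imagesize>
    <object>
        <name>ipad</name>
        <polygon>
            <pt><x>40</x><y>76</y></pt>
            <pt><x>435</x><y>11</y></pt>
            <pt><x>472</x><y>311</y></pt>
            <pt><x>94</x><y>418</y></pt>
        </polygon>
    </object>
</annotation>"""

def parse_polygons(xml_string):
    """Return {object name: [(x, y), ...]} for every <polygon> in the file."""
    root = ET.fromstring(xml_string)
    polygons = {}
    for obj in root.findall("object"):
        pts = [(int(pt.find("x").text), int(pt.find("y").text))
               for pt in obj.find("polygon").findall("pt")]
        polygons[obj.find("name").text] = pts
    return polygons
```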
Asked By: eshirima


Answers:

The TensorFlow Object Detection API only supports bounding box annotations.
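
Since only boxes are supported, one workaround (not part of the API itself) is to collapse each polygon into its axis-aligned bounding box and feed that to the existing TFRecord creation scripts, which expect coordinates normalized to [0, 1]. A minimal sketch, using the `ipad` polygon and image size from the annotation above:

```python
def polygon_to_normalized_box(points, img_width, img_height):
    """Collapse a polygon into (xmin, ymin, xmax, ymax), normalized to [0, 1]."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs) / img_width, min(ys) / img_height,
            max(xs) / img_width, max(ys) / img_height)

# The 'ipad' polygon from the question, in an 800x450 image
# (ncols=800 wide, nrows=450 tall):
ipad = [(40, 76), (435, 11), (472, 311), (94, 418)]
box = polygon_to_normalized_box(ipad, img_width=800, img_height=450)
```

This loses the rotation information the polygon captured, but it produces exactly the four values the bounding-box pipeline needs.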

Answered By: Derek Chow

Annotation can be semi-automated by combining extreme clicking with a tiny YOLO v4 model, which makes the process considerably faster: semi automatic extreme clicking

Answered By: Daniel Gacon