Object detection has been an important task in the computer vision domain in recent decades. The goal is to detect instances of objects, such as humans, cars, etc., in digital images. Hundreds of methods have been developed to answer a single question: What objects are where?


Copyright: marktechpost.com – “This Artificial Intelligence (AI) Model Knows How to Detect Novel Objects During Object Detection”


Traditional methods tried to answer this question by extracting hand-crafted features like edges and corners within the image. Most of these approaches used a sliding-window approach, meaning that they kept checking small parts of the image in different scales to see if any of these parts contained the object they were looking for. This was really time-consuming, and even the slightest change in the object shape, lightning, etc., could have caused the algorithm to miss it.

Then there came the deep learning era. With the increasing capability of computer hardware and the introduction of large-scale datasets, it became possible to exploit the advancement in the deep learning domain to develop a reliable and robust object detection algorithm that could work in an end-to-end manner.

Using deep learning methods can result in extremely successful object detection methods. They are robust against the changes in the environment and objects in the image. Most of them can run in real time, even on mobile devices. Sounds really good, right? Does that mean we can say the object detection problem is solved for good? Well, not yet.

The problem we have is all these methods are bounded by the dataset they are trained on. If you train your model to detect pandas in the image, you will use lots of panda images to teach them what it looks like. Collecting those images is one aspect, but the bigger problem is labeling them. Going over thousands of images and marking the exact locations of pandas in each image is an extremely time-consuming task.

Thank you for reading this post, don't forget to subscribe to our AI NAVIGATOR!


Also, you would need to do this for each object you want your model to recognize. Imagine you want to develop a generic object detection model that recognizes all the objects it will see. You can use large-scale datasets like COCO that include a variety of objects, but you will still be limited to the number of different categories in your dataset.[…]

Read more: www.marktechpost.com

Read the paper here.