Interactive Annotation with SAM – Speed Up the Time to Model

May 14, 2024

In this post, we will discuss how we are utilizing the power of Segment Anything Model (SAM) in an interactive annotation process and speeding up the time to model for users of Intel® Geti™ software.

Data preparation and the quality of data annotations have been key challenges for AI teams building a robust computer vision model development pipeline. Often the workflow involves hiring external workers, sometimes referred to as labelers or annotators, to label image and video data. The data science and domain expert teams then spend time reviewing those labels and cleaning up the data in preparation for training a neural network model. A large buffer needs to be built into this external labeling process; often up to three times more data is sent for labeling than is ultimately needed, to ensure that enough good data remains after cleaning for the subsequent steps. Such workflows also keep the annotation and model development work completely isolated from each other, often leading to long timescales just for preparing the data and getting started with model training.

Intel Geti software is built to break this bottleneck by bringing the annotation and labeling workflows together with the model training and optimization workflows. Advanced algorithms like active learning take advantage of feedback from the human expert in the loop helping reduce the data quality concerns, thus making the overall model development workflow highly efficient.

In this blog, we will explore the smart annotation capabilities in Intel Geti software, with a special focus on the new interactive annotation features, which form the pillars of increased annotation efficiency and ultimately help ensure annotation quality as well.

Smart Annotation 

The smart annotation features in Intel Geti software are at the heart of enabling users to annotate data with ease. They also enable users to continuously improve the model being trained by providing feedback on model predictions in the active learning process.

The core capabilities overview demo video as well as the “what’s under the hood” blog posts cover some of these smart annotation features in Intel Geti software, and how they help speed up model development workflow.

Task-specific annotation and labeling features make it easy for users to choose the right type of annotation for their data. For example, for a rotated object detection task, rotated bounding boxes become the annotation feature of choice. The segmentation tasks provide many more annotation features, from freehand drawing with a polygon tool to interactive segmentation that can automatically identify object shapes from minimal guidance provided by a user. You add a green dot with a left click of the mouse inside the object of interest, and the algorithm tries to identify a region around the marker. If the identified region needs to be extended, you add extension markers with further left clicks. And if certain areas need to be subtracted, you can use a right click of the mouse within the selected area to remove that part.
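The click-based workflow above can be sketched as collecting point prompts with include/exclude labels, the same prompt format that SAM-style predictors consume. This is an illustrative sketch only; the class and method names are hypothetical, not Intel Geti's actual implementation.

```python
# Minimal sketch of collecting positive/negative click prompts for a
# SAM-style interactive segmenter. All names here are illustrative.

class ClickPrompt:
    """Collects foreground (left-click) and background (right-click) points."""

    def __init__(self):
        self.points = []   # (x, y) pixel coordinates
        self.labels = []   # 1 = include a region, 0 = subtract a region

    def left_click(self, x, y):
        # Green dot: ask the model to include a region around this point.
        self.points.append((x, y))
        self.labels.append(1)

    def right_click(self, x, y):
        # Subtraction marker: ask the model to exclude this area.
        self.points.append((x, y))
        self.labels.append(0)


prompt = ClickPrompt()
prompt.left_click(120, 80)    # initial marker inside the object
prompt.left_click(160, 95)    # extend the identified region
prompt.right_click(140, 200)  # remove an unwanted part
```

The collected `prompt.points` and `prompt.labels` would then be handed to the segmentation model, analogous to the `point_coords` and `point_labels` arguments in SAM-style predictors.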

Smart annotation available in Intel Geti software

New Interactive Annotation Features

In the latest product release, Intel Geti 1.8.0, new and powerful interactive annotation features have been added that enable automatic annotation for object detection (both axis-aligned and rotated) and segmentation tasks. Users can simply hover over an object, and a relevant annotation shape is created for them to accept. Whether axis-aligned or rotated bounding boxes are needed, the relevant shapes are created for the task type at hand. If the automatic annotations need to be modified, they can be easily fine-tuned by selectively adding or removing areas around the shapes.

Interactive annotation with SAM in action. Just hover over the object and click to annotate, whether for axis-aligned or rotated object detection tasks, or for instance and semantic segmentation tasks
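Once a segmentation mask has been produced for the object under the cursor, deriving an axis-aligned bounding box from it is a simple reduction over the mask's pixel coordinates. The sketch below shows this step under the assumption of a NumPy binary mask; the function name is ours, and fitting a rotated (minimum-area) box is omitted.

```python
import numpy as np

def mask_to_box(mask: np.ndarray):
    """Return (x_min, y_min, x_max, y_max) for a binary mask, or None if empty."""
    ys, xs = np.nonzero(mask)      # row/column indices of object pixels
    if xs.size == 0:
        return None                # nothing under the cursor
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


mask = np.zeros((8, 8), dtype=bool)
mask[2:5, 3:7] = True              # a 3x4 blob of "object" pixels
print(mask_to_box(mask))           # (3, 2, 6, 4)
```

A rotated detection task would add one more step, fitting a minimum-area rotated rectangle to the same mask pixels instead of taking per-axis extrema.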

Powered by Vision Transformers

You might ask: what makes these new interactive annotation features so special? They are powered by vision transformers, the same kind of model architecture behind models like the Segment Anything Model (SAM) released by Meta. While SAM enables impressive zero-shot learning, i.e., predicting objects from simple prompts without needing to train a model, it has limitations when applied to complex problems: it is computationally heavy and can be very slow on resource-constrained client and edge devices, and it is hard to customize for specific datasets to obtain more fine-tuned performance. The scientific community is developing many new advancements to remove such bottlenecks. We incorporate one of them, the MobileSAM model, for this interactive auto-annotation feature in Intel Geti software. MobileSAM has much smaller compute requirements, enabling rapid computation on the client device while annotating each object, without needing to communicate with a remote, expensive, and powerful server. The algorithm is further enhanced to produce accurate bounding boxes and segmentation shapes for the objects, yielding high-quality annotations for training a neural network model for your custom use cases.
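Part of why SAM-style models suit interactive annotation is their split design: a heavy image encoder runs once per image, while a lightweight prompt decoder runs on every hover or click (MobileSAM additionally shrinks the encoder itself). The toy sketch below illustrates this caching pattern; all names and the stand-in "networks" are illustrative, not Intel Geti's or MobileSAM's actual API.

```python
# Conceptual sketch: embed the image once, decode cheaply per click.

class InteractiveSegmenter:
    def __init__(self, encoder, decoder):
        self.encoder = encoder      # heavy ViT image encoder (run once)
        self.decoder = decoder      # light prompt decoder (run per click)
        self._embedding = None
        self.encoder_calls = 0      # for demonstration only

    def set_image(self, image):
        # Compute and cache the image embedding a single time.
        self._embedding = self.encoder(image)
        self.encoder_calls += 1

    def predict(self, points, labels):
        # Every hover/click only pays the cheap decoder cost.
        return self.decoder(self._embedding, points, labels)


# Toy stand-ins for the real networks:
seg = InteractiveSegmenter(
    encoder=lambda img: sum(map(sum, img)),            # fake "embedding"
    decoder=lambda emb, pts, lbls: {"mask_for": pts},  # fake "mask"
)
seg.set_image([[1, 2], [3, 4]])
seg.predict([(0, 1)], [1])   # first click
seg.predict([(1, 1)], [1])   # second click: encoder is not re-run
```

Because the per-click cost is only the small decoder, shrinking the encoder (as MobileSAM does) makes the one-time embedding step fast enough for client devices too.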

On this mission to help our customers more easily and rapidly build innovative computer vision solutions to solve their challenges, we continue to add advanced and differentiated features to Intel Geti software.