Contents

Existing Practices
Using the Annotator Training Module
Best Practices for Using the Module
Conclusion

Encord Blog

How to Use the Annotator Training Module

March 8, 2023

5 mins

Written by

Alexandre Bonnet

TL;DR
The purpose of this post is to introduce the Annotator Training Module we use at Encord to help leading AI companies quickly bring their annotator team up to speed and improve the quality of annotations created. We have created the tool to be flexible for all computer vision labeling tasks across various domains including medical imaging, agriculture, autonomous vehicles, and satellite imaging. It can be used for all annotation types - from bounding boxes and polygons to segmentation, polylines, and classification.

Correct annotations and labels are key to training high-quality machine learning models. Annotated objects can range from simple bounding boxes to complex segmentations. We may require annotators to capture additional data describing the objects they are annotating. Encord’s powerful ontology editor allows us to define nested attributes to capture as much data as needed.

Even for seemingly simple object primitives, such as bounding boxes, there may be nuances in the data which annotators need to account for. These dataset-specific idiosyncrasies can be wide-ranging, such as object occlusion or ambiguities in deciding the class of an object. It's critical to ensure consistency and accuracy in the annotation process.

Existing Practices

Many data operations teams today still rely on outdated practices, such as having annotators view the data in simple tools like video players and then answer questions before they start annotating.

This does not address the true complexity of accurately annotating many datasets at the quality level required for machine learning algorithms.

Teaching annotators how to work with data, understand labeling protocols, and learn annotation tools can take weeks or even months.

By combining all three into one and automating the evaluation process, our new module enables a data operations team to scale its efforts across hundreds of annotators in a fraction of the time. This brings large gains in cost savings and efficiency, and helps teams focus their educational efforts on the most difficult assets to annotate.

To this end, Encord Annotate now comes with a powerful new Annotator Training Module out of the box, so that annotators can learn what is expected of them during the annotation process.

At a high level, this consists of first adding ground truth annotations to the platform against which annotators will be evaluated. During the training process, annotators are fed unlabelled items from the ground truth dataset, which they must label.

A customizable scoring function converts their annotation performance into numerical scores. These scores can be used to evaluate performance and decide when annotators are ready to progress to a live training project.

Using the Annotator Training Module

This guide contains the following steps:

Step 1: Upload Data
Step 2: Set up Benchmark Project
Step 3: Create Ground Truth Labels
Step 4: Set up and Assign Training Projects
Step 5: Annotator Training
Step 6: Evaluation

blog_image_5279

This walkthrough will show you how to use the Annotator Training Module in the Encord Annotate Web app. This entire workflow can also be run programmatically using the SDK.

Step 1: Upload Data

First, you create a new dataset that will contain the data on which your ground truth labels are drawn. For this walkthrough, we have chosen to annotate images of flowers from an open source online dataset.

blog_image_5871

Step 2: Set up Benchmark Project

Next, you create a new standard project from the Projects tab in the Encord Annotate app.

blog_image_6166

You name the dataset and add an optional description (we recommend tagging it as a Training Ground Truth dataset).

blog_image_6445

We then attach the dataset created in Step 1 containing the unlabelled flower images.

blog_image_6705

Now we create an ontology appropriate to the flower labelling use case. We could also attach an existing ontology if we wanted.

blog_image_7010

Here you can see that we are specifying both scene-level classifications and geometric objects (both bounding boxes and polygons). Within the objects being defined, we use Encord’s flexible ontology editor to define nested classifications. This helps capture all the data describing the annotated objects in one place.

blog_image_7618

And lastly, you create the project.

blog_image_7811

Step 3: Create Ground Truth Labels

Now that you have created your first benchmark project, you need to create ground truth labels. This can be achieved in two ways.

The first option is having subject matter experts use Encord to manually annotate data units, as shown here with the bounding boxes drawn around the flowers.
The second option is to use the SDK to programmatically upload labels that were generated outside Encord.

blog_image_8471

blog_image_8622

Now that you have created the ground truth labels, proceed to set up the training projects.

Step 4: Set up and Assign Training Projects

Let us create a training project using the training tab in the project section.

blog_image_9036

Create your Training project and add an optional description.

blog_image_9261

It is important that you select the same ontology as the benchmark project. This is because the scoring functions will be comparing the trainee annotations to the ground truth annotations.

blog_image_9614

Next, we can set up the scoring. This will assign scores to the annotator submissions. Two key numbers are calculated:

Intersection over Union (IoU): IoU is calculated for objects such as bounding boxes or polygons. It is the area of overlap between the benchmark and trainee annotations divided by the area of their union, so it ranges from 0 (no overlap) to 1 (perfect match).
Comparison: The comparison compares whether two values are the same, for example the flower species.
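
As a rough illustration of these two measures, here is a minimal Python sketch for axis-aligned bounding boxes. It is not Encord's implementation: the `Box` class and the `iou` and `comparison` functions are hypothetical names introduced only for this example, and polygons would need a geometry library instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical axis-aligned box given by two opposite corners."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def area(self) -> float:
        # Clamp at zero so a degenerate box never yields a negative area.
        return max(0.0, self.x_max - self.x_min) * max(0.0, self.y_max - self.y_min)


def iou(benchmark: Box, trainee: Box) -> float:
    """Area of overlap divided by area of union, in the range [0, 1]."""
    overlap_w = min(benchmark.x_max, trainee.x_max) - max(benchmark.x_min, trainee.x_min)
    overlap_h = min(benchmark.y_max, trainee.y_max) - max(benchmark.y_min, trainee.y_min)
    intersection = max(0.0, overlap_w) * max(0.0, overlap_h)
    union = benchmark.area + trainee.area - intersection
    return intersection / union if union > 0 else 0.0


def comparison(benchmark_value: str, trainee_value: str) -> float:
    """1.0 when the two values match exactly, otherwise 0.0."""
    return 1.0 if benchmark_value == trainee_value else 0.0


ground_truth = Box(0, 0, 10, 10)
submission = Box(5, 0, 15, 10)  # same size, shifted halfway to the right

print(iou(ground_truth, submission))  # overlap 50 / union 150
print(comparison("rose", "tulip"))    # species do not match
```

A box shifted by half its width already drops to an IoU of one third, which is why IoU is a strict measure of how precisely a trainee has drawn an object.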

You can then assign numeric weights to express the relative importance of the different components of the annotations.

A higher weight means that a component will be more important in calculating the overall score for an annotator. You can also think of this as the ‘number of points available’ for getting this part correct. Here, I have given the flower species a weight of 100, whereas the flower color has a weight of 10 because it is less important to my use case. If an annotator misses the color or gets it wrong, they will miss out on fewer points.
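
To make the ‘points available’ idea concrete, the sketch below combines per-component scores into a single percentage. The exact formula Encord uses is not described in this post, so a weighted average is assumed here; `overall_score` and the component names are hypothetical.

```python
def overall_score(component_scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-component scores (each in [0, 1]) as a percentage.

    Assumption: a component the annotator skipped counts as a score of 0.
    """
    total_available = sum(weights.values())
    if total_available == 0:
        return 0.0
    earned = sum(weights[name] * component_scores.get(name, 0.0) for name in weights)
    return 100.0 * earned / total_available


# Weights from the walkthrough: species matters ten times more than color.
weights = {"flower_species": 100, "flower_color": 10}

wrong_color = overall_score({"flower_species": 1.0, "flower_color": 0.0}, weights)
wrong_species = overall_score({"flower_species": 0.0, "flower_color": 1.0}, weights)

print(round(wrong_color, 1))    # 90.9
print(round(wrong_species, 1))  # 9.1
```

Under this assumption, getting only the color wrong still earns about 91% of the available points, while getting only the species wrong earns about 9%.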

blog_image_10829

Finally, we assign annotators to the training module.

blog_image_11047

blog_image_11197

Step 5: Annotator Training

Each annotator will now see the labeling tasks assigned to them.

blog_image_11460

Step 6: Evaluation

As the creator of the training module, you can see the performance of annotators as they progress through the training. Here you can see that my two trainees are progressing through the training module, having both completed around 20% of the assigned tasks. You can also see their overall score as a percentage. This score is calculated by the scoring function we set up during the project setup.

blog_image_12056

You can dive deeper into individual annotator performance by looking at the submissions tab, which gives you a preview of annotator submissions. For very large projects we can use the CSV export function to get all submissions.

blog_image_12447

We can now dive deeper into annotator submissions, looking at this example where we notice some mistakes our trainee has made. You can see three things:

The trainee mislabelled the flower species.
The IoU score for the flower is low (143/200), indicating that the bounding box annotation is not precise.
The trainee forgot to describe the scene.

blog_image_12983

By clicking ‘View’, we can see the annotations and confirm that this is a poor-quality annotation.

The ground truth annotation is shown on the left and the trainee's annotation is shown on the right.

blog_image_13370

You can also change the scoring function if you later decide that certain attributes are more important than others by navigating to the settings tab.

Once you have modified the scoring function, you need to hit ‘Recalculate Scores’ on the Summary tab to get the new scores.

blog_image_13822

As already mentioned, you can download a CSV to perform further programmatic analysis of the trainee's performance.
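
As one possible starting point for that analysis, the sketch below averages scores per annotator and lists those who fall below a chosen threshold. The column names (`annotator`, `task`, `score`) and the sample rows are made up for illustration; check the header of your actual export and adjust accordingly.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Illustrative stand-in for an exported file. Column names and values are
# assumptions for this example, not the actual export format.
SAMPLE_EXPORT = """annotator,task,score
alice@example.com,img_001,92.0
alice@example.com,img_002,88.0
bob@example.com,img_001,55.0
bob@example.com,img_002,65.0
"""


def mean_scores(csv_file) -> dict[str, float]:
    """Average score per annotator from an open CSV file object."""
    scores = defaultdict(list)
    for row in csv.DictReader(csv_file):
        scores[row["annotator"]].append(float(row["score"]))
    return {name: mean(values) for name, values in scores.items()}


def needs_more_training(means: dict[str, float], threshold: float) -> list[str]:
    """Annotators whose average score is below the threshold, sorted by name."""
    return sorted(name for name, value in means.items() if value < threshold)


means = mean_scores(io.StringIO(SAMPLE_EXPORT))
print(means)
print(needs_more_training(means, threshold=80.0))  # ['bob@example.com']
```

To run this against a real export, pass an opened file instead of the in-memory sample, for example `with open("submissions.csv", newline="") as f: mean_scores(f)`, where the file name is again a placeholder.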

Best Practices for Using the Module

To ensure the success of your training project, it is important to follow some best practices when using Encord's Annotator Training Module:

Define the annotation task clearly:

It is important to provide a clear and concise description of the annotation task to ensure that annotators understand the requirements.

Use reviewed ground truth labels:

Providing reviewed ground truth labels ensures that the annotators have a clear understanding of what is required and helps to measure the accuracy of their annotations.

Evaluate annotator performance regularly:

Evaluating annotator performance regularly ensures that the annotations are of high quality and identifies any areas where additional training may be required.

Continuously improve the annotation training task:

As you progress, it is important to continuously review and improve the training tasks you have set up to ensure that they meet the project requirements.

Conclusion

The new Annotator Training Module is designed to help machine learning and data operations teams streamline annotator onboarding by using existing training data to rapidly upskill new annotators. This rapid onboarding helps businesses derive insights and make better decisions from their data in a timely manner.
