Addressing Nuanced Machine Learning Tasks with In-Depth Ontologies

Akruti Acharya
October 6, 2023

TLDR

Ontologies play a central role in solving Computer Vision problems, offering structured data organization, consistent labeling, and giving your models the necessary information to perform as expected. 

An ontology can include different object types, classifications, and descriptive attributes, each annotated in distinct ways. In Encord you can create arbitrarily complex, detailed, and nested ontologies, with Dynamic Classification adapting video annotations in real time.

A good ontology ensures ongoing consistency and is essential for the success of your project. Detailed ontologies have proven invaluable in complex computer vision tasks, with real-world applications in areas such as medical diagnosis, insurance claims, sports analytics, and agriculture.

Understanding Ontologies

Ontologies provide a structured framework for categorizing data, ensuring uniform terminology, hierarchical organization, and consistent labeling. Annotations within an ontology conform to predefined concepts, relationships, properties, and attributes. This adherence raises data quality and reduces the potential for errors.

Ontologies are divided into two key components:

Objects

Objects represent individual entities, or instances of an entity, within a domain. They are concrete elements that can be classified based on their attributes and characteristics. For example, in a medical ontology, individual patients, diseases, or forms of medication would be considered objects.

When it comes to annotating these objects in your projects, various object annotation types can be employed:

Bounding Boxes

  • Rectangular boxes around objects of interest within images or frames.
  • Use Cases: Object detection (identifying and localizing objects within images), face recognition (enclosing faces), and vehicle tracking (tracking vehicles in surveillance footage).

Polygon

  • Annotators outline objects using polygons with multiple vertices.
  • Use Cases: Semantic segmentation (precisely delineating object boundaries in images), building footprint extraction from satellite imagery, annotating irregularly shaped objects.

Keypoint

  • Specific key points or landmarks on objects.
  • Use Cases: Pose estimation (annotating key points on human body parts), facial landmark detection (marking key points on a face), and hand gesture recognition.

Bitmask or Pixel-Level Annotation

  • Annotators label individual pixels, assigning each pixel in a region of interest a value (a ‘bit’) that marks it as part of the selected area.
  • Use Cases: Semantic segmentation (pixel-wise object labeling in images), medical image segmentation (identifying structures in medical scans).
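
To make the four object annotation types concrete, here is a minimal Python sketch of how each shape could be represented. The class names, fields, and helper functions are assumptions made for illustration; they are not Encord's data model or SDK.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in pixel coordinates


@dataclass
class BoundingBox:
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float


@dataclass
class Keypoint:
    x: float
    y: float


@dataclass
class Polygon:
    vertices: List[Point]

    def bounding_box(self) -> BoundingBox:
        """Return the smallest axis-aligned box that contains every vertex."""
        xs = [x for x, _ in self.vertices]
        ys = [y for _, y in self.vertices]
        return BoundingBox(min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys))


def bitmask_area(mask: List[List[int]]) -> int:
    """Count the pixels set to 1 in a binary mask (one row per image row)."""
    return sum(sum(row) for row in mask)


# A triangle outlined as a polygon, and a tiny 2x3 bitmask.
triangle = Polygon([(1.0, 2.0), (4.0, 2.0), (3.0, 6.0)])
tiny_mask = [[0, 1, 1],
             [0, 1, 0]]
```

The sketch also shows why the choice of shape matters: a polygon can always be reduced to a bounding box, but a bounding box cannot recover the outline, so the ontology should ask for the most detailed shape the task needs.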

Classification

Classification represents broader categories or classes within the ontology. Using the medical ontology example, classifications could include categories like "Patients," "Diseases," or "Medication." Ontologies refine classification tasks by organizing classes hierarchically, enhancing precision in labeling and prediction.

Unlike objects, classifications apply to the entire frame. They typically fall into one of the following categories:

Checklist

  • Multiple options from a list of predefined categories.
  • Use Cases: Image tagging (e.g., tagging an image with multiple labels like "beach," "sunset," and "ocean"), document classification (e.g., labeling documents with multiple relevant topics).

Radio

  • Single option from a list of predefined categories.
  • Use Cases: Scene classification (e.g., categorizing images as "indoor" or "outdoor"), sentiment analysis (e.g., categorizing reviews as "positive," "negative," or "neutral").

Text

  • Freely input text to describe or classify data.
  • Use Cases: Text classification (e.g., categorizing text documents into custom-defined categories), user-generated content moderation (e.g., labeling text as "spam" or "not spam").
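
The difference between the three classification types comes down to what counts as a valid answer. The sketch below assumes that a radio classification takes exactly one predefined option, a checklist takes any number of distinct predefined options, and a text classification takes any string. The function name and the string identifiers for each type are hypothetical.

```python
from typing import Iterable, List, Union


def validate_answer(kind: str, options: List[str],
                    answer: Union[str, Iterable[str]]) -> bool:
    """Check an answer against the rules assumed for each classification type."""
    if kind == "text":
        # Free text: any string is acceptable.
        return isinstance(answer, str)
    if kind == "radio":
        # Radio: exactly one value, and it must be a predefined option.
        return isinstance(answer, str) and answer in options
    if kind == "checklist":
        # Checklist: a collection of distinct predefined options.
        if isinstance(answer, str):
            return False
        chosen = list(answer)
        return len(chosen) == len(set(chosen)) and all(c in options for c in chosen)
    raise ValueError(f"unknown classification kind: {kind}")


scene_options = ["indoor", "outdoor"]
tag_options = ["beach", "sunset", "ocean"]
```

For example, `validate_answer("radio", scene_options, "indoor")` passes, while passing a list of two scenes to the same radio classification does not.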

Building Your Own Ontology with Encord

Create New

First, navigate to the ‘Ontologies’ section of Encord and create a new ontology using the + New Ontology button. For this walkthrough, we will create an ontology for annotating humans for a semantic segmentation task.

Configure Ontology

Now, build your ontology structure by adding all the objects and classifications necessary for the creation of your dataset. 

To annotate the types of clothes a person is wearing, we have created a very detailed ontology. Objects marked with “*” are required to be annotated. This prioritization enhances data quality, annotation efficiency, and project scalability.
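
One way to think about required objects is as a completeness check on each labelled frame. The following sketch assumes a frame's labels are available as a list of object names; the function and the example object names are hypothetical and do not reflect how Encord enforces the requirement.

```python
from typing import Iterable, List


def missing_required(required: Iterable[str], labelled: Iterable[str]) -> List[str]:
    """Return the required object names that have no annotation yet, sorted by name."""
    return sorted(set(required) - set(labelled))


# Hypothetical example: "shirt" and "pants" are required, "hat" is optional.
required_objects = ["shirt", "pants"]
labels_in_frame = ["shirt", "hat"]
print(missing_required(required_objects, labels_in_frame))  # prints ['pants']
```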

Utilizing Encord's Nested Classification feature, our ontology design extends beyond just identifying primary clothing items like "shirts," "pants," and "shoes." It dives into finer details, encompassing attributes such as "accessories" and "sleeve length." This hierarchical approach doesn't stop at surface categorizations; it goes deeper to capture nuanced attributes. The result is a highly informative and granular dataset, perfect for demanding computer vision tasks. Moreover, Encord offers the flexibility to create as many nested levels as your project requires, ensuring that your annotation process adapts seamlessly to the complexity of the data.
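
The nesting described above can be pictured as a tree: an object has attributes, each attribute has options, and each option can open further attributes. Below is a minimal sketch of that structure using plain Python dataclasses. The class names and the "cuff style" attribute are invented for illustration and are not taken from Encord's SDK or from the ontology in this walkthrough.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Attribute:
    name: str
    options: List["Option"] = field(default_factory=list)


@dataclass
class Option:
    label: str
    nested: List[Attribute] = field(default_factory=list)  # attributes unlocked by this option


@dataclass
class OntologyObject:
    name: str
    shape: str
    required: bool = False
    attributes: List[Attribute] = field(default_factory=list)


def attribute_paths(attrs: List[Attribute], prefix: str = "") -> Iterator[str]:
    """Yield every attribute, prefixed by the option choices that lead to it."""
    for attr in attrs:
        path = f"{prefix}{attr.name}"
        yield path
        for option in attr.options:
            yield from attribute_paths(option.nested, prefix=f"{path}={option.label} > ")


shirt = OntologyObject(
    name="shirt",
    shape="polygon",
    required=True,
    attributes=[
        Attribute("sleeve length", [
            Option("short"),
            Option("long", nested=[
                Attribute("cuff style", [Option("buttoned"), Option("open")]),
            ]),
        ]),
    ],
)

for path in attribute_paths(shirt.attributes):
    print(path)
```

Because `attribute_paths` recurses through each option's nested attributes, the same code handles any number of levels.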

When annotating videos, Dynamic Classification improves labeling in several ways: it provides temporal accuracy, increases granularity, reduces annotation effort through automation, and yields more informative training data for ML models.
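
Conceptually, a dynamic attribute is one whose value can change over the course of a video while the object stays the same. The sketch below illustrates that idea with frame ranges. It is an assumption about how such data could be modelled, not a description of Encord's implementation, and the "posture" attribute is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FrameRangeAnswer:
    start: int   # first frame the value applies to (inclusive)
    end: int     # last frame the value applies to (inclusive)
    value: str


def answer_at(frame: int, answers: List[FrameRangeAnswer]) -> Optional[str]:
    """Return the attribute value in effect at a frame, or None if unanswered."""
    for answer in answers:
        if answer.start <= frame <= answer.end:
            return answer.value
    return None


# One tracked person whose "posture" attribute changes partway through the video.
posture = [
    FrameRangeAnswer(0, 49, "walking"),
    FrameRangeAnswer(50, 120, "sitting"),
]
```

Storing values per frame range rather than per frame keeps the annotation compact while still giving frame-level temporal accuracy.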

Saving Ontology

While building your ontology, you can preview it as you go, including in JSON format. The preview helps you confirm that the structure reflects your intended categorization and that all necessary objects, classifications, and attributes are correctly defined. The JSON preview is also a convenient way to validate the ontology's structure and share it with team members or stakeholders.
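
A JSON view also makes simple structural checks easy to automate. The sketch below uses a hypothetical schema (the keys `objects`, `classifications`, `shape`, and so on are assumptions, not Encord's export format) to show a round trip through JSON and a basic validation pass.

```python
import json

# Hypothetical ontology structure; the key names are illustrative only.
ontology = {
    "objects": [
        {"name": "shirt", "shape": "polygon", "required": True},
        {"name": "shoes", "shape": "bounding_box", "required": False},
    ],
    "classifications": [
        {"name": "scene", "type": "radio", "options": ["indoor", "outdoor"]},
    ],
}

ALLOWED_SHAPES = {"bounding_box", "polygon", "keypoint", "bitmask"}


def structure_errors(onto: dict) -> list:
    """Return a list of human-readable problems found in the object definitions."""
    errors = []
    seen = set()
    for obj in onto.get("objects", []):
        name = obj.get("name")
        if not name:
            errors.append("object is missing a name")
        elif name in seen:
            errors.append(f"duplicate object name: {name}")
        else:
            seen.add(name)
        if obj.get("shape") not in ALLOWED_SHAPES:
            errors.append(f"{name}: unknown shape {obj.get('shape')!r}")
    return errors


preview = json.dumps(ontology, indent=2)  # text form, suitable for sharing
print(preview)
print(structure_errors(json.loads(preview)))  # prints []
```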

The saved ontology can now be found in the Ontologies section and can be attached to any number of annotation projects.

You can edit this saved ontology at a later time; however, it's important to note that any modifications made to the ontology will also result in changes to its structure across all attached annotation projects.

Watch the video on creating an ontology with Encord or read the documentation for more information.

Using Ontology in Annotation Projects

Now, you can apply the ontology you've created within an annotation project.

Create a New Annotation Project

In the left-side navigation pane, find your annotation project or create a new project. 

For this demonstration, let’s create a new project. You can add a description to enhance clarity and understanding for all users on the project.

Add Dataset

You now have the option to either incorporate an existing dataset or generate a new dataset as needed. For detailed guidance on creating a dataset, please refer to the documentation.

Add Ontology

Now it's time to attach the ontology you created earlier. If you have not created one yet, you can also create it at this step.

Set up Quality Assurance

In this section, you'll be presented with a choice between two quality assurance options: Manual QA and Automated QA.

In Manual QA, you can configure the percentage of annotations that require manual review.
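
As a rough illustration of what a review percentage means in practice, the sketch below draws a random sample of annotations for manual review. The source does not describe how Encord selects annotations for review, so the uniform random sampling, the rounding rule, and the function name are all assumptions.

```python
import math
import random
from typing import List, Sequence


def sample_for_review(annotation_ids: Sequence[int], percent: float,
                      seed: int = 0) -> List[int]:
    """Pick roughly `percent` percent of annotations uniformly at random for review."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Round up so that a small non-zero percentage still selects at least one item.
    count = math.ceil(len(annotation_ids) * percent / 100)
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    return sorted(rng.sample(list(annotation_ids), count))


to_review = sample_for_review(range(200), percent=10)
print(len(to_review))  # prints 20
```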

In the Automated QA mode, automated checks and validation processes are utilized by the system to evaluate annotation quality. The ontology acts as a reference and framework for these automated QA procedures, enabling validations that conform to the ontology's predefined structure. This integration not only simplifies the QA process but also guarantees consistency, precision, and adherence to project standards in annotations.
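
To show how an ontology can act as the reference for automated checks, here is a small sketch that validates a single annotation against an ontology definition. The source does not specify which checks Encord's Automated QA runs; the three checks below (known label, matching shape, required attributes present) and all names are illustrative assumptions.

```python
from typing import Dict, List

# Hypothetical ontology: each label maps to its expected shape and required attributes.
ONTOLOGY: Dict[str, Dict] = {
    "shirt": {"shape": "polygon", "required_attributes": ["sleeve length"]},
    "shoes": {"shape": "bounding_box", "required_attributes": []},
}


def check_annotation(annotation: Dict, ontology: Dict[str, Dict]) -> List[str]:
    """Return the problems found when comparing one annotation with the ontology."""
    label = annotation.get("label")
    spec = ontology.get(label)
    if spec is None:
        return [f"label {label!r} is not in the ontology"]
    problems = []
    if annotation.get("shape") != spec["shape"]:
        problems.append(f"expected shape {spec['shape']}, got {annotation.get('shape')}")
    for attr in spec["required_attributes"]:
        if attr not in annotation.get("attributes", {}):
            problems.append(f"missing required attribute: {attr}")
    return problems


good = {"label": "shirt", "shape": "polygon", "attributes": {"sleeve length": "long"}}
bad = {"label": "shirt", "shape": "bounding_box", "attributes": {}}
```

An annotation that returns an empty list conforms to the ontology; anything else can be routed back to an annotator with the listed problems.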

For this project, you can select manual QA. Keep in mind that, once the project is created, the QA mode cannot be switched!

Now Create Project!

Annotation Project

Here you can find a summary of your annotation project, the queued tasks, and who each task is assigned to.

Start labeling!

The highly detailed and well-structured ontology provides the capability to produce training data with a high degree of granularity. This attribute proves invaluable, especially in complex ML tasks, as it allows for the precise and detailed labeling of data. 

In such projects, where discerning subtle patterns and nuances is essential, the ability to create a comprehensive dataset, underpinned by ontology, becomes crucial. This detailed dataset serves as the bedrock for training ML models, equipping them to excel in tasks that demand an in-depth comprehension of intricate concepts. Consequently, this approach leads to more robust and insightful results in complex ML applications.

Use Cases of Ontologies in Real-World ML Projects

Now, let’s discuss real-world case studies where the integration of in-depth ontologies has not only elevated model performance but also enhanced interpretability and generalization.

Medical Diagnosis and Treatment Personalization: Memorial Sloan Kettering Cancer Center

Challenge

Medical diagnosis is a multifaceted challenge, often requiring the consideration of various symptoms, patient history, and medical literature. Developing personalized treatment plans based on these factors demands precise understanding.

Solution

By constructing an ontology capturing medical concepts, symptoms, diseases, and treatment options, an ML model can be trained to comprehend the intricate relationships between these elements. The ontology not only aids in data preprocessing but also guides feature engineering by providing contextual information.

Impact

The model's accuracy and interpretability both improve because it can draw on the relationships between symptoms and diseases captured in the ontology. This approach is supported by Encord's flexible ontologies, which, according to the case study, enabled:

  • Over 1000 protocol configurations.
  • A swift 10-minute setup time.
  • A 100% auditable process.
  • Seamless collaboration among team members.

Read the Memorial Sloan Kettering Cancer Center case study here.
 

Sports Analytics with Agile Ontology Creation: Sports Tech Startup

Challenge

In sports analysis, detecting key events depends on detecting various objects and their positions. Building robust detection models requires collecting diverse data.

Solution

The ML team used an annotation tool that allowed them to build new ontologies and annotate data at speed. This let the team experiment and add new features for efficient sports analysis.

Impact

The flexibility of Encord's ontology-building capabilities enabled rapid experimentation, innovation, and cost-effective iteration, leading to more adventurous development and enhanced sports analytics outcomes.

Read this case study on how a sports tech startup adopted Encord to improve its sports analysis: Rapid Annotation & Flexible Ontology for a Sports Tech Startup
 

Wrapping up

In conclusion, ontologies are essential to the success of annotation projects, ensuring structured data organization and consistent labeling. Encord lets users create complex, nested ontologies, with Dynamic Classification adding real-time adaptability to video annotations. Complex ontologies are invaluable in hard ML tasks, as evidenced by real-world applications in medical diagnosis and sports analytics, where they enhance accuracy and interpretability. A well-designed ontology paves the way for more insightful and robust results in complex machine learning applications.

Ready to elevate your machine learning annotation projects with Encord's powerful ontology-driven approach? Experience the difference for yourself – Try Encord now!

Written by Akruti Acharya
Akruti is a data scientist and technical content writer with a M.Sc. in Machine Learning & Artificial Intelligence from the University of Birmingham. She enjoys exploring new things and applying her technical and analytical skills to solve challenging problems and sharing her knowledge and...