How Harvard Medical School and MGH Cut Down Annotation Time and Model Errors with Encord

January 18, 2024


Manual annotation of Arterial Duplex Ultrasound (DUS) images was time-consuming and error-prone for annotators and radiologists, and it depended heavily on expertise and experience.


Key Results

With Encord, the researchers and radiologists created their first segmentation models by labeling only a handful of images. Using those segmentation models, they reduced annotation time by 10x. They used Encord Active to evaluate the performance of their segmentation models and determine how to improve them.


A new paper published in MDPI (Multidisciplinary Digital Publishing Institute) demonstrates how, using the Encord platform, researchers at Harvard Medical School, Massachusetts General Hospital, and Brigham and Women’s Hospital were able to reduce vascular ultrasound annotation time from days to minutes and run automated analyses of their datasets.

Using Encord, the team was able to:

  • Create their first segmentation models by labeling only a handful of images
  • Cut annotation time through segmentation models by an order of magnitude
  • Visually explore their dataset and identify problematic areas - in their case, the impact of blur on their dataset
  • Evaluate the performance of their segmentation models in the Encord platform 

Problem: DUS Image Annotation is Resource-Intensive and Prone to Human Error

Medical imaging, particularly Arterial Duplex Ultrasound (DUS), plays a crucial role in diagnosing and managing vascular diseases like Popliteal Artery Aneurysms (PAAs). The traditional method of analyzing DUS images relies heavily on manual annotation by skilled medical professionals.

This process is fraught with challenges:

  • Time-consuming—especially with the growing volume of medical imaging data.
  • Prone to human error.
  • Heavily dependent on expertise and experience, which makes the process even more resource-intensive.

The subjective nature of manual annotations can lead to inconsistent measurements and interpretations due to inter- and intra-observer variability during annotation. This raises concerns about the reliability and reproducibility of the results and could impact the accuracy of diagnoses and treatment plans for patients.

The central challenge addressed in the paper is precisely annotating the inner and outer lumens of the artery in images, a critical step for accurate measurement and subsequent treatment planning.

Solution: Encord Annotate to Auto-Label DUS Images and Encord Active to Validate Model Performance

The study tested the feasibility of using the Encord platform to create an automated model that segments the inner and outer lumen within PAA DUS images. Using image segmentation to find the largest diameter and thrombus area within PAAs helped standardize the DUS measurements that inform decisions about surgery.
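To make the measurement step concrete, here is a minimal sketch of how a largest diameter and a thrombus area could be derived from two segmentation polygons. It is not the method from the paper or an Encord API. It assumes the thrombus area can be approximated as the outer-lumen area minus the inner-lumen area, and the function names and coordinates are hypothetical.

```python
from itertools import combinations
from math import dist


def polygon_area(vertices):
    """Area of a simple polygon given as (x, y) vertices (shoelace formula)."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


def largest_diameter(vertices):
    """Largest distance between any two vertices of the polygon."""
    return max(dist(p, q) for p, q in combinations(vertices, 2))


# Hypothetical outlines in pixel coordinates (not data from the study).
outer_lumen = [(0, 0), (10, 0), (10, 8), (0, 8)]
inner_lumen = [(3, 2), (7, 2), (7, 6), (3, 6)]

diameter = largest_diameter(outer_lumen)
# Assumption: thrombus occupies the region between the two outlines.
thrombus_area = polygon_area(outer_lumen) - polygon_area(inner_lumen)
```

Converting these pixel values to millimetres would require the ultrasound image's pixel spacing, which is not covered here.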

Using Encord Annotate for Automated Annotation

The researchers collected a dataset of DUS images of PAAs and prepared it (deidentification and extraction) for upload to Encord. They then annotated a few images in Encord Annotate to serve as ground truth for the annotation models.

Using Encord Annotate’s automated labeling feature, they could generate segmentation masks for unlabeled images. This reduced the time and effort required for DUS image analysis while minimizing the potential for human error. 

Using Encord Active to Select the Best-Performing Model

They trained three models on sets of 20, 60, and 80 annotated images and validated them with Encord Active. The performance metrics reported by Encord Active helped the researchers select the best model for segmenting the inner and outer lumens of the popliteal artery with high precision.

After training models on image subsets, we tested them within the Encord platform. We selected the desired tests in the analysis tab of the project, and after a runtime period, the platform presented calculations of true positives, false negatives, mAP, IoU, and blur.

The report referenced Encord’s ability to seamlessly integrate into clinical processes with a user-friendly interface, simple onboarding, and rapid annotation workflows as crucial to the study's success. For healthcare practitioners who use the platform, this improves their diagnostic process without disrupting established procedures.


Encord Reduced Annotation Time from Days to Minutes

Where manual annotation could take several minutes per image, the researchers accomplished the task in a fraction of the time using Encord. Their workflow went from relying on RPVI-certified physicians to manually annotate DUS images, which took days, to using Encord to annotate a few images, train models, and auto-label the remaining images in minutes.

This efficiency proves crucial in clinical settings, where timely diagnosis and treatment decisions can significantly impact patient outcomes.

Figure 1. AI segmentation classifications on duplex ultrasound images. (A) Outer polygon true-positive classification, where the color green indicates a correct segmentation. (B) Outer polygon false-positive classification, where red indicates an incorrect segmentation. (C) Inner polygon true-positive classification, where the color green indicates a correct segmentation. (D) Inner polygon false-positive classification, where red indicates an incorrect segmentation.

Better Evaluation and Observability of Model Performance with Encord Active

The researchers quantitatively assessed the performance of the three models using Encord Active, which provided analytics on the following metrics:

  • Mean Average Precision (mAP)
  • Intersection over Union (IoU)
  • True Positive Rate (TPR)
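For readers less familiar with these metrics, the sketch below shows how IoU and TPR are commonly computed for segmentation masks. It does not reproduce Encord Active's implementation. The mask representation (sets of pixel coordinates), the 0.5 IoU threshold, and the sample values are assumptions for illustration. mAP is omitted because it additionally requires ranking predictions by confidence.

```python
def iou(pred, truth):
    """Intersection over Union of two masks given as sets of (row, col) pixels."""
    union = pred | truth
    if not union:
        return 0.0
    return len(pred & truth) / len(union)


def true_positive_rate(best_ious, threshold=0.5):
    """TP / (TP + FN), where each entry is the best IoU achieved for one
    ground-truth object (0.0 if the model produced no matching prediction)."""
    if not best_ious:
        return 0.0
    true_positives = sum(1 for value in best_ious if value >= threshold)
    return true_positives / len(best_ious)


# Hypothetical 4x4-pixel masks, shifted by one column relative to each other.
truth_mask = {(r, c) for r in range(4) for c in range(0, 4)}
pred_mask = {(r, c) for r in range(4) for c in range(1, 5)}

mask_iou = iou(pred_mask, truth_mask)            # 12 shared pixels / 20 total
tpr = true_positive_rate([0.6, 0.3, 0.9, 0.0])   # 2 of 4 objects matched
```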

Encord Active calculated the outer polygon's mAP to be 0.85 for the 20-image model, 0.06 for the 60-image model, and 0 for the 80-image model. The mAP of the inner polygon was 0.23 for the 20-image, 60-image, and 80-image models. The true-positive rate (TPR) for the inner polygon remained at 0.23. See the full results in the table below:

[Table: model evaluation results for the 20-, 60-, and 80-image models]

“With regard to the models for outer and inner polygons, the outer polygon model outperformed the inner polygon model on every metric. The outer polygon demonstrated almost equal precision and recall at 0.85. The mAP for the outer polygon model was 0.85 with a true-positive rate of 0.86, which is comparable to other clinically used high-performing models for US segmentation.”

With Encord Active automatically providing model evaluation analytics, the team instantly discovered the model's strengths and weaknesses. For every model they trained, Active provided breakdowns and graphs on its performance, including the ability to visually explore the regions the model incorrectly segmented vs. the ground truth.

Encord Active Uncovered Blurry DUS Images that Could Degrade Annotation Model Performance

The researchers used Encord Active to explore the model's performance depending on the blur level, allowing them to visually interact with varying levels of blur in their dataset to understand how this impacted model performance.

The paper states, “Intuitively, our analysis found that as the images became blurrier, the model precision declined, and false-negative rates increased... Removing blur from—or augmenting—blur in images can be important for training accurate AI models.”
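As an illustration of this kind of analysis, the sketch below scores image sharpness with the variance of a Laplacian filter, a common heuristic in which lower values indicate blurrier images, and then compares precision between blurry and sharp groups. This is an assumed stand-in and not necessarily the blur metric Encord Active uses. The cutoff, function names, and sample records are hypothetical.

```python
from statistics import pvariance


def laplacian_variance(image):
    """Variance of the 4-neighbour Laplacian over interior pixels of a
    grayscale image given as a list of rows; lower means blurrier."""
    responses = [
        image[r - 1][c] + image[r + 1][c] + image[r][c - 1] + image[r][c + 1]
        - 4 * image[r][c]
        for r in range(1, len(image) - 1)
        for c in range(1, len(image[0]) - 1)
    ]
    return pvariance(responses)


def precision_by_blur(records, cutoff):
    """records: (sharpness_score, is_true_positive) per prediction.
    Returns precision for predictions below and at/above the cutoff."""
    groups = {"blurry": [], "sharp": []}
    for score, is_tp in records:
        groups["blurry" if score < cutoff else "sharp"].append(is_tp)
    return {
        name: (sum(flags) / len(flags) if flags else None)
        for name, flags in groups.items()
    }


# Hypothetical 6x6 images: a high-contrast checkerboard and a smooth gradient.
sharp_image = [[255 if (r + c) % 2 == 0 else 0 for c in range(6)] for r in range(6)]
smooth_image = [[10 * c for c in range(6)] for r in range(6)]

# Hypothetical prediction records (not data from the study).
records = [(50, False), (80, True), (400, True), (900, True)]
precision = precision_by_blur(records, cutoff=100)
```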


In summary, the platform’s intuitive navigation, complemented by tutorials for both model training and analysis, allowed for straightforward operationalization of the model training system among members of the research team. The results were displayed in an understandable format and interpreted within the following discussion.

The findings have far-reaching consequences for medical imaging and diagnosis. The researchers greatly improved the accuracy, reliability, and efficiency of DUS image analysis by auto-annotating images with Encord Annotate and validating annotation models with Encord Active. This could lead to better patient care, treatment planning, and diagnostic procedures.

At Encord, we are committed to continually providing healthcare practitioners and physicians with the data-centric AI platform they need to improve their medical imaging and analysis workflows. 

We’re proud of the work the researchers were able to accomplish and how Encord is paving the way for broader applications of AI in various aspects of medical diagnostics.  

📑 Read the full paper on MDPI (Multidisciplinary Digital Publishing Institute).

Frequently asked questions
  • Yes. In addition to being able to train models & run inference using our platform, you can either import model predictions via our APIs & Python SDK, integrate your model in the Encord annotation interface if it is deployed via API, or upload your own model weights.

  • At Encord, we take our security commitments very seriously. When working with us and using our services, you can be sure that your data and your customers' data are safe and secure. You always own your labels, data, and models, and Encord never shares any of your data with any third party. Encord is hosted securely on the Google Cloud Platform (GCP). Encord integrates natively with private cloud buckets, ensuring that data never has to leave your own storage facility.

    Any data passing through the Encord platform is encrypted both in-transit using TLS and at rest.

    Encord is HIPAA and GDPR compliant and maintains SOC 2 Type II certification. Learn more about data security at Encord here.

  • Yes. If you believe you’ve discovered a bug in Encord’s security, please get in touch with our security team, which promptly investigates all reported issues. Learn more about data security at Encord here.

  • Yes - we offer managed on-demand premium labeling-as-a-service designed to meet your specific business objectives and offer our expert support to help you meet your goals. Our active learning platform and suite of tools are designed to automate the annotation process and maximise the ROI of each human input. The purpose of our software is to help you label less data.

  • The best way to spend less on labeling is using purpose-built annotation software, automation features, and active learning techniques. Encord's platform provides several automation techniques, including model-assisted labeling & auto-segmentation. High-complexity use cases have seen 60-80% reduction in labeling costs.

  • Encord offers three different support plans: standard, premium, and enterprise support. Note that custom service agreements and uptime SLAs require an enterprise support plan. Learn more about our support plans here.
