
How Surgical Data Science Collective (SDSC) Conducted Video Annotation 10x Faster

September 5, 2023
5 mins


The Surgical Data Science Collective (SDSC) faced challenges in efficiently annotating vast amounts of surgical video data, encountering issues with latency, annotation quality, and integration into their wider data pipeline. They needed a solution that could handle large datasets, ensure annotation precision, and seamlessly integrate with their existing workflows.


Key Result

With Encord's comprehensive Training Data Platform, SDSC achieved a significant 10x increase in annotation speed while moving towards a goal of 0% annotation errors, down from a previous rate of 20%. With Encord's support for video annotation and automated label error detection, SDSC streamlined their annotation processes, enhancing efficiency and annotation accuracy. This improvement enabled SDSC to confidently undertake projects involving the annotation of 100 hours of surgical procedures within four months, representing a substantial increase in productivity compared to their previous capabilities.

Surgical Data Science Collective (SDSC) is a data platform that provides surgeons with access to data and quantitative insights about procedures to expedite the training process and democratize access to safe surgery. Working with Encord, SDSC has increased the speed of annotations by 10x while simultaneously improving precision and accuracy.



SDSC is a non-profit organization dedicated to transforming surgery from an art to a science. With five core products, SDSC provides essential metrics on various procedures once videos are uploaded to the platform. For instance, the Kinematics model captures the movement of specific tools during a surgical procedure.

As Director of Machine Learning (ML), Margaux Masson-Forsythe is responsible for leading the ML roadmap at SDSC, defining the strategy of generating high-quality training data, managing the data pipeline, and overseeing the ML team.


A vast amount of video data, annotation that requires technical knowledge, and the need to connect their training data platform to a wider pipeline.

Before switching to Encord, the SDSC team faced three common problems: quantity of data, poor quality of annotations, and a lack of customizability and integrations.

Firstly, they faced a challenge in dealing with the vast amount of video data that required annotation. With each procedure split into 20 clips and each clip lasting approximately 15 minutes, the team had several terabytes of data to annotate. Their previous tool suffered from significant latency when rendering videos, which hindered the labelers’ ability to annotate effectively at speed.

Secondly, the team discovered that around 20% of the annotations they had previously produced were incorrect, most of them because the same objects had been named inconsistently. Using Encord Active’s automated label error detection feature, the team identified these errors, which they attributed to (i) the absence of a robust annotation toolkit and (ii) the high level of technical knowledge and expertise required to conduct and review annotations.

Lastly, the team had difficulty programmatically interacting with their training data platform and integrating it into their wider model production pipeline. They needed a working and usable Python SDK to create automated training data pipelines.


Leveraging Encord’s comprehensive Training Data Platform to conduct video annotations with state-of-the-art tools and unparalleled support.

After reviewing several solutions, the SDSC team chose to integrate Encord into their data pipelines. On the onboarding process, Margaux noted “Getting started with Encord and integrating it into our workflow was really fast. The thing that I find the most valuable is the flexibility of how we can integrate the Encord pipeline into our own pipeline, we use the Python SDK a lot”.

By natively rendering videos in the Encord platform, SDSC’s team was able to speed up annotation while increasing precision. Margaux praised the platform's support for video annotation, noting “How smooth [Encord Annotate] was and all of the different tools that come with labeling videos” and that “[Encord] definitely was the best platform we’ve seen and we were looking around different platforms”.

To solve their issues with incorporating expert review into their annotation workflows, the team used Encord Annotate's workflows to automate review by their labeling manager. Margaux explained that with Encord “We have a better reviewing system [...] that is the key component to having better quality datasets that we were missing before”. This allowed the annotators to get up to speed with complex annotations much more quickly, without requiring experts to conduct annotations themselves.

Margaux praised Encord's analytics capabilities, noting that “Now I have this whole system where I get analytics from Encord and we’re going to populate that into a dashboard so we can see how the annotation is going up”. She also appreciated how quick Encord was to incorporate Meta’s Segment Anything Model (SAM) into the platform, stating “One feature that made me go with Encord was the integration of SAM in the [Encord Annotate] platform which was done really quickly after the model was released so I knew when there was a new computer vision model released it will be integrated into the platform quite fast - which is something that was also a really good point”.


10x faster annotation whilst moving towards 0% annotation errors (previously 20%)

After integrating Encord into their wider data pipelines, SDSC was able to produce high-quality training data with quick annotations. With the help of Encord Active, the team identified that approximately 20% of the annotations completed on the previous tool were incorrect. The team is now “aiming to have 0% bad annotations” with the use of Encord’s platform.

Margaux discussed an upcoming project in which SDSC would annotate 100 hours of procedures (20 procedures at 5 hours per procedure) in four months, and she expressed confidence in their ability to complete the task with Encord, in conjunction with their wider Active Learning pipeline. According to Margaux, “... we know we can do that now with Encord because of the whole process that we have, which compared to what we had before, it would be maybe one procedure every two months even, much slower”. This represents a 10x increase in efficiency, as SDSC would previously have been able to annotate only around 10 hours (2 procedures) in the same time frame.
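The arithmetic behind the 10x figure can be checked with a short sketch. The variable names are illustrative only, and the previous pace is taken at face value from the quote ("maybe one procedure every two months"), so the result is an approximation rather than a measured benchmark. The clip figures come from the earlier description of the data (20 clips of roughly 15 minutes per procedure), which is consistent with the 5 hours per procedure quoted here.

```python
# Figures quoted in the case study (clip length is approximate).
clips_per_procedure = 20
minutes_per_clip = 15
hours_per_procedure = clips_per_procedure * minutes_per_clip / 60  # 5.0 hours

# Planned project: 20 procedures in four months.
planned_procedures = 20
project_months = 4
planned_hours = planned_procedures * hours_per_procedure  # 100.0 hours

# Previous pace, assuming roughly one procedure every two months.
months_per_procedure_before = 2
previous_procedures = project_months / months_per_procedure_before  # 2.0
previous_hours = previous_procedures * hours_per_procedure  # 10.0 hours

# Ratio of annotated hours over the same four-month window.
speedup = planned_hours / previous_hours  # 10.0

print(f"Planned: {planned_hours:.0f} h, previously: {previous_hours:.0f} h, "
      f"speedup: {speedup:.0f}x")
```

Because both scenarios cover the same four-month window, the ratio of annotated hours is also the ratio of annotation throughput.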

As SDSC continues to grow and increase model production, they will further scale their use of Encord Annotate in addition to building out more mature Active Learning pipelines using Encord Active, given their initial success with the automated label error detection feature.
