How Automotus increased mAP 20% by reducing their dataset size by 35% with visual data curation

Ulrik Stig Hansen
December 14, 2023

Introducing Automotus

Automotus is a leading technology startup building curbside management automation solutions. After raising a $9M seed round in 2022, the team has rapidly grown its reach and customer base and now works with cities, airports, and fleets all over the world to improve urban mobility: reducing congestion, emissions, and traffic hazards, optimizing curbside utilization, and monetizing various loading activities.

Their software enables customers to better understand the curb, simplify payments, and enforce relevant regulation. To do so, the Automotus team has built cutting-edge computer vision technology to capture real-time data from strategically-mounted street cameras. 

We sat down with Prajwal Kotamraju, Co-Founder and Head of Computer Vision at Automotus, to discuss their journey to where they are today, his work overseeing the product roadmap, and the exciting plans ahead for the business.

The early days of Automotus with Annotate

One of the team’s first priorities was managing infrastructure constraints, as requirements and availability varied heavily depending on where the cameras were placed. After some trial and error, the team had a network of cameras up and running and started converting the vast amount of de-identified images into labeled training data.

The model needed to identify, locate, and classify vehicles, analyze movement to understand traffic flows, and much more. Managing different fields of view, conditions, and labeling approaches was paramount.

After evaluating a few other tools, the Automotus team decided to partner with Encord.

Encord was a much better fit for us than the other tools we tried in 3 main areas: 

  1. The flexible ontology structure: It allowed us to easily classify and track objects, and also keep our ontology very straightforward as complexity increased. It also allows us to easily train both detection and classification models from a single project, which we couldn’t do as easily in any other tool.
  2. The quality control capabilities: The quality of labels is always one of the biggest obstacles, and tasks like identifying small objects in frames, which we had a lot of, are notoriously difficult. The human-in-the-loop feedback mechanisms in Encord, which allow us to semi-automatically train, manage, and improve annotation performance, were extremely helpful. 
  3. The automated labeling features: It’s great to be able to label a couple of frames and use the assisted labeling features to speed up the process and not have to label every frame manually.

On the topic of labeled data, Prajwal added: “For example, a shortcoming with other tools was the quality of the labels: we’d occasionally realize bounding boxes would be tighter or too wide around the objects they were identifying, or objects wouldn’t be classified correctly within frames. Now, we can select the sampling rate of frames that we want to move towards a review process, and share real-time context with annotators so that they can also power our model performance. This human-in-the-loop approach means we can use Encord to help our annotators perform annotations better, which in turn speeds up how quickly we can improve our model performance. We are able to localize objects better and increase accuracy.”
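To make the idea of a review sampling rate concrete, here is a minimal sketch of routing a fixed fraction of labeled frames to a review queue. It is not Encord's implementation or API: the function name `sample_for_review`, the `rate` and `seed` parameters, and the toy frame IDs are all assumptions made for illustration.

```python
import random


def sample_for_review(frame_ids, rate, seed=0):
    """Return a reproducible random subset of frame IDs to send to review.

    `rate` is the fraction of frames to review (0.0 to 1.0). All names here
    are hypothetical and not part of any Encord API.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0.0 and 1.0")
    rng = random.Random(seed)  # fixed seed so the same frames are picked each run
    k = round(len(frame_ids) * rate)  # number of frames to review
    return sorted(rng.sample(list(frame_ids), k))


labeled_frames = list(range(200))  # toy stand-in for labeled frame IDs
review_batch = sample_for_review(labeled_frames, rate=0.1)
print(len(review_batch))  # 20 frames go to reviewers
```

A higher rate catches more labeling errors at the cost of more reviewer time, so in practice the rate would likely be tuned per project.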

But most importantly, the high model accuracy enabled Automotus to better serve and grow with their customers – presenting more accurate data to their clients: “From the modality distribution that happens at the street-level, to more accurate representations of the dwell times [and a few other metrics that we supply to our clients] – these base models are the ones driving these analytics.”

The next phase of expansion with Active


Having set up a continuous, iterative annotation pipeline, the Automotus team turned their attention to the next question: out of all the data collected, how could they ensure they were labeling the right data? And how would they know which data that was?

Capturing de-identified images from hundreds of cameras produces large troves of data. But labeling all of it is expensive and doesn’t necessarily improve model performance, so the team sought to identify which data actually drove results.

“One of our bigger problem areas is that we’d be collecting a ton of data in each location. We wouldn't want all this data to be labeled because not all of the data we are collecting would improve the model performance. So understanding which data we have to curate is very critical, and that's where Encord Active came in. It helped us reduce the amounts of data we had to label to improve the model performance.”

Using Encord Active, the team was able to visually inspect, query, and sort their datasets to remove unwanted and poor quality data with just a few clicks, leading to a 35% reduction in the size of the datasets for annotation. In turn, this enabled the team to cut their labeling costs by over a third.
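As a rough illustration of this kind of curation, the sketch below filters images on per-image quality scores before they are sent for labeling, then works out the resulting reduction. It is not Encord Active's API: the `ImageRecord` fields, the threshold values, and the assumption that labeling cost scales with image count are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ImageRecord:
    image_id: str
    brightness: float  # hypothetical quality score in [0, 1]
    sharpness: float   # hypothetical quality score in [0, 1]


def curate(records, min_brightness=0.2, min_sharpness=0.3):
    """Keep only images that pass both (assumed) quality thresholds."""
    return [
        r for r in records
        if r.brightness >= min_brightness and r.sharpness >= min_sharpness
    ]


def reduction_fraction(n_before, n_after):
    """Fraction of the dataset removed by curation."""
    return 1 - n_after / n_before


# Toy data, not Automotus' dataset.
records = [
    ImageRecord("a", brightness=0.9, sharpness=0.8),  # kept
    ImageRecord("b", brightness=0.1, sharpness=0.9),  # too dark
    ImageRecord("c", brightness=0.5, sharpness=0.1),  # too blurry
]
kept = curate(records)
print([r.image_id for r in kept])  # ['a']

# If cost is proportional to image count (an assumption), keeping 65 of
# every 100 images is a 35% reduction, i.e. just over a third of the cost.
print(round(reduction_fraction(100, 65), 2))  # 0.35
```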

“We now have an integrated, one-stop solution where we can manage our data and also understand our model performance to create feedback mechanisms to improve data and models.”

💡 Automotus’ work with Encord Active led to a substantial improvement in model performance with their mAP improving by around 20%. "At the same time, our labeling costs will not scale linearly with the number of locations that we are operating in going forward, which is a major advantage."

With the improved models, the team was able to successfully extend pilot programs, expanding to more locations and improving the quality of the data they can present to clients.


.. & onwards!

Over the last 3 years, the Automotus team has built an industry-leading product that serves hundreds of municipalities, fleets, and airports all over the world. The team has been able to rapidly grow with their customers – increasing their availability geographically, and simplifying smart loading zones further – and it’s been incredible to see their journey to this day.

We’re so proud to work with Prajwal and the Automotus team and we look forward to all that’s ahead! 

