How to Improve the Accuracy of Computer Vision Models
Accuracy is crucial when training computer vision models. Accuracy rests on three core pillars:
- The quality, volume, and cleanliness of the image or video datasets that will be annotated, labeled, and used to train a computer vision model;
- The experimentation and training process used to train a computer vision (CV) or machine learning (ML) model;
- The workflow, annotation tools, automation features, dashboard, and quality control (QC) and quality assurance (QA) processes, which can have a huge positive impact on iterative training outcomes when building an algorithm-based model.
In this article, we bring together the most effective best practices for teams that train computer vision models and are tasked with improving their accuracy and performance, to get them from proof of concept (POC) to a working, production-ready model.
How to Source Datasets for Computer Vision Models
As we’ve covered in previous articles, there are numerous ways to source datasets for computer vision models. You can use your own data if you have it, or you can go out and find an open-source dataset that is ready to feed into a CV or ML-based model.
If you are looking for open-source image or video datasets, there’s a wealth of options across an extensive range of sectors and use cases in our article on where to find the best open-source datasets for machine learning models.
Open-source datasets for computer vision models aren’t difficult to source, and they’re free!
It’s more challenging to find proprietary datasets that you can buy or source cheaply, especially when large volumes of data are needed to train an artificial intelligence model.
It’s even more difficult in the medical and healthcare sector. Hospitals and medical providers do sell data, but it needs to meet data compliance requirements and be free from individual patient identifiers (or these need to be scrubbed from the images or videos during the cleaning process).
However, once you’ve got data you can use, there’s a detailed process to work through before those datasets can get anywhere near a production-ready computer vision model. The process involves:
- Data cleaning;
- Labeling and annotating the images or videos in the dataset (automation and having the right tools can accelerate this crucial part of the process);
- Experiments and training using the annotated datasets;
- Sufficient iterations on the datasets and during the model training process to put a POC model into production, to start generating the results and outcomes the project needs.
Before any of that can happen, the first part of the process is data cleaning: a thankless and labor-intensive task that every dataset needs to go through before annotation and labeling work can start. Even if you use an open-source dataset, a certain amount of cleaning is usually required.
Why is Data Cleaning Crucial for Machine Learning Experiments and Training?
Clean data is essential for successful computer vision and machine learning experiments, training, and models.
Unclean data is expensive, costing time and money. According to an IBM estimate cited in the Harvard Business Review (HBR), unclean and poor-quality data costs the US economy around $3.1 trillion a year.
Cleaning data contained within spreadsheets can cost tens of thousands of dollars. Cleaning image and video-based data costs even more, as the work is considerably more time-consuming, and getting it right the first time is essential if you want to produce an accurate computer vision model.
To avoid challenges further down the road, you need to clean your video or image data before using it to train your machine learning model. One way you can do this is by matching your dataset against a well-known open-source dataset that includes images of similar objects. When your data has been bought or sourced for a project, a certain amount of data cleaning is usually necessary. The trick is to automate this as much as possible to reduce the time and cost of data cleaning.
Cleaning images involves removing duplicate or corrupt files, and adjusting the brightness and pixelation of images. Medical images are more complex to clean, as their file formats (such as DICOM) contain numerous layers. With videos, you need to remove or repair corrupted files, duplicate frames, and ghost frames, normalize variable frame rates, and deal with other problems that are sometimes unknown and unexpected.
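As a rough illustration of the first cleaning step described above, here is a minimal Python sketch that sorts the files in a folder into files to keep, exact duplicates, and suspect files. It is not a complete cleaning pipeline: the function name `scan_images` and the signature table are our own illustrative choices, it only catches byte-for-byte duplicates, and it only checks the first few bytes of each file. A real pipeline would usually also decode each image with an imaging library and look for near-duplicates.

```python
import hashlib
from pathlib import Path

# Leading bytes ("magic numbers") for two common image formats.
SIGNATURES = {
    ".png": b"\x89PNG\r\n\x1a\n",
    ".jpg": b"\xff\xd8\xff",
    ".jpeg": b"\xff\xd8\xff",
}


def scan_images(folder):
    """Return (keep, duplicates, suspect) lists of paths found in `folder`.

    Hypothetical helper for illustration only.
    """
    seen = set()  # content hashes of files already kept
    keep, duplicates, suspect = [], [], []
    for path in sorted(Path(folder).iterdir()):
        signature = SIGNATURES.get(path.suffix.lower())
        if signature is None or not path.is_file():
            continue  # not a file type this sketch knows about
        data = path.read_bytes()
        if not data.startswith(signature):
            suspect.append(path)  # empty file or header does not match the extension
            continue
        digest = hashlib.sha256(data).hexdigest()
        if digest in seen:
            duplicates.append(path)  # byte-for-byte copy of an earlier file
        else:
            seen.add(digest)
            keep.append(path)
    return keep, duplicates, suspect
```

Calling `scan_images("raw_images")` (a hypothetical folder name) returns three lists that can be reviewed before anything is deleted, which is usually safer than removing files automatically.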
Once the images or videos are ready, and the annotation and labeling work has started, a quality control (QC) and quality assurance (QA) workflow is mission-critical to ensure the quality and accuracy of the labels before you can start training a computer vision model.
How to Improve Dataset Annotation and Labels for Greater Accuracy
In computer vision, dataset annotation and labeling are a critical part of the process. It’s often said that you can have the best algorithm in the world, but if your dataset lacks quality and volume, then your machine learning model will suffer.
When creating datasets for training and machine learning experiments, you need to ensure they’re diverse enough to reflect the full variety of objects the model will encounter, to reduce bias.
For example, if you want to create an annotation label for types of cars, don't just include pictures of Lamborghinis and Ferraris — you need images with numerous different and relevant makes, models, and colors so that your algorithm can learn how to identify cars accurately regardless of their color, make, or model.
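One simple way to spot the kind of imbalance in the car example is to count how often each label appears. The sketch below is a minimal illustration under our own assumptions: the function name `underrepresented`, the 10% threshold, and the class names are hypothetical, and the right threshold depends on the project.

```python
from collections import Counter


def underrepresented(labels, expected_classes, min_share=0.10):
    """Return the expected classes that make up less than `min_share` of the labels.

    Classes with no examples at all are reported too.
    """
    counts = Counter(labels)  # missing classes count as 0
    total = len(labels)
    return [
        name
        for name in expected_classes
        if total == 0 or counts[name] / total < min_share
    ]


# Hypothetical label list dominated by two sports car brands.
labels = ["ferrari"] * 45 + ["lamborghini"] * 45 + ["hatchback"] * 6 + ["pickup"] * 4
expected = ["ferrari", "lamborghini", "hatchback", "pickup", "suv"]
print(underrepresented(labels, expected))  # ['hatchback', 'pickup', 'suv']
```

A check like this only measures label counts. It says nothing about diversity within a class, such as color or camera angle, which still needs to be reviewed separately.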
Having the right tools for dataset annotation and labeling improves the accuracy, annotation process, and project outcomes. Tools such as Encord give data annotation teams the label and annotation formats they need and the ability to upload files in a native format, and give project leaders the overview and workflow features they require to ensure a project runs smoothly.
It’s especially useful in medical imaging or other specialist settings to have a tool that is built for and works well with native file formats, such as DICOM and NIfTI. At Encord, we have developed our medical imaging dataset annotation software in close collaboration with medical professionals and healthcare data scientists.
Labels and annotations need to be run through a quality control process before experiments and training can start. Otherwise, you risk putting poor-quality data into a model that will only generate poor-quality results.
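One common quality control check is to have two annotators label the same images and flag the ones where they disagree. The sketch below illustrates this with intersection over union (IoU) for bounding boxes. It is a simplified example, not how any particular tool implements QC: it assumes one box per image in `(x_min, y_min, x_max, y_max)` form, and the function names and the 0.8 threshold are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def flag_for_review(annotator_a, annotator_b, threshold=0.8):
    """Return image ids where the second annotator's box is missing or overlaps poorly."""
    flagged = []
    for image_id, box_a in annotator_a.items():
        box_b = annotator_b.get(image_id)
        if box_b is None or iou(box_a, box_b) < threshold:
            flagged.append(image_id)
    return flagged


# Hypothetical annotations keyed by image id.
first = {"img1": (0, 0, 10, 10), "img2": (0, 0, 10, 10), "img3": (0, 0, 10, 10)}
second = {"img1": (0, 0, 10, 10), "img2": (5, 0, 15, 10)}
print(flag_for_review(first, second))  # ['img2', 'img3']
```

Flagged images can then be sent back to a reviewer rather than straight into the training set.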
Next, you need to run experiments to train your computer vision model to improve performance and accuracy.
Why Do You Need to Run Experiments for Computer Vision Models?
Experiments are an integral part of creating and building working computer vision models. Experiments are used to:
- Improve performance: You will need to improve model performance by running experiments and analyzing their results.
- Improve the model: You can use an experiment to improve your model by gathering data about its behavior and changing it accordingly, making it more accurate, robust, or efficient at solving a particular problem.
- Improve the training dataset: By running an experiment on a range of labeled images with different classes (e.g., cats vs dogs), one could gather information about how well each annotation and label class works when given different datasets as training inputs. For example, you might need more images under different light conditions, showing daytime and nighttime images, and different breeds of cats and dogs.
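To make the last point concrete, one way to find out where more training images are needed is to break accuracy down by condition, such as daytime versus nighttime. The following is a minimal sketch under our own assumptions: the function name `accuracy_by_slice` and the record format are hypothetical, and the predictions are made-up values for illustration.

```python
from collections import defaultdict


def accuracy_by_slice(records):
    """Compute accuracy per slice.

    `records` is an iterable of (slice_name, true_label, predicted_label).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for slice_name, truth, predicted in records:
        totals[slice_name] += 1
        hits[slice_name] += int(truth == predicted)
    return {name: hits[name] / totals[name] for name in totals}


# Made-up evaluation records for a cats vs dogs classifier.
records = [
    ("day", "cat", "cat"),
    ("day", "dog", "dog"),
    ("day", "cat", "cat"),
    ("day", "dog", "cat"),
    ("night", "cat", "dog"),
    ("night", "dog", "dog"),
    ("night", "cat", "dog"),
    ("night", "dog", "cat"),
]
print(accuracy_by_slice(records))  # {'day': 0.75, 'night': 0.25}
```

In this made-up example the model does much worse at night, which would suggest collecting and labeling more nighttime images before the next training run.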
How to Train Your Model to Increase Performance and Accuracy
The next step is to train your model and assess its performance. When you’re training a model, it will learn from the data you feed into it.
Failure is an inevitable and necessary part of the training process for machine learning and computer vision models. To start with, expect an accuracy rating of around 70%. Don’t worry. The aim is to keep iterating, increasing the volume of data and improving the labels and annotations within the images and videos, until the accuracy rating reaches 90%+. It will happen. It takes time, but your ML or data ops team will get there.
You can also use a benchmarking dataset for evaluation purposes. After training your model, you run it against a benchmark dataset to see how its accuracy and false positive rate compare with what was expected.
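The two benchmark metrics mentioned above can be computed directly from the true and predicted labels. The sketch below assumes a binary task where `1` is the positive class. The function name and the example labels are hypothetical and only meant to show the arithmetic.

```python
def accuracy_and_fpr(y_true, y_pred, positive=1):
    """Return (accuracy, false positive rate) for a binary classification task."""
    tp = fp = tn = fn = 0
    for truth, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # False positive rate: share of actual negatives wrongly predicted as positive.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr


# Made-up benchmark labels and model predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
accuracy, fpr = accuracy_and_fpr(y_true, y_pred)
print(f"accuracy={accuracy:.2f}, false positive rate={fpr:.2f}")
```

Here 8 of the 10 predictions are correct and 1 of the 6 negatives is wrongly flagged, so the model would still be short of a 90% accuracy target.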
Do You Need to Create Artificial Images or Video Content?
Artificially generated content can help test the algorithm because it allows you to see how well it performs when presented with situations or scenarios for which there are no (or not enough) real-world examples available to learn from.
For example, you might not have enough images or videos of car crashes, and yet that’s what you need for your ML model. What can you do?
You can source artificially-generated content in several ways. It’s especially useful when the volume of images or videos for a particular use case won’t be enough to accurately train a computer vision model.
Computer-generated imagery (CGI), 3D game engines such as Unity and Unreal, and generative adversarial networks (GANs) are ideal sources for creating images or videos of high enough quality to train a CV model. Quality and quantity are both important factors, hence the need to use artificial or synthetic images and videos to train computer vision models.
Now let’s take a closer look at how to improve computer vision model experiment workflows.
How to Improve Computer Vision Model Experiment Workflows
Improving the accuracy of your computer vision model is not only about understanding what works, but also about improving the process of experimenting with different machine learning models and parameters. The best way to do this is by using tools that allow you to quickly try out new ideas and test them on a dataset.
With tools such as Encord and Encord Active, you can quickly improve the quality of labels and annotations, and the associated workflow and quality process management. Using a dashboard, data ops managers can oversee annotation and training workflows more effectively, ask for more accurately labeled datasets, introduce data augmentation, and reduce bias.
Now it’s simply a case of training and re-training the model until the desired results are being achieved consistently, and then you can put a working model into production to solve the problem that needs solving.