
The 10 Computer Vision Quality Assurance Metrics Your Team Should be Tracking

June 12, 2023
4 mins

Building a computer vision monitoring solution requires careful attention to detail and a robust quality assurance (QA) process. As computer vision models continue to advance, it becomes increasingly crucial to track key metrics that provide insights into the performance and effectiveness of your algorithms, datasets, and labels.

Quality metrics not only help you identify potential issues but also enable you to make data-driven decisions to improve your computer vision algorithms and ensure optimal functionality.

In this blog post, we will explore:

  • Image Width, Height, Ratio & Area Distribution
  • Robustness to Adversarial Attacks
  • AE Outlier Score
  • KS Drift
  • Motion Blur
  • Optical Distortion
  • Limited Dynamic Range
  • Color Constancy Errors
  • Tone Mapping
  • Noise Level

10 Computer Vision Quality Assurance Metrics

The following metrics are essential to monitor when developing a computer vision algorithm; keeping a close eye on them helps you maintain high-quality results.

Image Width, Height, Ratio & Area Distribution

The image dimensions are an important quality metric for computer vision models. By analyzing the image dimensions, we can gain insights into the characteristics of the image dataset. 

The metric helps assess the consistency of image sizes within the dataset. Consistent image dimensions can be advantageous for computer vision models as they allow for uniform processing and facilitate model training.

Aspect ratio (the ratio of image width to height) analysis provides information about the shape or orientation of objects in the images. Understanding the aspect ratio distribution can benefit tasks such as object detection or object recognition, where object proportions play a significant role.

Aspect ratio is used as a data quality metric on the Encord Active platform.

These pieces of information can guide preprocessing choices, highlight potential data issues, and inform the design and training of computer vision models to handle images of varying sizes and aspect ratios effectively.
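As a quick illustration, here is a minimal NumPy sketch of how these distributions might be summarized for a dataset of (width, height) pairs. The `dimension_stats` helper is hypothetical, written for this post rather than taken from any library:

```python
import numpy as np

def dimension_stats(sizes):
    """Summarize width, height, aspect-ratio, and area distributions
    for a dataset given as (width, height) pairs."""
    arr = np.asarray(sizes, dtype=float)
    w, h = arr[:, 0], arr[:, 1]
    ratio, area = w / h, w * h
    # (min, mean, max) per quantity; in practice you would also
    # plot full histograms to spot multi-modal size distributions
    return {
        "width":  (w.min(), w.mean(), w.max()),
        "height": (h.min(), h.mean(), h.max()),
        "ratio":  (ratio.min(), ratio.mean(), ratio.max()),
        "area":   (area.min(), area.mean(), area.max()),
    }

stats = dimension_stats([(640, 480), (1280, 720), (640, 640)])
```

A wide spread between the min and max values here is an early warning that resizing or aspect-ratio-preserving padding will be needed before training.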

Robustness to Adversarial Attacks

Adversarial attacks involve intentionally manipulating or perturbing input data to deceive computer vision models. Robustness to such attacks is crucial to ensure the reliability and security of computer vision algorithms. Evaluating the robustness of a model against adversarial attacks can be done by measuring its susceptibility to perturbations or by assessing its ability to classify or detect adversarial examples correctly.

Adversarial examples generated for AlexNet. The left column indicates the sample image, the middle is the perturbation, and the right is the adversarial example. All images in the right column are predicted as an “ostrich.”


Metrics such as adversarial accuracy, fooling rate, or robustness against specific attack methods can be used to quantify the computer vision models’ resistance to adversarial manipulations. Enhancing the robustness of the models helps protect them against potential malicious attacks, improving their overall quality and trustworthiness.
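The fooling rate mentioned above can be computed directly from model predictions before and after an attack. This is a minimal sketch assuming you already have clean and adversarial predictions as label arrays; the `fooling_rate` helper is illustrative, not a library function:

```python
import numpy as np

def fooling_rate(clean_preds, adv_preds, labels):
    """Fraction of originally-correct samples whose prediction
    flips to wrong after adversarial perturbation."""
    clean_preds = np.asarray(clean_preds)
    adv_preds = np.asarray(adv_preds)
    labels = np.asarray(labels)
    correct = clean_preds == labels            # right before the attack
    flipped = correct & (adv_preds != labels)  # ...and wrong after it
    return flipped.sum() / max(correct.sum(), 1)

# toy example: 4 samples, all initially correct, 2 flipped by the attack
rate = fooling_rate([0, 1, 2, 1], [0, 2, 0, 1], [0, 1, 2, 1])
```

Adversarial accuracy is simply `(adv_preds == labels).mean()` on the same arrays, so both metrics can be tracked from one evaluation pass.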

AE Outlier Score

The AE (Autoencoder) Outlier score is a valuable quality metric that helps evaluate the performance and robustness of computer vision models. Specifically, it measures the discrepancy or abnormality between the original input data and the reconstructed output generated by an autoencoder-based anomaly detection algorithm. 

Encord Active’s platform shows metrics that contain outliers in the Caltech Dataset.

In anomaly detection tasks, the AE outlier score quantifies the extent of dissimilarity between an input sample and the expected distribution of normal data. A higher outlier score indicates a larger deviation from the norm, suggesting the presence of an anomaly or outlier.

By monitoring the AE outlier score, several benefits can be achieved. Firstly, it enables the identification of abnormal patterns or irregularities in the input or output of the computer vision model. This helps in detecting errors or outliers that may affect the model’s performance or output accuracy.

The AE outlier score also helps in determining appropriate threshold values for distinguishing normal and abnormal data. By analyzing the distribution of outlier scores, suitable thresholds can be set, striking a balance between false positives and false negatives, thereby optimizing the anomaly detection performance.
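In the simplest form, the AE outlier score is just the per-sample reconstruction error, with a threshold taken from the score distribution. A minimal NumPy sketch (assuming you already have the autoencoder's reconstructions; the helper names are hypothetical):

```python
import numpy as np

def ae_outlier_scores(x, x_recon):
    """Per-sample reconstruction error (MSE) between inputs
    and their autoencoder reconstructions."""
    x = np.asarray(x, dtype=float)
    x_recon = np.asarray(x_recon, dtype=float)
    return np.mean((x - x_recon) ** 2, axis=1)

def outlier_threshold(scores, pct=95):
    """One common heuristic: flag anything above the
    pct-th percentile of the observed score distribution."""
    return np.percentile(scores, pct)

x       = np.array([[0.0, 0.0], [1.0, 1.0], [0.1, 0.1]])
x_recon = np.array([[0.0, 0.1], [0.0, 0.0], [0.1, 0.1]])
scores = ae_outlier_scores(x, x_recon)  # sample 1 reconstructs worst
```

In practice the threshold is tuned on validation data to balance false positives against false negatives, as described above.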

KS Drift

The Kolmogorov-Smirnov (KS) drift measures the drift or deviation in the distribution of predicted probabilities or scores between different time points or data subsets using the KS test.

By monitoring the KS drift, computer vision engineers can identify potential issues such as concept drift, dataset biases, or model degradation. Concept drift refers to the change in the underlying data distribution over time, which can lead to a decline in model performance. Dataset biases, on the other hand, can arise from changes in the data collection process or sources, impacting the model's ability to generalize to new samples.

💡 Read the paper that inspired the KS Drift metric for more insight.

The KS drift metric allows teams to detect and quantify these changes by comparing the distribution of model outputs across different time periods or data subsets. An increase in the KS drift score indicates a significant deviation in the model’s predictions, suggesting the need for further investigation or model retraining. 
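The KS statistic itself is just the largest gap between the empirical CDFs of two score samples. A self-contained NumPy sketch (in practice you would likely use `scipy.stats.ks_2samp`, which also returns a p-value):

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum gap
    between the empirical CDFs of two score samples."""
    a, b = np.sort(np.asarray(a, float)), np.sort(np.asarray(b, float))
    grid = np.concatenate([a, b])  # the max gap occurs at a sample point
    cdf_a = np.searchsorted(a, grid, side="right") / len(a)
    cdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return np.max(np.abs(cdf_a - cdf_b))

baseline = np.array([0.1, 0.2, 0.3, 0.4])  # e.g. last month's scores
current  = np.array([0.6, 0.7, 0.8, 0.9])  # this month's scores
drift = ks_statistic(baseline, current)    # 1.0 = fully separated
```

A score near 0 means the two distributions overlap almost completely; a score near 1 means they barely overlap at all, a strong signal of drift.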

Motion Blur

Long-exposure photos of the night sky exhibit motion blur caused by the Earth's rotation. This continuous rotation throughout the day results in star trails being captured.


Motion blur is a crucial quality metric for assessing the performance of computer vision models, particularly in tasks involving object recognition and motion analysis. It measures the level of blurring caused by relative motion between the camera and the scene during image or video capture.

Motion blur can negatively impact model accuracy and reliability by introducing distortions and hindering object recognition. Monitoring and mitigating motion blur are essential to ensure high-quality results.

Effect of motion blur on hidden layers of CNN.


By analyzing motion blur as a quality metric, techniques can be employed to detect and address blurry images or frames. Blur detection algorithms can filter out affected samples during model training or inference. Motion deblurring techniques can also be applied to reduce or remove the effects of motion blur, enhancing the sharpness and quality of input data.

Considering motion blur as a quality metric allows for improved model performance and robustness in real-world scenarios with camera or scene motion, leading to more accurate and reliable computer vision systems.
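One common blur-detection heuristic is the variance of the Laplacian: blurry images have little high-frequency content, so the Laplacian response is flat. This is a dependency-free NumPy sketch of that idea (OpenCV users would typically use `cv2.Laplacian(img, cv2.CV_64F).var()` instead); the box blur below is a crude stand-in for real motion blur:

```python
import numpy as np

def laplacian_variance(img):
    """Variance of the discrete 4-neighbor Laplacian of a 2-D
    grayscale array. Low values suggest a blurry image."""
    img = np.asarray(img, dtype=float)
    lap = (np.roll(img, 1, 0) + np.roll(img, -1, 0)
           + np.roll(img, 1, 1) + np.roll(img, -1, 1) - 4 * img)
    return lap[1:-1, 1:-1].var()  # drop the wrap-around border

rng = np.random.default_rng(0)
sharp = rng.random((32, 32))
# average each pixel with its two horizontal neighbors: a crude
# approximation of horizontal motion blur
blurred = (sharp + np.roll(sharp, 1, 1) + np.roll(sharp, 2, 1)) / 3
```

There is no universal threshold; a cutoff is usually calibrated per dataset by inspecting samples near the decision boundary.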

Scale your annotation workflows and power your model performance with data-driven insights

Optical Distortion

Optical distortion refers to aberrations or distortions introduced by optical systems, such as lenses or cameras, that can impact the accuracy and reliability of computer vision algorithms.

Various types of optical distortion, such as barrel distortion, pincushion distortion, or chromatic aberrations, can occur, leading to image warping, stretching, color fringing, and other artifacts that impact the interpretation of objects. 

Effect of optical distortion on hidden layers of CNN.


By considering optical distortion as a quality metric, computer vision practitioners can address and mitigate its effects. Calibration techniques can be employed to correct lens distortions and ensure accurate image measurements and object localization. Understanding and quantifying optical distortion allows for appropriate preprocessing strategies, such as image rectification or undistortion, to compensate for the distortions and improve the accuracy of computer vision models.
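To make the idea concrete, here is a sketch of the single-coefficient radial term from the Brown-Conrady model, which underlies most calibration tools (e.g. OpenCV's `cv2.undistort`, which additionally handles tangential terms and the camera matrix). Note that sign conventions for barrel vs. pincushion vary between libraries; the helper here is illustrative only:

```python
import numpy as np

def apply_radial_distortion(xy, k1):
    """Simplified radial distortion model: x_d = x * (1 + k1 * r^2),
    applied to points in normalized coordinates centered on the
    principal point. The principal point itself is unmoved."""
    xy = np.asarray(xy, dtype=float)
    r2 = np.sum(xy ** 2, axis=-1, keepdims=True)
    return xy * (1.0 + k1 * r2)

pts = np.array([[0.0, 0.0], [0.5, 0.5]])
distorted = apply_radial_distortion(pts, k1=0.2)
# the off-center point moves outward; the center stays fixed
```

Calibration inverts this mapping: the coefficients are estimated from a known pattern (such as a checkerboard), then applied in reverse to rectify images.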

Limited Dynamic Range

Dynamic range refers to the ratio between the brightest and darkest regions of an image. A higher dynamic range allows for capturing a wider range of luminance values, resulting in a more detailed and accurate representation of the scene. Conversely, a lack of dynamic range implies that the dataset has limitations in capturing or reproducing the full range of luminance values.

Left: LDR image. Right: reconstructed HDR image.


Working with images that have a limited dynamic range can pose challenges for computer vision models. Some potential issues include loss of detail in shadows or highlights, decreased accuracy in object detection or recognition, and reduced performance in tasks that rely on fine-grained visual information. 
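A simple way to monitor this in practice is to track the used luminance range and the fraction of crushed or clipped pixels per image. A minimal sketch for 8-bit images (the helper and its `low`/`high` cutoffs are illustrative choices, not a standard):

```python
import numpy as np

def dynamic_range_report(img, low=5, high=250):
    """Report the used luminance range of an 8-bit grayscale image
    and the fraction of pixels crushed into shadows or clipped
    into highlights."""
    img = np.asarray(img)
    return {
        "min": int(img.min()),
        "max": int(img.max()),
        "crushed_shadows": float(np.mean(img <= low)),
        "clipped_highlights": float(np.mean(img >= high)),
    }

# a flat gray image: nothing clipped, but zero usable dynamic range
flat = np.full((4, 4), 128, dtype=np.uint8)
report = dynamic_range_report(flat)
```

Images with a tiny min-to-max span, or a large clipped fraction, are good candidates for exclusion or for contrast-enhancing preprocessing.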

Color Constancy Errors

Color constancy errors refer to failures to perceive and reproduce colors consistently under varying lighting conditions. Color constancy is a vital aspect of computer vision algorithms, as they aim to mimic human vision by interpreting and understanding scenes in a manner that remains stable despite changes in illumination.

The effect of correct/incorrect computational color constancy (i.e., white balance) on (top) classification results by ResNet; and (bottom) semantic segmentation by RefineNet.

Color constancy errors can arise from various factors, including lighting conditions, camera calibration, color space conversions, or color reproduction inaccuracies. These errors can lead to incorrect color interpretations, inaccurate segmentation, or a distorted visual appearance.

By considering color constancy errors as a quality metric, computer vision practitioners can evaluate the accuracy and reliability of color-related tasks. Techniques such as color calibration, color correction, or color normalization can be employed to reduce these errors and improve the model's performance.
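The classic baseline for computational color constancy is the gray-world assumption: on average, a scene is achromatic, so each channel is rescaled until the per-channel means agree. A minimal NumPy sketch (the `gray_world` helper is written for this post; real white-balance pipelines are considerably more sophisticated):

```python
import numpy as np

def gray_world(img):
    """Gray-world white balance: scale each RGB channel so its
    mean matches the global mean intensity. img is HxWx3 float."""
    img = np.asarray(img, dtype=float)
    means = img.reshape(-1, 3).mean(axis=0)
    gain = means.mean() / means  # per-channel correction factor
    return img * gain

# a synthetic greenish cast: G channel twice as bright on average
cast = np.ones((2, 2, 3)) * np.array([0.2, 0.4, 0.2])
balanced = gray_world(cast)  # per-channel means now equal
```

Comparing channel means before and after such a correction is one cheap way to flag images with a strong illuminant cast in a QA pipeline.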


Tone Mapping

Tone mapping is a quality metric that plays a crucial role in assessing the performance of computer vision models, particularly in tasks related to high-dynamic-range (HDR) imaging or display. It measures the accuracy and effectiveness of converting HDR images into a suitable format for display on standard dynamic range (SDR) devices or for human perception.

Tone mapping is necessary because HDR images capture a wider range of luminance values than what can be displayed or perceived on SDR devices. Effective tone mapping algorithms preserve important details, color fidelity, and overall visual quality while compressing the dynamic range to fit within the limitations of SDR displays.

Tone-mapped LDR images.


By considering tone mapping as a quality metric, computer vision practitioners can evaluate the model's ability to handle HDR content and ensure that the tone-mapped results are visually pleasing and faithful to the original HDR scene. Evaluation criteria may include preservation of highlight and shadow details, natural-looking color reproduction, and avoidance of artifacts like halos, posterization, or noise amplification.

Noise Level

Noise refers to unwanted variations or random fluctuations in pixel values within an image. High noise levels can degrade the quality of computer vision algorithms, affecting their ability to detect and analyze objects accurately. Monitoring the noise level metric helps ensure that the images processed by your computer vision model have an acceptable signal-to-noise ratio, reducing the impact of noise on subsequent image analysis tasks.

The displayed image represents the output of the soft-max unit, reflecting the network's confidence in the correct class. As image quality deteriorates, the network's confidence in the considered class diminishes across all networks and distortions.


Monitoring and managing noise levels is especially important in low-light conditions, where noise tends to be more pronounced. Techniques such as denoising algorithms or adaptive noise reduction filters can be employed to mitigate the effects of noise and enhance the quality of the processed images.
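One practical way to track this metric without a clean reference image is to estimate the noise standard deviation from pixel differences, using a robust median-based statistic so that true edges do not inflate the estimate. A minimal NumPy sketch (the helper is illustrative; production systems often use wavelet-based estimators instead):

```python
import numpy as np

def estimate_noise_sigma(img):
    """Rough blind noise estimate for a 2-D grayscale image.
    For i.i.d. Gaussian noise, horizontal pixel differences have
    std sigma * sqrt(2); the MAD (x1.4826) gives a robust std
    estimate that is not skewed by image edges."""
    img = np.asarray(img, dtype=float)
    d = np.diff(img, axis=1).ravel()
    mad = np.median(np.abs(d - np.median(d)))
    return 1.4826 * mad / np.sqrt(2.0)

rng = np.random.default_rng(1)
clean = np.zeros((64, 64))
noisy = clean + rng.normal(0, 5.0, clean.shape)
sigma_hat = estimate_noise_sigma(noisy)  # should be near 5.0
```

Tracking this estimate per image makes it easy to route excessively noisy samples to a denoising step, or to exclude them from training altogether.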


How to Improve Quality Assurance Outputs with Encord

Analyze Data and Label Quality on Encord Active

Encord Active is a comprehensive tool that enables the computation, storage, inspection, manipulation, and utilization of quality metrics for your computer vision dataset. It provides a library of pre-existing quality metrics and, notably, offers customization options, allowing you to create your own metrics for calculating and computing quality measurements across your dataset.

AI-Assisted Labeling

Manual labeling is prone to errors and is time-consuming. To accelerate the labeling process, AI-assisted labeling tools, such as automation workflows, are indispensable. These tools save time, increase efficiency, and enhance the quality and consistency of labeled datasets.

Improved Annotator Management

Effective annotator management throughout the project reduces errors in the data pipeline. Project leaders should have continuous visibility into annotators' performance, quality of annotations, and progress. AI-assisted data labeling tools, equipped with project dashboards, facilitate better management and enable real-time adjustments.

Complex Ontological Structures for Labels

When labeling data for computer vision models, complex ontological structures can improve accuracy by capturing the relationships between objects in images and videos. Overly simplified label structures often fail to express these relationships, so richer ontologies tend to yield better results.

💡Read the blog How to Improve the quality of your labeled dataset for more insight.

Quality Assurance Metrics Conclusion

In conclusion, tracking key computer vision quality metrics is crucial for building a robust monitoring solution. By monitoring image characteristics, robustness, error scores, drift, and various image quality aspects, you can improve algorithm performance and ensure accurate results. Stay informed about image dimensions, handle adversarial attacks, and address issues such as motion blur, optical distortion, and color constancy errors. With data-driven insights, you can make informed decisions to optimize your computer vision system for reliable and accurate performance.

Ready to automate and improve the quality, speed, and accuracy of your computer vision metrics? 

Sign up for an Encord Free Trial: The Active Learning Platform for Computer Vision, used by the world’s leading computer vision teams. 

AI-assisted labeling, model training & diagnostics, and tools to find & fix dataset errors and biases, all in one collaborative active learning platform that gets you to production AI faster. Try Encord for Free Today. 


Written by

Akruti Acharya
