Mean Average Precision (mAP)

Encord Computer Vision Glossary

Mean Average Precision (mAP) is a widely used performance metric in object detection tasks in machine learning. It measures the accuracy of object detection models by considering both precision and recall at different confidence thresholds.

To calculate mAP, the average precision (AP) is first calculated for each class. AP is the area under the precision-recall curve for a given class. The AP for each class is then averaged to obtain the mAP score, which represents the overall performance of the object detection model.

The AP summation over the precision-recall curve can be written as:

AP = Σₙ (Rₙ − Rₙ₋₁) · Pₙ

where Pₙ and Rₙ are the precision and recall at the nth threshold.
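This calculation can be sketched in a few lines of Python. The precision and recall values below are hypothetical, toy per-threshold numbers (a real evaluation would derive them from ranked detections), and the function names are illustrative, not from any particular library:

```python
def average_precision(precisions, recalls):
    """AP approximated by the summation sum_n (R_n - R_{n-1}) * P_n,
    with points ordered by increasing recall."""
    ap = 0.0
    prev_recall = 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap

def mean_average_precision(per_class_pr):
    """mAP: the mean of per-class AP values.
    per_class_pr maps class name -> (precisions, recalls)."""
    aps = [average_precision(p, r) for p, r in per_class_pr.values()]
    return sum(aps) / len(aps)

# Toy example: two classes, each with two (precision, recall) points
pr = {
    "car":    ([1.0, 0.5], [0.5, 1.0]),   # AP = 0.5*1.0 + 0.5*0.5 = 0.75
    "person": ([1.0, 1.0], [0.5, 1.0]),   # AP = 0.5*1.0 + 0.5*1.0 = 1.0
}
print(mean_average_precision(pr))  # → 0.875
```

Averaging per-class AP (rather than pooling all detections) keeps rare classes from being drowned out by frequent ones.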


mAP is a critical performance metric for object detection models because it evaluates accuracy across a range of confidence thresholds, offering a more robust assessment of model performance than accuracy or precision alone. It is widely used in machine learning applications such as autonomous driving, face recognition, and image segmentation.

What is the significance of mAP for model comparison?

The significance of mean average precision (mAP) for model comparison lies in its ability to provide a fair and objective evaluation metric for object detection models. By considering both precision and recall, mAP offers a comprehensive assessment of a model's performance in detecting objects accurately.

When comparing object detection models, it is crucial to have a metric that captures the overall performance, rather than relying solely on individual metrics like accuracy or precision. mAP provides a single numerical value that represents the average precision across different confidence thresholds, taking into account the precision-recall trade-off.

Using mAP for model comparison ensures a standardized evaluation approach, allowing researchers and practitioners to rank and compare models objectively. It helps in identifying the most effective and robust models for specific object detection tasks, aiding in the decision-making process for model selection or deployment.

Variations in mAP

There are variations of mean average precision (mAP) that are used in different contexts or with specific requirements. Some common variations include:

  • mAP@[IoU threshold]: This variation considers the Intersection over Union (IoU) between predicted and ground-truth bounding boxes. By setting different IoU thresholds (e.g., 0.5, 0.75), mAP@[IoU threshold] measures the accuracy of object detection at varying levels of overlap between predicted and ground-truth boxes.
  • Weighted mAP: In cases where certain classes are more important or have different levels of significance, a weighted mAP may be used. This variation assigns different weights to individual classes, reflecting their relative importance, and computes an overall weighted mAP.
  • Area-specific mAP: This variation focuses on evaluating object detection performance for specific regions of interest or areas within an image. It allows for assessing the model's accuracy and robustness in detecting objects in particular areas of importance.
  • Localization mAP: In addition to evaluating object detection, this variation specifically assesses the model's ability to accurately localize objects by considering the precision and recall of bounding box predictions.
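For the mAP@[IoU threshold] variation above, the key ingredient is the Intersection over Union between a predicted and a ground-truth box. A minimal sketch, assuming axis-aligned boxes in (x1, y1, x2, y2) form (the box values below are made up for illustration):

```python
def iou(box_a, box_b):
    """Intersection over Union for axis-aligned boxes (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# At mAP@0.5, a prediction counts as a true positive only if its IoU
# with a ground-truth box is at least 0.5.
pred = (0, 0, 10, 10)
gt = (5, 0, 15, 10)
print(iou(pred, gt))  # overlap 5x10 = 50, union 150 → ~0.333: below 0.5
```

Raising the threshold (e.g., 0.75) demands tighter localization, so mAP@0.75 is typically lower than mAP@0.5 for the same model.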


These variations provide flexibility in evaluating and comparing object detection models based on specific requirements or areas of interest. Researchers and practitioners can choose the appropriate variation depending on the objectives and context of their specific tasks.

In summary,

  • Mean Average Precision (mAP) is a performance metric commonly used in object detection tasks in machine learning.
  • It measures accuracy by considering both precision and recall at different confidence thresholds.