Understanding mAP — Object Detection Scoring

1

Your model draws boxes around objects

What we're actually measuring

An object detection model looks at an image and does two things: it finds objects and draws bounding boxes around them. Each box comes with a confidence score.

✓ Good predictions

Boxes align with real objects

✗ Bad predictions

Sloppy box + hallucinated object


mAP is a single score from 0 to 1 that captures both skills at once: finding the right objects AND drawing tight boxes. Higher = better. 1.0 = perfect.

2

Judging a single box: IoU

Intersection over Union — the overlap score

For each predicted box, we measure how much it overlaps with the real box. This overlap ratio is IoU — from 0 (no overlap) to 1 (perfect match).
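As a minimal sketch of the overlap ratio described above, assuming boxes are axis-aligned and stored as (x1, y1, x2, y2) corner tuples (a representation chosen here for illustration, not stated in the original):

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at zero so non-overlapping boxes get zero intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    # Degenerate (zero-area) boxes are scored 0 by convention in this sketch.
    return inter / union if union > 0 else 0.0
```

Two identical boxes give 1.0, two disjoint boxes give 0.0, and two 10×10 boxes shifted by half their width share one third of their union.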

Example: a predicted box with IoU = 0.58

✓ At an IoU threshold of 0.50, this counts as a correct detection (True Positive).

The @50 in "mAP@50" means the IoU threshold is 0.50. @75 requires an IoU of at least 0.75. @[0.5:0.95] averages across 10 thresholds (0.50, 0.55, ..., 0.95), from lenient to very strict.
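The thresholds can be written out explicitly. This is a small sketch; the names COCO_IOU_THRESHOLDS and is_true_positive are made up for illustration, and the inclusive comparison (IoU equal to the threshold counts as a match) is an assumption.

```python
# The ten IoU thresholds behind mAP@[0.5:0.95]: 0.50, 0.55, ..., 0.95.
# Each value is computed from an integer index and rounded, which avoids
# the drift that repeated floating-point addition would introduce.
COCO_IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def is_true_positive(iou_value, threshold=0.50):
    """A detection counts as correct when its IoU meets the threshold."""
    return iou_value >= threshold


print(COCO_IOU_THRESHOLDS)
print(is_true_positive(0.58, threshold=0.50))  # the 0.58 example at @50
print(is_true_positive(0.58, threshold=0.75))  # the same box at @75
```

The same box with IoU 0.58 passes at @50 but fails at @75, which is why stricter thresholds produce lower scores.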

3

Precision and recall

Two sides of detection quality

Precision

"Of everything I detected, how many were real?"

High precision = few false alarms

TP / (TP + FP)

Recall

"Of all real objects, how many did I find?"

High recall = few missed objects

TP / (TP + FN)


There's usually a trade-off. Lower the confidence cutoff to accept more detections and recall goes up, but precision tends to drop. Be selective and precision stays high, but you miss objects. AP summarizes this balance across all cutoffs.
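The two formulas above translate directly into code. The counts in the usage lines are hypothetical, and returning 0.0 when the denominator is zero is a convention assumed for this sketch (other conventions exist).

```python
def precision(tp, fp):
    """TP / (TP + FP): of everything detected, how much was real."""
    total = tp + fp
    return tp / total if total > 0 else 0.0


def recall(tp, fn):
    """TP / (TP + FN): of all real objects, how many were found."""
    total = tp + fn
    return tp / total if total > 0 else 0.0


# Hypothetical counts: 8 correct detections, 2 false alarms, 4 missed objects.
print(precision(tp=8, fp=2))
print(recall(tp=8, fn=4))
```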

4

Building the precision-recall curve

Walk through detections one by one

Sort detections by confidence (highest first). Walk through one at a time, keeping a running count. Each row gives one point on the PR curve.

#	Conf.	Match	TP	FP	Precision	Recall

Ground truth objects: 3
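The walk-through can be sketched as a running count. The detections below are hypothetical stand-ins, not the rows of the original table; only the ground-truth count of 3 comes from the text. The sketch assumes each detection has already been marked as a match or not, and that there is at least one ground-truth object.

```python
def pr_table(detections, num_ground_truth):
    """Build PR rows from (confidence, is_match) pairs given in any order."""
    rows = []
    tp = fp = 0
    # Highest confidence first, as described above.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    for rank, (conf, is_match) in enumerate(ranked, start=1):
        if is_match:
            tp += 1
        else:
            fp += 1
        rows.append({
            "rank": rank,
            "conf": conf,
            "match": is_match,
            "tp": tp,
            "fp": fp,
            "precision": tp / (tp + fp),
            "recall": tp / num_ground_truth,
        })
    return rows


# Hypothetical detections for illustration only.
example = [(0.95, True), (0.60, True), (0.80, False), (0.40, False), (0.30, True)]
for row in pr_table(example, num_ground_truth=3):
    print(row)
```

Each printed row is one point on the PR curve: recall never decreases as you move down the table, while precision can rise or fall.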

5

AP = area under the PR curve

The shaded area below the curve IS the score

Plot the PR points, smooth the curve (at each recall, take the max precision at any recall ≥ that point), then measure the shaded area.


Think of it as filling a 1×1 square. A perfect model fills it all (AP = 1.0). A weak model fills a sliver. The staircase steps are rectangles: multiply width × height for each one and add them up.
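A minimal sketch of the smoothing and rectangle sum, assuming the PR points arrive as (recall, precision) pairs in order of increasing recall. This version sums the area at every recall step; evaluation toolkits differ in the details (COCO's official evaluation, for example, samples precision at 101 fixed recall levels), so treat it as an illustration rather than a reference implementation. The points in the usage lines are hypothetical.

```python
def average_precision(points):
    """Area under the smoothed PR curve.

    points: list of (recall, precision) pairs, ordered by increasing recall.
    """
    recalls = [0.0] + [r for r, _ in points]
    precisions = [p for _, p in points]

    # Smooth: sweep right to left so each precision becomes the maximum
    # precision seen at any recall greater than or equal to that point.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum the staircase rectangles: width (recall gained) x height (precision).
    ap = 0.0
    for i, p in enumerate(precisions):
        ap += (recalls[i + 1] - recalls[i]) * p
    return ap


# Hypothetical PR points for one class with 3 ground-truth objects.
points = [(1 / 3, 1.0), (1 / 3, 0.5), (2 / 3, 2 / 3), (2 / 3, 0.5), (1.0, 0.6)]
print(round(average_precision(points), 4))
```

Here the three steps have width 1/3 and heights 1.0, 2/3 and 0.6, so the area is 1/3 + 2/9 + 1/5.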

6

From AP to mAP

Repeat for every object class, then average

Compute AP for every class, then average them. That's the m in mAP — mean Average Precision.

Car: 0.83
Person: 0.76
Dog: 0.91
Bike: 0.68

mAP@50 = mean(0.83, 0.76, 0.91, 0.68) = 0.795 ≈ 0.80
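The averaging step is a plain mean over classes. The sketch below reuses the four per-class AP values from the example; the function name is made up for illustration.

```python
# Per-class AP values at IoU 0.50, taken from the example above.
per_class_ap = {"Car": 0.83, "Person": 0.76, "Dog": 0.91, "Bike": 0.68}


def mean_average_precision(ap_by_class):
    """Unweighted mean of AP over classes; every class counts equally."""
    return sum(ap_by_class.values()) / len(ap_by_class)


map_50 = mean_average_precision(per_class_ap)
print(f"mAP@50 = {map_50:.3f}")
```

Because the mean is unweighted, a rare class with poor AP pulls the score down as much as a common one.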

7

The multi-level exam

Why mAP@[0.5:0.95] is harder than mAP@50

Repeat everything at 10 difficulty levels, from easy (IoU = 0.50) to nearly pixel-perfect (IoU = 0.95), then average all 10 scores.

Easy exam (IoU = 0.50): 0.80
Medium (IoU = 0.75): 0.58
Hard exam (IoU = 0.95): 0.08

[Bar chart: mAP at each IoU threshold, from 0.50 (lenient) to 0.95 (strict)]

COCO mAP = average of all 10 bars: mAP@[0.5:0.95] = 0.50


Why so much lower? mAP@50 was 0.80, but COCO mAP is only 0.50. Scores at the strict thresholds collapse for models with sloppy boxes (0.08 at IoU = 0.95 here), dragging the average down. The metric therefore rewards precise localization.
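The averaging across thresholds can be sketched as follows. The function coco_map and its callable argument are hypothetical names; the toy scoring function in the usage lines is a placeholder that simply falls as the threshold rises, not a real model's results.

```python
def coco_map(map_at_threshold):
    """Average mAP over the IoU thresholds 0.50, 0.55, ..., 0.95.

    map_at_threshold: a callable that takes an IoU threshold and returns
    the mAP computed at that threshold (hypothetical interface).
    """
    thresholds = [round(0.50 + 0.05 * i, 2) for i in range(10)]
    scores = [map_at_threshold(t) for t in thresholds]
    return sum(scores) / len(scores)


# Placeholder scoring function for illustration: mAP drops as IoU tightens.
def toy_map(threshold):
    return 1.0 - threshold


print(round(coco_map(toy_map), 3))
```

All ten thresholds carry equal weight, so a model cannot make up for weak scores at strict thresholds with a high mAP@50 alone.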

8

What does my score mean?

Rough guide for interpreting an mAP@[0.5:0.95] value

0.60+ Excellent: state of the art, production ready

0.40–0.55 Solid: usable for most real-world tasks

0.20–0.35 Needs work: missing objects or sloppy boxes

Below 0.15 Struggling: needs serious improvement

Example: mAP@[0.5:0.95] = 0.50 is a solid model, usable for most real-world tasks. It finds most objects with decent box accuracy.
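The bands can be turned into a small lookup. The listed ranges leave gaps (for example between 0.55 and 0.60), so this sketch assumes each band starts at its lower bound and runs up to the next one. That cut-off choice is an assumption, not something the guide specifies, and the labels are rules of thumb rather than a standard.

```python
def interpret_map(score):
    """Map an mAP@[0.5:0.95] value to the rough bands listed above.

    Assumption: bands begin at 0.60, 0.40 and 0.20, closing the gaps
    between the listed ranges.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("mAP must be between 0 and 1")
    if score >= 0.60:
        return "Excellent"
    if score >= 0.40:
        return "Solid"
    if score >= 0.20:
        return "Needs work"
    return "Struggling"


print(interpret_map(0.50))
```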

9

Cheat sheet

Everything on one page

One sentence: mAP@[0.5:0.95] is a comprehensive exam score for your object detector — it tests whether the model finds the right objects AND draws precise boxes, averaged across easy-to-hard grading. Higher = better. Range: 0 to 1.

The full pipeline

Model outputs detections with confidence scores

Match each detection to ground truth using IoU at a threshold

Sort by confidence, walk through → build precision-recall table

Plot PR curve → area under it = AP for that class

Average AP across all classes = mAP at that threshold

Average mAP across 10 IoU thresholds = COCO mAP@[0.5:0.95]
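The matching step in the pipeline is the one part not spelled out earlier. One common approach is a greedy match, sketched here for a single class in a single image: detections are taken in confidence order, and each may claim at most one unmatched ground-truth box. The function names, the (x1, y1, x2, y2) box format and the greedy rule are assumptions for illustration; real evaluation toolkits add details such as ignored or crowd regions.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_detections(detections, ground_truth, threshold=0.50):
    """Label each detection as a match (TP) or not (FP).

    detections: list of (confidence, box); ground_truth: list of boxes.
    Returns (confidence, is_match) pairs, highest confidence first.
    """
    used = set()  # indices of ground-truth boxes already claimed
    results = []
    for conf, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truth):
            if idx in used:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= threshold:
            used.add(best_idx)
            results.append((conf, True))
        else:
            # Poor overlap, or a duplicate of an already-matched object.
            results.append((conf, False))
    return results
```

Because each ground-truth box can be claimed only once, a second detection of the same object counts as a false positive even if its box is tight.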

Key terms

IoU: Overlap between predicted and real box (0–1)
Precision: Of detections made, how many were correct?
Recall: Of real objects, how many did we find?
AP: Area under PR curve — score for one class
mAP: Mean of AP across all classes
mAP@50: mAP at the lenient IoU threshold of 0.50
mAP@[0.5:0.95]: Average of mAP across 10 IoU thresholds (0.50 to 0.95); the primary COCO metric
