A visual guide from plain English to the full technical picture
1
Your model draws boxes around objects
What we're actually measuring
An object detection model looks at an image and does two things: it finds objects and draws bounding boxes around them. Each box comes with a class label and a confidence score.
✓ Good predictions
Boxes align with real objects
✗ Bad predictions
Sloppy box + hallucinated object
ⓘ
mAP is a single score from 0 to 1 that captures both skills at once: finding the right objects AND drawing tight boxes. Higher = better. 1.0 = perfect.
2
Judging a single box: IoU
Intersection over Union — the overlap score
For each predicted box, we measure how much it overlaps with the real box. This overlap ratio is IoU — from 0 (no overlap) to 1 (perfect match).
Example predicted box: IoU = 0.58
✓ At IoU threshold 0.50, this counts as a correct detection (True Positive)
The @50 in "mAP@50" means the IoU threshold is 0.50. @75 means an IoU of at least 0.75 is required. @[0.5:0.95] averages across 10 thresholds (0.50, 0.55, ..., 0.95), from easy to very strict.
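The overlap score can be computed in a few lines. This is a minimal sketch, assuming boxes are given as (x1, y1, x2, y2) corner coordinates with x1 < x2 and y1 < y2; the function name `iou` and the sample boxes are illustrative, not taken from any particular library.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes in (x1, y1, x2, y2) format."""
    # Corners of the overlapping rectangle (if the boxes overlap at all)
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Width and height are clamped at 0 so disjoint boxes get zero intersection
    intersection = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


# Hypothetical boxes: the prediction is shifted right by half the box width
truth = (0, 0, 10, 10)
predicted = (5, 0, 15, 10)
score = iou(truth, predicted)          # 50 / 150, about 0.33
is_true_positive = score >= 0.50       # False at the 0.50 threshold
```

Shifting a box by half its width already drops IoU to about 0.33, which is why even the "easy" 0.50 threshold asks for a reasonably well placed box.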
3
Precision and recall
Two sides of detection quality
Precision
"Of everything I detected, how many were real?"
High precision = few false alarms
TP / (TP + FP)
Recall
"Of all real objects, how many did I find?"
High recall = few missed objects
TP / (TP + FN)
⇆
There's usually a trade-off. Lower the confidence cutoff to accept more detections → recall goes up but precision tends to drop. Be selective → precision stays high but you miss things. AP captures this balance.
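To see the trade-off in numbers, here is a small sketch that applies the two formulas at different confidence cutoffs. The detections, the helper name `precision_recall_at`, and the convention of returning 0.0 when nothing is kept are all assumptions made for illustration.

```python
def precision_recall_at(detections, num_ground_truth, conf_cutoff):
    """Precision and recall when only detections with confidence >= cutoff are kept.

    detections: list of (confidence, is_correct) pairs.
    """
    kept = [is_correct for conf, is_correct in detections if conf >= conf_cutoff]
    tp = sum(kept)                    # correct detections that were kept
    fp = len(kept) - tp               # false alarms that were kept
    fn = num_ground_truth - tp        # real objects that were not found

    precision = tp / (tp + fp) if kept else 0.0   # assumed convention for "no detections"
    recall = tp / (tp + fn) if num_ground_truth else 0.0
    return precision, recall


# Hypothetical detections for an image set with 4 real objects
detections = [(0.95, True), (0.90, True), (0.80, False), (0.60, True), (0.40, False)]

lenient = precision_recall_at(detections, 4, conf_cutoff=0.50)   # (0.75, 0.75)
strict = precision_recall_at(detections, 4, conf_cutoff=0.85)    # (1.0, 0.5)
```

Raising the cutoff removes the false alarm, so precision climbs to 1.0, but one real object is no longer reported and recall falls to 0.5.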
4
Building the precision-recall curve
Walk through detections one by one
Sort detections by confidence (highest first). Walk through one at a time, keeping a running count. Each row gives one point on the PR curve.
# | Conf. | Match | TP | FP | Precision | Recall
Ground truth objects: 3
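The walk-through can be sketched as a short loop. The five detections below are made up for illustration (they are not the rows of the original table); only the count of 3 ground truth objects comes from the text. The function name `pr_table` is likewise hypothetical.

```python
def pr_table(detections, num_ground_truth):
    """Build the running precision-recall table.

    detections: list of (confidence, is_match) pairs, in any order.
    Returns rows of (rank, confidence, is_match, TP, FP, precision, recall).
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)  # highest confidence first
    rows = []
    tp = fp = 0
    for rank, (conf, is_match) in enumerate(ranked, start=1):
        if is_match:
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)          # of everything detected so far, how much was real
        recall = tp / num_ground_truth      # of all real objects, how many found so far
        rows.append((rank, conf, is_match, tp, fp, precision, recall))
    return rows


# Hypothetical detections, 3 ground truth objects
rows = pr_table(
    [(0.95, True), (0.90, True), (0.80, False), (0.70, True), (0.50, False)],
    num_ground_truth=3,
)
for rank, conf, is_match, tp, fp, precision, recall in rows:
    print(f"{rank}  {conf:.2f}  {'yes' if is_match else 'no':3}  {tp}  {fp}  "
          f"{precision:.2f}  {recall:.2f}")
```

Precision can go up as well as down from row to row (here it dips at the first false alarm and recovers at the next match), while recall never decreases. That zigzag is what the smoothing step in the next section irons out.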
5
AP = area under the PR curve
The shaded area below the curve IS the score
Plot the PR points, smooth the curve (at each recall, take the max precision at any recall ≥ that point), then measure the shaded area.
▣
Think of it as filling a 1×1 square. Perfect model fills it all (AP=1.0). Weak model fills a sliver. The staircase steps are rectangles — just multiply width×height and add up.
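Smoothing and then adding up rectangles might look like the sketch below. It uses the "all-point" variant, where the area is summed at every recall step; COCO's reference evaluator instead samples the smoothed curve at 101 evenly spaced recall values, so its numbers can differ slightly. The PR points in the example are hypothetical.

```python
def average_precision(precisions, recalls):
    """Area under the smoothed precision-recall curve (all-point interpolation).

    precisions, recalls: one entry per detection, in confidence order,
    so recalls must be non-decreasing.
    """
    # Sentinel points so the curve spans recall 0 to 1
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]

    # Smooth: at each recall, take the max precision at any recall >= that point
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])

    # Add up the staircase rectangles: width (recall step) x height (smoothed precision)
    area = 0.0
    for i in range(1, len(r)):
        area += (r[i] - r[i - 1]) * p[i]
    return area


# Hypothetical PR points from five detections and 3 ground truth objects
precisions = [1.0, 1.0, 2 / 3, 0.75, 0.6]
recalls = [1 / 3, 2 / 3, 2 / 3, 1.0, 1.0]
ap = average_precision(precisions, recalls)   # 1/3 + 1/3 + 0.25, about 0.917
```

The first two rectangles have height 1.0; the last one has height 0.75 because smoothing lifts the dip at 0.67 up to the best precision reached further right.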
6
From AP to mAP
Repeat for every object class, then average
Compute AP for every class, then average them. That's the m in mAP — mean Average Precision.
Car: 0.83
Person: 0.76
Dog: 0.91
Bike: 0.68
mAP@50 = mean( 0.83, 0.76, 0.91, 0.68 ) = 0.795 ≈ 0.80
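In code, the averaging step is a one-liner. The sketch below uses the four per-class AP values from the example above; the function name `mean_average_precision` is an illustrative choice.

```python
from statistics import mean


def mean_average_precision(ap_per_class):
    """Unweighted mean of per-class AP values: every class counts equally."""
    return mean(list(ap_per_class.values()))


ap_per_class = {"Car": 0.83, "Person": 0.76, "Dog": 0.91, "Bike": 0.68}
map_50 = mean_average_precision(ap_per_class)
print(f"mAP@50 = {map_50:.3f}")   # prints "mAP@50 = 0.795"
```

Because the mean is unweighted, a rare class with few examples moves the score as much as a common one, so one weak class (Bike here) is enough to pull the whole number down.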
7
The multi-level exam
Why mAP@[0.5:0.95] is harder than mAP@50
Repeat everything at 10 difficulty levels, from easy (IoU = 0.50) to nearly pixel-perfect (IoU = 0.95) in steps of 0.05. Then average all 10 scores.
Easy exam (IoU = 0.50): 0.80
Medium (IoU = 0.75): 0.58
Hard exam (IoU = 0.95): 0.08
[Bar chart: mAP at each of the 10 IoU thresholds, from 0.50 (lenient) to 0.95 (strict)]
COCO mAP = average of all 10 bars
mAP@[0.5:0.95] = 0.50
⚠
Why so much lower? mAP@50 was 0.80 but COCO mAP is only 0.50. At the strict thresholds, boxes that are only roughly aligned no longer count as correct, so those scores fall sharply (0.08 at IoU = 0.95) and drag the average down. The metric rewards precise localization.
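The averaging over thresholds can be sketched as follows. The ten per-threshold values in the example are toy numbers chosen to make the arithmetic easy to follow; they are not the bars from the chart. The helper names are hypothetical.

```python
def coco_iou_thresholds():
    """The 10 IoU thresholds 0.50, 0.55, ..., 0.95."""
    return [round(0.50 + 0.05 * i, 2) for i in range(10)]


def coco_map(map_at_threshold):
    """Average mAP over the 10 thresholds.

    map_at_threshold: dict mapping each IoU threshold to the mAP measured at it.
    """
    thresholds = coco_iou_thresholds()
    return sum(map_at_threshold[t] for t in thresholds) / len(thresholds)


# Toy values only: mAP falling steadily from 0.9 at IoU 0.50 to 0.0 at IoU 0.95
toy_scores = {t: round(0.9 - 0.1 * i, 1) for i, t in enumerate(coco_iou_thresholds())}
print(f"mAP@[0.5:0.95] = {coco_map(toy_scores):.2f}")   # prints "mAP@[0.5:0.95] = 0.45"
```

Every threshold carries the same weight, so the near-zero scores at the strict end count just as much as the strong score at 0.50.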
8
What does my score mean?
Rough bands for interpreting an mAP@[0.5:0.95] value (what counts as good depends on the dataset and task)
0.60+: Excellent. State of the art, production ready
0.40–0.55: Solid. Usable for most real-world tasks
0.20–0.35: Needs work. Missing objects or sloppy boxes
Below 0.15: Struggling. Needs serious improvement
Example: mAP@[0.5:0.95] = 0.50
Solid model: usable for most real-world tasks. Finds most objects with decent box accuracy.
9
Cheat sheet
Everything on one page
One sentence: mAP@[0.5:0.95] is a comprehensive exam score for your object detector, from 0 to 1 with higher being better, that tests whether the model finds the right objects AND draws precise boxes, averaged across easy-to-hard grading.
The full pipeline
Model outputs detections with confidence scores
Match each detection to ground truth using IoU at a threshold
Sort by confidence, walk through → build precision-recall table
Plot PR curve → area under it = AP for that class
Average AP across all classes = mAP at that threshold
Average mAP across 10 IoU thresholds = COCO mAP@[0.5:0.95]
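The matching step in the pipeline above is the one part the earlier sections do not spell out. Below is a minimal sketch of one common greedy convention for a single class in a single image: go through detections from most to least confident, and let each one claim the best-overlapping ground truth box that has not been claimed yet. Reference evaluators differ in details (tie-breaking, crowd or "difficult" regions), so treat this as an illustration rather than the definitive rule. All names and boxes are hypothetical.

```python
def iou(a, b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_detections(detections, gt_boxes, iou_threshold=0.50):
    """Label each detection as a true positive (True) or false positive (False).

    detections: list of (confidence, box); gt_boxes: list of ground truth boxes.
    Returns (confidence, is_true_positive) pairs, highest confidence first.
    """
    claimed = set()     # indices of ground truth boxes already matched
    results = []
    for conf, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(gt_boxes):
            if idx in claimed:
                continue            # each real object can be matched only once
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            claimed.add(best_idx)
            results.append((conf, True))
        else:
            results.append((conf, False))   # poor overlap, or a duplicate of a claimed object
    return results


# Hypothetical example: one real object, a good box, a duplicate, and a hallucination
gt_boxes = [(0, 0, 10, 10)]
detections = [(0.9, (0, 0, 10, 9)), (0.8, (0, 0, 10, 10)), (0.7, (50, 50, 60, 60))]
labels = match_detections(detections, gt_boxes)
# [(0.9, True), (0.8, False), (0.7, False)]
```

The second detection overlaps the object perfectly but still counts as a false positive, because the more confident detection already claimed it. Duplicate boxes on the same object therefore hurt precision.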