# AveragePrecision¶

class mmcls.evaluation.AveragePrecision(average='macro', collect_device='cpu', prefix=None)[source]

Calculate the average precision with respect to classes.

AveragePrecision (AP) summarizes a precision-recall curve as the weighted mean of precisions achieved at each threshold, with the increase in recall from the previous threshold used as the weight:

$\text{AP} = \sum_n (R_n - R_{n-1}) P_n$

where $P_n$ and $R_n$ are the precision and recall at the n-th threshold. Note that no approximation is involved, since the summation evaluates the piecewise-constant precision-recall curve exactly rather than interpolating it.
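To make the formula concrete, here is a minimal pure-Python sketch of the non-interpolated per-class AP (this is an illustration of the formula above, not the mmcls implementation), applied to the same predictions and targets used in the Examples section below:

```python
def average_precision(scores, labels):
    """Non-interpolated AP for one class: sum_n (R_n - R_{n-1}) * P_n.

    Illustrative sketch only; not the actual mmcls implementation.
    """
    # Rank samples by predicted score, highest first.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    labels = [labels[i] for i in order]
    num_pos = sum(labels)
    if num_pos == 0:
        return 0.0  # a class without positive samples contributes AP = 0
    ap = 0.0
    tp = 0
    for rank, y in enumerate(labels, start=1):
        if y:
            tp += 1
            # Recall increases by 1/num_pos at each positive,
            # so this term is P_n * (R_n - R_{n-1}).
            ap += (tp / rank) / num_pos
    return ap

y_pred = [[0.9, 0.8, 0.3, 0.2],
          [0.1, 0.2, 0.2, 0.1],
          [0.7, 0.5, 0.9, 0.3],
          [0.8, 0.1, 0.1, 0.2]]
y_true = [[1, 1, 0, 0],
          [0, 1, 0, 0],
          [0, 0, 1, 0],
          [1, 0, 0, 0]]
# AP per class (columns), then the macro mean (mAP).
ap_per_class = [average_precision([row[k] for row in y_pred],
                                  [row[k] for row in y_true])
                for k in range(4)]
mAP = 100 * sum(ap_per_class) / len(ap_per_class)
print([round(100 * ap, 2) for ap in ap_per_class])  # [100.0, 83.33, 100.0, 0.0]
print(round(mAP, 3))                                # 70.833
```

Averaging the per-class values reproduces the `tensor(70.833)` result shown in the Examples below.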

Parameters
• average (str | None) –

How to calculate the final metrics from every category. It supports two modes:

• ”macro”: Calculate metrics for each category, and calculate the mean value over all categories. The result of this mode is also called mAP.

• None: Calculate metrics of every category and output directly.

Defaults to “macro”.

• collect_device (str) – Device name used for collecting results from different ranks during distributed training. Must be ‘cpu’ or ‘gpu’. Defaults to ‘cpu’.

• prefix (str, optional) – The prefix that will be added in the metric names to disambiguate homonymous metrics of different evaluators. If prefix is not provided in the argument, self.default_prefix will be used instead. Defaults to None.
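In an MMClassification config file, the metric is usually registered by its type name rather than constructed directly. A hedged sketch of such a config fragment (field placement follows the usual mmengine convention; `val_evaluator` is the conventional key, assumed here):

```python
# Config fragment (assumed conventional mmengine/mmcls style):
# report per-class AP, collect results on CPU, and prefix metric keys.
val_evaluator = dict(
    type='AveragePrecision',
    average=None,           # or 'macro' for a single mAP value
    collect_device='cpu',
    prefix='multi-label',
)
```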

References

[1] Wikipedia entry for Average precision.

Examples

>>> import torch
>>> from mmcls.evaluation import AveragePrecision
>>> # --------- The Basic Usage for one-hot pred scores ---------
>>> y_pred = torch.Tensor([[0.9, 0.8, 0.3, 0.2],
...                        [0.1, 0.2, 0.2, 0.1],
...                        [0.7, 0.5, 0.9, 0.3],
...                        [0.8, 0.1, 0.1, 0.2]])
>>> y_true = torch.Tensor([[1, 1, 0, 0],
...                        [0, 1, 0, 0],
...                        [0, 0, 1, 0],
...                        [1, 0, 0, 0]])
>>> AveragePrecision.calculate(y_pred, y_true)
tensor(70.833)
>>> # ------------------- Use with Evaluator -------------------
>>> from mmcls.structures import ClsDataSample
>>> from mmengine.evaluator import Evaluator
>>> data_samples = [
...     ClsDataSample().set_pred_score(i).set_gt_score(j)
...     for i, j in zip(y_pred, y_true)
... ]
>>> evaluator = Evaluator(metrics=AveragePrecision())
>>> evaluator.process(data_samples)
>>> evaluator.evaluate(5)
{'multi-label/mAP': 70.83333587646484}
>>> # Evaluate on each class
>>> evaluator = Evaluator(metrics=AveragePrecision(average=None))
>>> evaluator.process(data_samples)
>>> evaluator.evaluate(5)
{'multi-label/AP_classwise': [100., 83.33, 100., 0.]}

static calculate(pred, target, average='macro')[source]

Calculate the average precision for each class.

Parameters
• pred (torch.Tensor | np.ndarray) – The model predictions with shape (N, num_classes).

• target (torch.Tensor | np.ndarray) – The target of predictions with shape (N, num_classes).

• average (str | None) –

The average method. It supports two modes:

• ”macro”: Calculate metrics for each category, and calculate the mean value over all categories. The result of this mode is also called mAP.

• None: Calculate metrics of every category and output directly.

Defaults to “macro”.

Returns

The average precision of all classes: a scalar when average is “macro”, otherwise a per-class tensor.

Return type

torch.Tensor

compute_metrics(results)[source]

Compute the metrics from processed results.

Parameters

results (list) – The processed results of each batch.

Returns

The computed metrics. The keys are the names of the metrics, and the values are corresponding results.

Return type

Dict

process(data_batch, data_samples)[source]

Process one batch of data samples.

The processed results should be stored in self.results, which will be used to compute the metrics when all batches have been processed.

Parameters
• data_batch – A batch of data from the dataloader.

• data_samples (Sequence[dict]) – A batch of outputs from the model.