ImageClassifier

class mmcls.models.ImageClassifier(backbone, neck=None, head=None, pretrained=None, train_cfg=None, init_cfg=None)[source]
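
The backbone, neck and head arguments are config dicts used to build the corresponding components. Below is a minimal construction sketch; the component fields mirror the ResNet-18 ImageNet config and are illustrative rather than exhaustive:

>>> from mmcls.models import ImageClassifier
>>> model = ImageClassifier(
...     # illustrative fields; see configs/resnet/resnet18_8xb32_in1k.py
...     backbone=dict(type='ResNet', depth=18, num_stages=4, out_indices=(3, )),
...     neck=dict(type='GlobalAveragePooling'),
...     head=dict(type='LinearClsHead', num_classes=1000, in_channels=512))
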
extract_feat(img, stage='neck')[source]

Directly extract features from the specified stage.

Parameters
  • img (Tensor) – The input images. Its shape should be (num_samples, num_channels, *img_shape).

  • stage (str) – Which stage to output the feature. Choose from “backbone”, “neck” and “pre_logits”. Defaults to “neck”.

Returns

The output of the specified stage.

The output depends on the specific implementation. In general, the outputs of the backbone and neck are tuples, while the output of pre_logits is a tensor.

Return type

tuple | Tensor

Examples

  1. Backbone output

>>> import torch
>>> from mmcv import Config
>>> from mmcls.models import build_classifier
>>>
>>> cfg = Config.fromfile('configs/resnet/resnet18_8xb32_in1k.py').model
>>> cfg.backbone.out_indices = (0, 1, 2, 3)  # Output multi-scale feature maps
>>> model = build_classifier(cfg)
>>> outs = model.extract_feat(torch.rand(1, 3, 224, 224), stage='backbone')
>>> for out in outs:
...     print(out.shape)
torch.Size([1, 64, 56, 56])
torch.Size([1, 128, 28, 28])
torch.Size([1, 256, 14, 14])
torch.Size([1, 512, 7, 7])

  2. Neck output

>>> import torch
>>> from mmcv import Config
>>> from mmcls.models import build_classifier
>>>
>>> cfg = Config.fromfile('configs/resnet/resnet18_8xb32_in1k.py').model
>>> cfg.backbone.out_indices = (0, 1, 2, 3)  # Output multi-scale feature maps
>>> model = build_classifier(cfg)
>>>
>>> outs = model.extract_feat(torch.rand(1, 3, 224, 224), stage='neck')
>>> for out in outs:
...     print(out.shape)
torch.Size([1, 64])
torch.Size([1, 128])
torch.Size([1, 256])
torch.Size([1, 512])

  3. Pre-logits output (without the final linear classifier head)

>>> import torch
>>> from mmcv import Config
>>> from mmcls.models import build_classifier
>>>
>>> cfg = Config.fromfile('configs/vision_transformer/vit-base-p16_pt-64xb64_in1k-224.py').model
>>> model = build_classifier(cfg)
>>>
>>> out = model.extract_feat(torch.rand(1, 3, 224, 224), stage='pre_logits')
>>> print(out.shape)  # The hidden dimension of the head is 3072
torch.Size([1, 3072])

forward_dummy(img)[source]

Used for computing network FLOPs.

See mmclassification/tools/analysis_tools/get_flops.py
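
A minimal sketch of the intended use, following the approach in get_flops.py: mmcv's FLOPs counter is routed through forward_dummy by reassigning model.forward, the same trick the script uses:

>>> from mmcv import Config
>>> from mmcv.cnn import get_model_complexity_info
>>> from mmcls.models import build_classifier
>>>
>>> cfg = Config.fromfile('configs/resnet/resnet18_8xb32_in1k.py').model
>>> model = build_classifier(cfg)
>>> model.forward = model.forward_dummy  # let the counter trace forward_dummy
>>> flops, params = get_model_complexity_info(model, (3, 224, 224))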

forward_train(img, gt_label, **kwargs)[source]

Forward computation during training.

Parameters
  • img (Tensor) – Input images of shape (N, C, H, W). Typically these should be mean-centered and std-scaled.

  • gt_label (Tensor) – Ground-truth labels of the input images. For a single-label task it should be of shape (N, 1); for a multi-label task, of shape (N, C).

Returns

A dictionary of loss components.

Return type

dict[str, Tensor]
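
A minimal training-step sketch with random inputs, reusing the ResNet-18 config from the examples above; the keys of the returned dict depend on the configured head:

>>> import torch
>>> from mmcv import Config
>>> from mmcls.models import build_classifier
>>>
>>> cfg = Config.fromfile('configs/resnet/resnet18_8xb32_in1k.py').model
>>> model = build_classifier(cfg)
>>> img = torch.rand(2, 3, 224, 224)
>>> gt_label = torch.randint(0, 1000, (2,))  # random single-label targets
>>> losses = model.forward_train(img, gt_label)
>>> sorted(losses.keys())  # e.g. ['loss'] with the default cross-entropy head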

simple_test(img, img_metas=None, **kwargs)[source]

Test without augmentation.
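
A minimal inference sketch; the exact structure of the result depends on the head, but for the default heads it is a list with one prediction per input image:

>>> import torch
>>> from mmcv import Config
>>> from mmcls.models import build_classifier
>>>
>>> cfg = Config.fromfile('configs/resnet/resnet18_8xb32_in1k.py').model
>>> model = build_classifier(cfg).eval()
>>> with torch.no_grad():
...     result = model.simple_test(torch.rand(1, 3, 224, 224))
>>> len(result)  # one entry per input image
1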
