
Note

You are reading the documentation for MMClassification 0.x, which will be deprecated at the end of 2022. We recommend upgrading to MMClassification 1.0 for the new features and better performance brought by OpenMMLab 2.0. See the installation tutorial, migration tutorial, and changelog for details.

ImageClassifier

class mmcls.models.ImageClassifier(backbone, neck=None, head=None, pretrained=None, train_cfg=None, init_cfg=None)[source]
extract_feat(img, stage='neck')[source]

Directly extract features from the specified stage.

Parameters
  • img (Tensor) – The input images, of shape (num_samples, num_channels, *img_shape).

  • stage (str) – The stage whose output is returned. One of “backbone”, “neck”, or “pre_logits”. Defaults to “neck”.

Returns

The output of the specified stage.

The output depends on the implementation of each component. In general, the output of the backbone and the neck is a tuple, and the output of pre_logits is a tensor.

Return type

tuple | Tensor

Examples

  1. Backbone output

>>> import torch
>>> from mmcv import Config
>>> from mmcls.models import build_classifier
>>>
>>> cfg = Config.fromfile('configs/resnet/resnet18_8xb32_in1k.py').model
>>> cfg.backbone.out_indices = (0, 1, 2, 3)  # Output multi-scale feature maps
>>> model = build_classifier(cfg)
>>> outs = model.extract_feat(torch.rand(1, 3, 224, 224), stage='backbone')
>>> for out in outs:
...     print(out.shape)
torch.Size([1, 64, 56, 56])
torch.Size([1, 128, 28, 28])
torch.Size([1, 256, 14, 14])
torch.Size([1, 512, 7, 7])
  2. Neck output

>>> import torch
>>> from mmcv import Config
>>> from mmcls.models import build_classifier
>>>
>>> cfg = Config.fromfile('configs/resnet/resnet18_8xb32_in1k.py').model
>>> cfg.backbone.out_indices = (0, 1, 2, 3)  # Output multi-scale feature maps
>>> model = build_classifier(cfg)
>>>
>>> outs = model.extract_feat(torch.rand(1, 3, 224, 224), stage='neck')
>>> for out in outs:
...     print(out.shape)
torch.Size([1, 64])
torch.Size([1, 128])
torch.Size([1, 256])
torch.Size([1, 512])
  3. Pre-logits output (without the final linear classifier head)

>>> import torch
>>> from mmcv import Config
>>> from mmcls.models import build_classifier
>>>
>>> cfg = Config.fromfile('configs/vision_transformer/vit-base-p16_pt-64xb64_in1k-224.py').model
>>> model = build_classifier(cfg)
>>>
>>> out = model.extract_feat(torch.rand(1, 3, 224, 224), stage='pre_logits')
>>> print(out.shape)  # The hidden dimension of the head is 3072
torch.Size([1, 3072])
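The way the stage argument selects an early exit from the backbone, neck, pre_logits chain can be pictured with a toy stand-in. This is a minimal sketch in plain Python, not the mmcls implementation: ToyClassifier, toy_backbone, toy_neck and toy_pre_logits are hypothetical names, and lists of floats stand in for tensors.

```python
class ToyClassifier:
    """Toy stand-in for illustration only; not mmcls.models.ImageClassifier."""

    STAGES = ("backbone", "neck", "pre_logits")

    def __init__(self, backbone, neck=None, pre_logits=None):
        self.backbone = backbone
        self.neck = neck
        self.pre_logits = pre_logits

    def extract_feat(self, img, stage="neck"):
        if stage not in self.STAGES:
            raise ValueError(f"stage must be one of {self.STAGES}, got {stage!r}")
        x = self.backbone(img)
        if stage == "backbone":
            return x  # tuple of multi-scale features
        if self.neck is not None:
            x = self.neck(x)
        if stage == "neck":
            return x  # tuple of pooled features
        if self.pre_logits is not None:
            x = self.pre_logits(x)
        return x  # single feature fed to the final classifier layer


def toy_backbone(img):
    # Pretend to produce two feature "scales" from the input.
    return (list(img), [2.0 * v for v in img])


def toy_neck(feats):
    # Average each feature, loosely mimicking global average pooling.
    return tuple(sum(f) / len(f) for f in feats)


def toy_pre_logits(feats):
    # Assumption: only the last-scale feature is passed on to the head.
    return feats[-1]


model = ToyClassifier(toy_backbone, toy_neck, toy_pre_logits)
backbone_out = model.extract_feat([1.0, 3.0], stage="backbone")
neck_out = model.extract_feat([1.0, 3.0], stage="neck")
pre_logits_out = model.extract_feat([1.0, 3.0], stage="pre_logits")
```

In this sketch the backbone and neck stages return tuples while pre_logits returns a single value, matching the general pattern stated under Returns.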
forward_dummy(img)[source]

Used for computing network FLOPs.

See mmclassification/tools/analysis_tools/get_flops.py

forward_train(img, gt_label, **kwargs)[source]

Forward computation during training.

Parameters
  • img (Tensor) – The input images, of shape (N, C, H, W). Typically these should be mean-centered and std-scaled.

  • gt_label (Tensor) – The ground-truth labels of the input images. For single-label tasks, the shape should be (N, 1). For multi-label tasks, the shape should be (N, C), where C is the number of classes rather than the number of image channels.

Returns

A dictionary of loss components.

Return type

dict[str, Tensor]
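The two gt_label layouts described above can be illustrated with nested lists in place of tensors. This is a rough sketch under one assumption that the text does not state: that the (N, C) multi-label target is a multi-hot encoding with one column per class. The helper to_multi_hot is hypothetical and not part of mmcls.

```python
def to_multi_hot(label_sets, num_classes):
    """Hypothetical helper: turn per-sample class-index lists into (N, C) rows."""
    rows = []
    for labels in label_sets:
        row = [0] * num_classes
        for idx in labels:
            if not 0 <= idx < num_classes:
                raise ValueError(f"class index {idx} out of range")
            row[idx] = 1
        rows.append(row)
    return rows


# Single-label task: one class index per sample, laid out as (N, 1).
single_label = [[2], [0]]

# Multi-label task (assumed multi-hot): one row of C entries per sample.
multi_label = to_multi_hot([[0, 2], [1]], num_classes=3)
```

Here N is 2 in both cases, and C is 3 for the multi-label target.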

simple_test(img, img_metas=None, **kwargs)[source]

Test without augmentation.
