ImageClassifier
- class mmcls.models.ImageClassifier(backbone, neck=None, head=None, pretrained=None, train_cfg=None, init_cfg=None)
- extract_feat(img, stage='neck')
Directly extract features from the specified stage.
- Parameters
img (Tensor) – The input images. Its shape should be (num_samples, num_channels, *img_shape).
stage (str) – Which stage to output the feature. Choose from "backbone", "neck" and "pre_logits". Defaults to "neck".
- Returns
- The output of the specified stage.
The output depends on the specific implementation. In general, the output of the backbone and neck is a tuple, and the output of pre_logits is a tensor.
- Return type
tuple | Tensor
Examples
Backbone output
>>> import torch
>>> from mmcv import Config
>>> from mmcls.models import build_classifier
>>>
>>> cfg = Config.fromfile('configs/resnet/resnet18_8xb32_in1k.py').model
>>> cfg.backbone.out_indices = (0, 1, 2, 3)  # Output multi-scale feature maps
>>> model = build_classifier(cfg)
>>> outs = model.extract_feat(torch.rand(1, 3, 224, 224), stage='backbone')
>>> for out in outs:
...     print(out.shape)
torch.Size([1, 64, 56, 56])
torch.Size([1, 128, 28, 28])
torch.Size([1, 256, 14, 14])
torch.Size([1, 512, 7, 7])
Neck output
>>> import torch
>>> from mmcv import Config
>>> from mmcls.models import build_classifier
>>>
>>> cfg = Config.fromfile('configs/resnet/resnet18_8xb32_in1k.py').model
>>> cfg.backbone.out_indices = (0, 1, 2, 3)  # Output multi-scale feature maps
>>> model = build_classifier(cfg)
>>>
>>> outs = model.extract_feat(torch.rand(1, 3, 224, 224), stage='neck')
>>> for out in outs:
...     print(out.shape)
torch.Size([1, 64])
torch.Size([1, 128])
torch.Size([1, 256])
torch.Size([1, 512])
Pre-logits output (without the final linear classifier head)
>>> import torch
>>> from mmcv import Config
>>> from mmcls.models import build_classifier
>>>
>>> cfg = Config.fromfile('configs/vision_transformer/vit-base-p16_pt-64xb64_in1k-224.py').model
>>> model = build_classifier(cfg)
>>>
>>> out = model.extract_feat(torch.rand(1, 3, 224, 224), stage='pre_logits')
>>> print(out.shape)  # The hidden dim of the head is 3072
torch.Size([1, 3072])
- forward_dummy(img)
Used for computing network FLOPs.
See mmclassification/tools/analysis_tools/get_flops.py.
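Example

A minimal sketch of driving a FLOPs counter through forward_dummy, assuming mmcv's get_model_complexity_info utility and reusing the ResNet-18 config from the examples above; the input resolution (3, 224, 224) is the usual ImageNet default, not something this method requires.

>>> import torch
>>> from mmcv import Config
>>> from mmcv.cnn import get_model_complexity_info
>>> from mmcls.models import build_classifier
>>>
>>> cfg = Config.fromfile('configs/resnet/resnet18_8xb32_in1k.py').model
>>> model = build_classifier(cfg)
>>> model = model.eval()
>>> # Route the counter through forward_dummy, which runs the backbone,
>>> # neck and head without computing any loss.
>>> model.forward = model.forward_dummy
>>> flops, params = get_model_complexity_info(
...     model, (3, 224, 224), print_per_layer_stat=False)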
- forward_train(img, gt_label, **kwargs)
Forward computation during training.
- Parameters
img (Tensor) – Input images of shape (N, C, H, W). Typically these should be mean centered and std scaled.
gt_label (Tensor) – Ground-truth labels of the input images. It should be of shape (N, 1) for a single-label task and of shape (N, C) for a multi-label task.
- Returns
A dictionary of loss components.
- Return type
dict
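Example

A minimal sketch of a single training forward pass, assuming the ResNet-18 ImageNet-1k config from the examples above; the random images and labels are illustrative only, and the flat (N,) label layout is the one consumed by the default cross-entropy head.

>>> import torch
>>> from mmcv import Config
>>> from mmcls.models import build_classifier
>>>
>>> cfg = Config.fromfile('configs/resnet/resnet18_8xb32_in1k.py').model
>>> model = build_classifier(cfg)
>>> img = torch.rand(4, 3, 224, 224)         # (N, C, H, W) input batch
>>> gt_label = torch.randint(0, 1000, (4,))  # one label per image
>>> losses = model.forward_train(img, gt_label)
>>> # ``losses`` is a dict of loss components, e.g. {'loss': tensor(...)}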