You are reading the documentation for MMClassification 0.x, which will be deprecated at the end of 2022. We recommend upgrading to MMClassification 1.0 to benefit from the new features and better performance brought by OpenMMLab 2.0. See the installation tutorial, migration tutorial, and changelog for more details.
- class mmcls.models.DeiTClsHead(*args, **kwargs)
Distilled Vision Transformer classifier head.
Compared with VisionTransformerClsHead, this head adds an extra linear layer to handle the dist token. The final classification score is the average of the two linear transformation results, one for the cls token and one for the dist token.
num_classes (int) – Number of categories excluding the background category.
in_channels (int) – Number of channels in the input feature map.
hidden_dim (int) – Number of dimensions of the hidden layer. Defaults to None, which means no extra hidden layer.
act_cfg (dict) – The activation config. Only available during pre-training. Defaults to
init_cfg (dict) – The extra initialization configs. Defaults to dict(type='Constant', layer='Linear', val=0).
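The score averaging described above can be illustrated without PyTorch. The following is a minimal sketch in plain Python, not the library's implementation: the helper names `linear` and `deit_head_score` and the toy weights are hypothetical, and it assumes no hidden layer (`hidden_dim=None`), so each token passes through a single linear layer.

```python
def linear(x, weight, bias):
    """Apply y = W x + b to one feature vector given as a list of floats."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def deit_head_score(cls_token, dist_token, cls_layer, dist_layer):
    """Average the linear outputs of the two tokens (hypothetical helper)."""
    cls_score = linear(cls_token, *cls_layer)
    dist_score = linear(dist_token, *dist_layer)
    return [(c + d) / 2 for c, d in zip(cls_score, dist_score)]


# Toy setup (assumed values): in_channels=2, num_classes=2.
cls_layer = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])   # (weight, bias)
dist_layer = ([[0.0, 1.0], [1.0, 0.0]], [1.0, 1.0])  # (weight, bias)

score = deit_head_score([2.0, 4.0], [2.0, 4.0], cls_layer, dist_layer)
print(score)
```

Each token has its own linear layer, so the two branches can disagree; averaging gives both an equal vote in the final score.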
- simple_test(x, softmax=True, post_process=True)
Inference without augmentation.
x (tuple[tuple[tensor, tensor, tensor]]) – The input features. Multi-stage inputs are acceptable but only the last stage will be used to classify. Every item should be a tuple which includes patch token, cls token and dist token. The cls token and dist token will be used to classify and the shape of them should be
softmax (bool) – Whether to softmax the classification score.
post_process (bool) – Whether to post-process the inference results. If True, the output is converted to a list.
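The nested structure of `x` can be easier to read as code. This is a small sketch under stated assumptions: `select_tokens` is a hypothetical helper, and plain strings stand in for the tensors so that the example runs without PyTorch.

```python
def select_tokens(x):
    """Return the cls and dist tokens of the last stage (hypothetical helper)."""
    # Only the last stage is used for classification.
    patch_token, cls_token, dist_token = x[-1]
    # The patch token is part of the input but is not used to classify.
    return cls_token, dist_token


# Two stages, each a (patch token, cls token, dist token) tuple.
stage1 = ("patch-1", "cls-1", "dist-1")
stage2 = ("patch-2", "cls-2", "dist-2")

cls_token, dist_token = select_tokens((stage1, stage2))
print(cls_token, dist_token)
```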
The inference results.
If no post processing, the output is a tensor with shape
If post processing, the output is a multi-dimensional list of floats whose dimensions are
- Return type
Tensor | list
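To make the effect of the `softmax` flag concrete, here is a minimal plain-Python sketch. It is not the library's code: `softmax_rows` and `finalize` are hypothetical helpers, and nested lists of floats stand in for both the tensor and the post-processed list output, with one row per sample and one entry per class.

```python
import math


def softmax_rows(scores):
    """Apply a numerically stable softmax to each row of a nested list."""
    out = []
    for row in scores:
        m = max(row)  # subtract the row maximum to avoid overflow
        exps = [math.exp(v - m) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def finalize(scores, softmax=True):
    """Optionally normalize raw scores (hypothetical helper)."""
    return softmax_rows(scores) if softmax else [list(r) for r in scores]


raw = [[3.5, 3.5], [0.0, 1.0]]      # two samples, two classes (toy values)
pred = finalize(raw)                # each row sums to 1
unnormalized = finalize(raw, softmax=False)
print(pred)
```

With `softmax=False` the raw scores are returned unchanged, which is useful when a later step expects unnormalized scores.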