mmcls.models
The models package contains several sub-packages for addressing the different components of a model.
classifiers: The top-level module which defines the whole process of a classification model.
backbones: Usually a feature extraction network, e.g., ResNet, MobileNet.
necks: The component between backbones and heads, e.g., GlobalAveragePooling.
heads: The component for specific tasks. In MMClassification, we provide heads for classification.
losses: Loss functions.
utils: Some helper functions and common components used in various networks.
  data_preprocessor: The component placed before the model to preprocess the inputs, e.g., ClsDataPreprocessor.
  Common Components: Common components used in various networks.
  Helper Functions: Helper functions.
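As a rough illustration of how these sub-packages fit together, the sketch below writes a classifier as a nested config dict with one entry per component. The `type` names (`ImageClassifier`, `LinearClsHead`, `CrossEntropyLoss`) and all arguments are assumptions modelled on a typical ResNet-50 setup, and `component_types` is a hypothetical helper; check them against the installed version before relying on them.

```python
# A classifier described as nested dicts: one entry per component.
# All `type` values and arguments below are assumptions for illustration.
model = dict(
    type='ImageClassifier',
    backbone=dict(type='ResNet', depth=50, num_stages=4, out_indices=(3,)),
    neck=dict(type='GlobalAveragePooling'),
    head=dict(
        type='LinearClsHead',
        num_classes=1000,
        in_channels=2048,  # must match the channels coming out of the neck
        loss=dict(type='CrossEntropyLoss', loss_weight=1.0),
    ),
)


def component_types(cfg):
    """Return the `type` of each nested component in a model config."""
    return {key: sub['type'] for key, sub in cfg.items() if isinstance(sub, dict)}


print(component_types(model))
```

The loss is nested inside the head here because the head is the component that turns features into task-specific predictions, so it is also the natural place to compare them with labels.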
Build Functions
Build classifier.
Build backbone.
Build neck.
Build head.
Build loss.
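The build functions listed here presumably turn a config dict into a component by looking up its `type` key in a registry. The sketch below shows that pattern with a toy registry written from scratch; `Registry`, `NECKS`, `ToyPooling`, and this `build_neck` are hypothetical stand-ins, not the library's actual implementation.

```python
class Registry:
    """Toy name-to-class registry (an illustration, not the real one)."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self, cls):
        """Class decorator that records `cls` under its own name."""
        self._modules[cls.__name__] = cls
        return cls

    def build(self, cfg):
        """Instantiate the class named by cfg['type'] with the other keys."""
        args = dict(cfg)  # copy so the caller's config is not mutated
        cls_name = args.pop('type')
        if cls_name not in self._modules:
            raise KeyError(f'{cls_name} is not registered in {self.name}')
        return self._modules[cls_name](**args)


NECKS = Registry('neck')  # hypothetical registry for neck components


@NECKS.register_module
class ToyPooling:  # hypothetical component used only for this example
    def __init__(self, dim=2):
        self.dim = dim


def build_neck(cfg):
    """Build a neck from a config dict that carries a 'type' key."""
    return NECKS.build(cfg)


neck = build_neck(dict(type='ToyPooling', dim=1))
print(type(neck).__name__, neck.dim)
```

Keeping construction behind a registry lets a config file swap one component for another by changing a string, without touching the code that assembles the model.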
Classifiers
Base class for classifiers.
Image classifiers for supervised classification task.
Image classifiers for pytorch-image-models (timm) model.
Image classifiers for HuggingFace model.
Backbones
AlexNet backbone.
Backbone for BEiT.
CSP-Darknet backbone used in YOLOv4.
The abstract CSP Network class.
CSP-ResNeXt backbone.
CSP-ResNet backbone.
Conformer backbone.
ConvMixer.
ConvNeXt.
DaViT.
DeiT3 backbone.
DenseNet.
Distilled Vision Transformer.
EdgeNeXt.
EfficientFormer.
EfficientNet backbone.
EfficientNetV2 backbone.
HRNet backbone.
HorNet backbone.
Inception V3 backbone.
LeNet5 backbone.
Multi-scale ViT v2.
Mlp-Mixer backbone.
MobileNetV2 backbone.
MobileNetV3 backbone.
MobileOne backbone.
MobileViT backbone.
The backbone of Twins-PCPVT.
PoolFormer.
RegNet backbone.
RepLKNet backbone.
RepMLPNet backbone.
RepVGG backbone.
Res2Net backbone.
ResNeSt backbone.
ResNeXt backbone.
ResNet backbone.
ResNetV1c backbone.
ResNetV1d backbone.
ResNet backbone for CIFAR.
Reversible Vision Transformer.
SEResNeXt backbone.
SEResNet backbone.
The backbone of Twins-SVT.
ShuffleNetV1 backbone.
ShuffleNetV2 backbone.
Swin Transformer.
Swin Transformer V2.
Tokens-to-Token Vision Transformer (T2T-ViT).
Wrapper to use backbones from timm library.
Transformer in Transformer.
Visual Attention Network.
VGG backbone.
Vision Transformer.
Necks
Global Average Pooling neck.
Generalized Mean Pooling neck.
Fuse feature map of multiple scales in HRNet.
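To make the relation between the two pooling necks concrete, here is a minimal pure-Python sketch of generalized mean (GeM) pooling over a single channel. The function name `gem_pool`, its parameters, and the clamping with `eps` are assumptions for illustration; the library's neck operates on batched tensors and may differ in detail.

```python
def gem_pool(feature_map, p=3.0, eps=1e-6):
    """Generalized mean over all spatial positions of one channel.

    feature_map: 2-D list (H x W) of floats for a single channel.
    With p = 1 this is plain average pooling; as p grows, the result
    moves toward the maximum activation.
    """
    # Clamp to a small positive value so that x ** p is well defined.
    values = [max(v, eps) for row in feature_map for v in row]
    mean_of_powers = sum(v ** p for v in values) / len(values)
    return mean_of_powers ** (1.0 / p)


channel = [[1.0, 2.0], [3.0, 4.0]]
print(gem_pool(channel, p=1.0))   # behaves like global average pooling
print(gem_pool(channel, p=50.0))  # approaches the largest activation
```

The exponent `p` therefore interpolates between average and max pooling, which is why GeM is often treated as a tunable replacement for global average pooling.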
Heads
Classification head.
Linear classifier head.
Classifier head with several hidden fc layers and an output fc layer.
Vision Transformer classifier head.
EfficientFormer classifier head.
Distilled Vision Transformer classifier head.
Linear classifier head.
ArcFace classifier head.
Classification head for multilabel task.
Linear classification head for multilabel task.
Class-specific residual attention classifier head.
Losses
Cross entropy loss.
Initializer for the label smoothed cross entropy loss.
Focal loss.
Asymmetric loss.
Implementation of seesaw loss.
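As an illustration of what label smoothing does to a cross entropy loss, the sketch below computes the loss for one sample in pure Python. The function names and the smoothing convention (the mass `smoothing` is spread uniformly over all K classes) are assumptions; the library's loss may offer other modes and works on batched tensors.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def smoothed_cross_entropy(logits, target, smoothing=0.1):
    """Cross entropy against a smoothed one-hot target.

    Assumed convention: every class receives smoothing / K, and the
    target class additionally receives 1 - smoothing.
    """
    k = len(logits)
    probs = softmax(logits)
    loss = 0.0
    for i, p in enumerate(probs):
        q = smoothing / k + ((1.0 - smoothing) if i == target else 0.0)
        loss -= q * math.log(p)
    return loss


logits = [5.0, 0.0, 0.0]
print(smoothed_cross_entropy(logits, target=0, smoothing=0.0))
print(smoothed_cross_entropy(logits, target=0, smoothing=0.1))
```

With `smoothing=0.0` the function reduces to ordinary cross entropy; a positive value penalizes very confident predictions, so the second printed loss is larger than the first.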
models.utils
This package includes some helper functions and common components used in various networks.
Common Components
Inverted Residual Block.
Squeeze-and-Excitation Module.
Window based multi-head self-attention (W-MSA) module with relative position bias.
Window based multi-head self-attention (W-MSA) module with relative position bias.
Shift Window Multihead Self-Attention Module.
Multi-head Attention Module.
The Conditional Position Encoding (CPE) module.
Image to Patch Embedding.
Merge patch feature map.
CNN Feature Map Embedding.
LayerScale layer.
Helper Functions
Channel Shuffle operation.
Make divisible function.
Resize pos_embed weights.
Resize relative position bias table.
A to_tuple function generator.
Determine whether the model is called during the tracing of code with
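Three of these helpers are small enough to sketch in plain Python. The versions below are illustrative only: the signatures, the `min_ratio` default, and the names `to_ntuple` and `channel_shuffle_order` are assumptions, and the channel shuffle is shown as an index permutation rather than as a tensor operation.

```python
import collections.abc
from itertools import repeat


def make_divisible(value, divisor, min_value=None, min_ratio=0.9):
    """Round `value` to a nearby multiple of `divisor`.

    The result is at least `min_value` and is bumped up by one
    `divisor` if rounding would fall below `min_ratio * value`.
    """
    if min_value is None:
        min_value = divisor
    new_value = max(min_value, int(value + divisor / 2) // divisor * divisor)
    if new_value < min_ratio * value:
        new_value += divisor
    return new_value


def to_ntuple(n):
    """Return a function that expands a scalar into an n-tuple."""
    def parse(x):
        # Sequences pass through unchanged; strings count as scalars.
        if isinstance(x, collections.abc.Sequence) and not isinstance(x, str):
            return tuple(x)
        return tuple(repeat(x, n))
    return parse


def channel_shuffle_order(num_channels, groups):
    """Channel order after a channel shuffle with `groups` groups.

    Equivalent to reshaping the channel axis to (groups, per_group),
    transposing, and flattening again.
    """
    assert num_channels % groups == 0, 'channels must divide evenly'
    per_group = num_channels // groups
    return [g * per_group + c for c in range(per_group) for g in range(groups)]


to_2tuple = to_ntuple(2)
print(make_divisible(37, 8), to_2tuple(3), channel_shuffle_order(6, 2))
```

Rounding channel counts with `make_divisible` keeps layer widths aligned to hardware-friendly multiples, while the shuffle permutation interleaves channels so that information mixes across groups in grouped convolutions.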