Batch Augmentation
Batch augmentation is augmentation that involves multiple samples, such as Mixup and CutMix.
In MMClassification, batch augmentations are used as part of the Classifier. A typical usage is shown below:
model = dict(
    backbone=...,
    neck=...,
    head=...,
    # Batch augmentations are configured through train_cfg.
    train_cfg=dict(augments=[
        dict(type='BatchMixup', alpha=0.8, prob=0.5, num_classes=num_classes),
        dict(type='BatchCutMix', alpha=1.0, prob=0.5, num_classes=num_classes),
    ]),
)
Mixup
- class mmcls.models.utils.augment.BatchMixupLayer(*args, **kwargs)
Mixup layer for a batch of data.
Mixup is a method that reduces the memorization of corrupt labels and increases robustness to adversarial examples. It was proposed in mixup: Beyond Empirical Risk Minimization <https://arxiv.org/abs/1710.09412>.
The method linearly mixes pairs of samples and their labels.
- Parameters
Note
The \(\alpha\) (alpha) determines a random distribution \(Beta(\alpha, \alpha)\). For each batch of data, we sample a mixing ratio (denoted \(\lambda\), lam) from this distribution.
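The mixing step described above can be sketched in plain Python. This is a minimal illustration, not the BatchMixupLayer implementation: the function name mixup_batch, the flattened list-of-floats image format, and the choice of pairing each sample with a randomly permuted partner are all assumptions made for the example.

```python
import random


def mixup_batch(images, labels, num_classes, alpha, rng=random):
    """Hypothetical helper: mix every sample with a shuffled partner.

    images: list of equal-length lists of floats (flattened images).
    labels: list of integer class indices.
    Returns (mixed_images, mixed_one_hot_labels, lam).
    """
    # One mixing ratio per batch, drawn from Beta(alpha, alpha).
    lam = rng.betavariate(alpha, alpha)

    # Partner index for each sample (assumed pairing strategy).
    perm = list(range(len(images)))
    rng.shuffle(perm)

    def one_hot(label):
        return [1.0 if c == label else 0.0 for c in range(num_classes)]

    def blend(a, b):
        # Linear interpolation: lam * a + (1 - lam) * b, element by element.
        return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]

    mixed_images = [blend(images[i], images[j]) for i, j in enumerate(perm)]
    mixed_labels = [
        blend(one_hot(labels[i]), one_hot(labels[j]))
        for i, j in enumerate(perm)
    ]
    return mixed_images, mixed_labels, lam
```

Because images and labels are blended with the same \(\lambda\), every mixed label remains a valid probability vector that sums to 1.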
CutMix
- class mmcls.models.utils.augment.BatchCutMixLayer(*args, **kwargs)
CutMix layer for a batch of data.
CutMix is a method that improves the network's generalization capability. It was proposed in CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features <https://arxiv.org/abs/1905.04899>.
With this method, patches are cut and pasted among training images, and the ground truth labels are mixed in proportion to the area of the patches.
- Parameters
alpha (float) – Parameter of the Beta distribution used to generate the mixing ratio. It should be a positive number. More details can be found in BatchMixupLayer.
num_classes (int) – The number of classes.
prob (float) – The probability of applying CutMix. It should be in the range [0, 1]. Defaults to 1.0.
cutmix_minmax (List[float], optional) – The min/max area ratio of the patches. If not None, the bounding box of each patch is sampled uniformly within this ratio range, and alpha is ignored. Otherwise, the bounding box is generated according to alpha. Defaults to None.
correct_lam (bool) – Whether to apply lambda correction when the CutMix bounding box is clipped by the image borders. Defaults to True.
Note
If cutmix_minmax is None, how is the bounding box of the patches generated according to alpha?
First, generate a \(\lambda\); details can be found in BatchMixupLayer. Then the ratio of the bounding box's height and width to those of the image is calculated by:
\[\text{ratio} = \sqrt{1-\lambda}\]
so that the box covers a fraction \(1-\lambda\) of the image area before any clipping at the borders.
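The box generation in this note, together with the correct_lam option, can be sketched as follows. This is an assumption-laden illustration rather than the BatchCutMixLayer code: the helper names cutmix_bbox and corrected_lam are hypothetical, and the uniformly sampled box centre follows the convention of the CutMix paper.

```python
import math
import random


def cutmix_bbox(height, width, lam, rng=random):
    """Hypothetical helper: sample a patch box (y1, y2, x1, x2) for a given lam."""
    # Side-length ratio; the unclipped box covers (1 - lam) of the image area.
    ratio = math.sqrt(1.0 - lam)
    cut_h, cut_w = int(height * ratio), int(width * ratio)

    # Uniformly sampled box centre (assumed), then clip to the image borders.
    cy, cx = rng.randrange(height), rng.randrange(width)
    y1 = max(cy - cut_h // 2, 0)
    y2 = min(cy + cut_h // 2, height)
    x1 = max(cx - cut_w // 2, 0)
    x2 = min(cx + cut_w // 2, width)
    return y1, y2, x1, x2


def corrected_lam(height, width, bbox):
    """Recompute lam from the clipped box so labels match the pasted area."""
    y1, y2, x1, x2 = bbox
    return 1.0 - (y2 - y1) * (x2 - x1) / (height * width)
```

When the box is clipped, the pasted area is smaller than \(1-\lambda\) of the image, so the recomputed value is never below the sampled \(\lambda\). This is presumably what correct_lam compensates for.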
ResizeMix
- class mmcls.models.utils.augment.BatchResizeMixLayer(alpha, num_classes, lam_min: float = 0.1, lam_max: float = 0.8, interpolation='bilinear', prob=1.0, cutmix_minmax=None, correct_lam=True, **kwargs)
ResizeMix Random Paste layer for a batch of data.
ResizeMix resizes an image to a small patch and pastes it onto another image. It was proposed in ResizeMix: Mixing Data with Preserved Object Information and True Labels.
- Parameters
alpha (float) – Parameter of the Beta distribution used to generate the mixing ratio. It should be a positive number. More details can be found in BatchMixupLayer.
num_classes (int) – The number of classes.
lam_min (float) – The minimum value of lam. Defaults to 0.1.
lam_max (float) – The maximum value of lam. Defaults to 0.8.
interpolation (str) – The interpolation algorithm used for resizing: 'nearest' | 'linear' | 'bilinear' | 'bicubic' | 'trilinear' | 'area'. Defaults to 'bilinear'.
prob (float) – The probability of applying ResizeMix. It should be in the range [0, 1]. Defaults to 1.0.
cutmix_minmax (List[float], optional) – The min/max area ratio of the patches. If not None, the bounding box of each patch is sampled uniformly within this ratio range, and alpha is ignored. Otherwise, the bounding box is generated according to alpha. Defaults to None.
correct_lam (bool) – Whether to apply lambda correction when the CutMix bounding box is clipped by the image borders. Defaults to True.
**kwargs – Any other parameters accepted by BatchCutMixLayer.
Note
The \(\lambda\) (lam) is the mixing ratio. It is a random variable that follows \(Beta(\alpha, \alpha)\) and is mapped linearly to the range [lam_min, lam_max]:
\[\lambda = Beta(\alpha, \alpha) \cdot (\lambda_{max} - \lambda_{min}) + \lambda_{min}\]
The resize ratio of the source images is then calculated from \(\lambda\):
\[\text{ratio} = \sqrt{1-\lambda}\]
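As a quick numerical check of this note: a linear map that sends a Beta sample in [0, 1] onto [lam_min, lam_max] multiplies the sample by the width of the range and adds lam_min. The sketch below illustrates that mapping and the resulting resize ratio. The function names resizemix_lam and resize_ratio are hypothetical, and this is not the BatchResizeMixLayer source.

```python
import math
import random


def resizemix_lam(alpha, lam_min=0.1, lam_max=0.8, rng=random):
    """Hypothetical helper: draw lam and rescale it into [lam_min, lam_max]."""
    beta_sample = rng.betavariate(alpha, alpha)  # lies in [0, 1]
    return beta_sample * (lam_max - lam_min) + lam_min


def resize_ratio(lam):
    """Side-length ratio of the resized source patch for a given lam."""
    return math.sqrt(1.0 - lam)
```

With the default bounds, the ratio ranges from \(\sqrt{0.9} \approx 0.95\) at lam_min = 0.1 down to \(\sqrt{0.2} \approx 0.45\) at lam_max = 0.8, so the pasted patch never vanishes and never completely covers the target image.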