Utilities for creating augmentation pipelines mentioned in popular self-supervised learning papers.

class RandomGaussianBlur[source]

RandomGaussianBlur(p=0.5, s=(8, 32), same_on_batch=False, **kwargs) :: RandTransform

Randomly applies Gaussian blur with probability p, with blur strength sampled from s
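The behavior can be sketched in plain PyTorch. This is a simplified illustration, not the library's actual RandTransform implementation; the separable-convolution helper and the sigma heuristic below are assumptions:

```python
import random
import torch
import torch.nn.functional as F

def gaussian_kernel1d(kernel_size: int, sigma: float) -> torch.Tensor:
    # Discrete 1D Gaussian, normalized to sum to 1.
    x = torch.arange(kernel_size, dtype=torch.float32) - (kernel_size - 1) / 2
    k = torch.exp(-x.pow(2) / (2 * sigma ** 2))
    return k / k.sum()

def random_gaussian_blur(xb: torch.Tensor, p: float = 0.5, s=(8, 32)) -> torch.Tensor:
    """Blur a (N,C,H,W) batch with probability p; kernel size drawn from s."""
    if random.random() > p:
        return xb
    ks = random.randint(s[0], s[1]) | 1          # force an odd kernel size
    sigma = 0.3 * ((ks - 1) * 0.5 - 1) + 0.8     # OpenCV-style sigma heuristic
    c = xb.shape[1]
    k = gaussian_kernel1d(ks, sigma)
    kx = k.view(1, 1, 1, ks).repeat(c, 1, 1, 1)  # separable: blur rows...
    ky = k.view(1, 1, ks, 1).repeat(c, 1, 1, 1)  # ...then columns
    pad = ks // 2
    xb = F.conv2d(F.pad(xb, (pad, pad, 0, 0), mode="reflect"), kx, groups=c)
    xb = F.conv2d(F.pad(xb, (0, 0, pad, pad), mode="reflect"), ky, groups=c)
    return xb
```

Applying the blur separably (rows, then columns) keeps the cost linear in kernel size instead of quadratic.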

Why kornia, torchvision or fastai?

These libraries are preferred over others for their batch transform capabilities. Being able to apply these transforms on batches allows us to use GPUs, giving a speed-up of roughly 10-20x depending on the input image size.

Kornia

Kornia supports a same_on_batch argument. If it is set to False, the augmentation is applied to each element of the batch with independently sampled random parameters.
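The semantics can be illustrated with a toy horizontal-flip transform in plain PyTorch (a sketch of the behavior, not kornia's implementation):

```python
import torch

def random_hflip(xb: torch.Tensor, p: float = 0.5, same_on_batch: bool = False) -> torch.Tensor:
    """Flip a (N,C,H,W) batch horizontally with probability p."""
    n = xb.shape[0]
    if same_on_batch:
        # One coin toss shared by every element of the batch.
        mask = (torch.rand(1) < p).expand(n)
    else:
        # An independent coin toss per element.
        mask = torch.rand(n) < p
    out = xb.clone()
    out[mask] = out[mask].flip(-1)
    return out
```

With same_on_batch=True all images in the batch are either flipped together or left alone; with False each image is flipped independently, which usually yields more diverse views for self-supervised training.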

jitter: whether to apply color jitter
bw: whether to convert to grayscale
blur: whether to apply Gaussian blur
resize_scale: (min, max) scales for random resized crop
resize_ratio: (min, max) aspect ratios for random resized crop
s: magnitude scalar for color jitter
blur_s: (min, max) range or a single int for blur strength

Their corresponding probabilities: flip_p, jitter_p, bw_p, blur_p

Kornia augmentation implementations have two additional parameters compared to TorchVision: return_transform and same_on_batch. The former makes it possible to undo a geometric transformation, while the latter controls the randomness of a batched transformation. To enable these behaviors, simply set the flags to True.

Recommendation: even though the defaults work very well on many benchmark datasets, it's always better to try different values and visualize your dataset before going further with training.

get_kornia_batch_augs[source]

get_kornia_batch_augs(size, rotate=True, jitter=True, bw=True, blur=True, resize_scale=(0.2, 1.0), resize_ratio=(0.75, 1.3333333333333333), rotate_deg=30, jitter_s=0.6, blur_s=(4, 32), same_on_batch=False, flip_p=0.5, jitter_p=0.3, bw_p=0.3, blur_p=0.3, stats=([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]), cuda=True, xtra_tfms=[])

Input batch augmentations implemented in kornia

Kornia's RandomResizedCrop looks more zoomed in overall. This might be related to the sampling function used for scale.
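One plausible source of such a difference: torchvision's RandomResizedCrop samples the target crop area uniformly within resize_scale, whereas if an implementation applied the same range to the crop side length instead, the average crop area would be noticeably smaller and crops would look more zoomed in. (Whether kornia actually does this is an assumption; the arithmetic below only checks the hypothesis.)

```python
import torch

torch.manual_seed(42)
n = 100_000
lo, hi = 0.2, 1.0
s = torch.empty(n).uniform_(lo, hi)

# torchvision: the crop *area* fraction is sampled uniformly in `scale`
area_uniform = s.mean()          # expectation (lo + hi) / 2 = 0.6
# hypothetical: the same range applied to the *side length* instead,
# so the resulting area fraction is s**2 — smaller on average
area_if_side = (s ** 2).mean()   # expectation (hi**3 - lo**3) / (3 * (hi - lo)) ≈ 0.413

print(f"mean crop area (area sampled): {area_uniform:.3f}")
print(f"mean crop area (side sampled): {area_if_side:.3f}")
```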

aug, n = get_kornia_batch_augs(336, resize_scale=(0.2,1), stats=imagenet_stats, cuda=False, same_on_batch=False), 5
fig,ax = plt.subplots(n,2,figsize=(8,n*4))
for i in range(n): 
    show_image(t1,ax=ax[i][0])
    show_image(aug.decode(aug(t1)).clamp(0,1)[0], ax=ax[i][1])

GPU batch transforms are roughly 10-20x faster than CPU, depending on image size; larger images benefit more from the GPU.
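The timing pattern used below can be packaged as a small helper (a sketch; the key point is to synchronize before reading the clock when timing CUDA ops, since kernel launches are asynchronous):

```python
import time
import torch

def bench(fn, xb, n_warmup=3, n_iter=10):
    """Average wall-clock seconds per call of fn(xb)."""
    for _ in range(n_warmup):        # warm up: CUDA kernel compilation, caches
        fn(xb)
    if xb.is_cuda:
        torch.cuda.synchronize()     # drain queued kernels before timing
    t0 = time.perf_counter()
    for _ in range(n_iter):
        fn(xb)
    if xb.is_cuda:
        torch.cuda.synchronize()     # wait for queued kernels to finish
    return (time.perf_counter() - t0) / n_iter
```

For example, bench(aug, xb) would time one of the pipelines above on either device.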

xb = (torch.stack([t1]*32))
aug = get_kornia_batch_augs(336, resize_scale=(0.75,1), stats=imagenet_stats, cuda=False)
%%timeit
out = aug(xb)
269 ms ± 48.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
if torch.cuda.is_available():
    xb = xb.to(default_device())
    aug = get_kornia_batch_augs(336, resize_scale=(0.75,1), stats=imagenet_stats)
%%timeit 
if torch.cuda.is_available():
    out = aug(xb) # ignore: GPU warmup
    torch.cuda.synchronize()
31.7 ms ± 1.24 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

same_on_batch=False

%%timeit
if torch.cuda.is_available():
    out = aug(xb)
    torch.cuda.synchronize()
31.2 ms ± 443 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
if torch.cuda.is_available():
    xb = xb.to(default_device())
    aug = get_kornia_batch_augs(336, resize_scale=(0.75,1), same_on_batch=True, stats=imagenet_stats)

same_on_batch=True

%%timeit
if torch.cuda.is_available():
    out = aug(xb)
    torch.cuda.synchronize()
23.1 ms ± 2.25 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Torchvision

Torchvision doesn't have a same_on_batch parameter, and it also doesn't support jitter_p.
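A missing per-transform probability can be emulated by wrapping any transform in a small apply-with-probability helper (a generic sketch; torchvision also ships transforms.RandomApply for the same purpose):

```python
import random

class RandomApply:
    """Apply `tfm` with probability `p`; otherwise return the input unchanged."""
    def __init__(self, tfm, p=0.3):
        self.tfm, self.p = tfm, p

    def __call__(self, x):
        return self.tfm(x) if random.random() < self.p else x

# e.g. an emulated jitter_p for a hypothetical color-jitter transform `jitter_tfm`:
# maybe_jitter = RandomApply(jitter_tfm, p=0.3)
```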

get_torchvision_batch_augs[source]

get_torchvision_batch_augs(size, rotate=True, jitter=True, bw=True, blur=True, resize_scale=(0.2, 1.0), resize_ratio=(0.75, 1.3333333333333333), rotate_deg=30, jitter_s=0.6, blur_s=(4, 32), flip_p=0.5, bw_p=0.3, blur_p=0.3, stats=([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]), cuda=True, xtra_tfms=[])

Input batch augmentations implemented in torchvision

aug, n = get_torchvision_batch_augs(336, resize_scale=(0.2, 1), stats=imagenet_stats, cuda=False), 5
fig,ax = plt.subplots(n,2,figsize=(8,n*4))
for i in range(n): 
    show_image(t1,ax=ax[i][0])
    show_image(aug.decode(aug(t1)).clamp(0,1)[0], ax=ax[i][1])

Torchvision is slightly faster than kornia with same_on_batch=False.

xb = (torch.stack([t1]*32))
aug = get_torchvision_batch_augs(336, resize_scale=(0.75,1), stats=imagenet_stats, cuda=False)
%%timeit
out = aug(xb)
753 ms ± 142 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
if torch.cuda.is_available():
    xb = xb.to(default_device())
    aug = get_torchvision_batch_augs(336, resize_scale=(0.75,1), stats=imagenet_stats)
%%timeit
if torch.cuda.is_available():
    out = aug(xb)
    torch.cuda.synchronize()
20.8 ms ± 216 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Fastai

In fastai, a few of the batch transforms are named differently, which is why it is not the first choice here. There may be implementation differences for better or worse, although in general fastai provides fast and accurate batch transforms through a composition function called setup_aug_tfms.

Here, max_lighting is the color jitter magnitude.

Fastai is as fast as the combination of kornia and torchvision, but note that RandomResizedCropGPU applies the same crop to all batch elements (which is probably fine), similar to torchvision, and that color jittering is implemented as 4 separate transforms.

get_fastai_batch_augs[source]

get_fastai_batch_augs(size, rotate=True, jitter=True, bw=True, blur=True, min_scale=0.2, resize_ratio=(0.75, 1.3333333333333333), max_lighting=0.2, rotate_deg=30, blur_s=(8, 32), same_on_batch=False, flip_p=0.5, jitter_p=0.3, bw_p=0.3, blur_p=0.3, stats=([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]), cuda=True, xtra_tfms=[])

Input batch augmentations implemented in fastai

aug, n = get_fastai_batch_augs(336, min_scale=0.2, stats=imagenet_stats, cuda=False), 5
fig,ax = plt.subplots(n,2,figsize=(8,n*4))
for i in range(n): 
    show_image(t1,ax=ax[i][0])
    show_image(aug.decode(aug(t1)).clamp(0,1)[0], ax=ax[i][1])
xb = (torch.stack([t1]*32))
aug = get_fastai_batch_augs(336, min_scale=0.75, stats=imagenet_stats, cuda=False)
%%timeit
out = aug(xb)
831 ms ± 142 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
if torch.cuda.is_available():
    xb = xb.to(default_device())
    aug = get_fastai_batch_augs(336, min_scale=0.75, stats=imagenet_stats)
%%timeit
if torch.cuda.is_available():
    out = aug(xb)
    torch.cuda.synchronize()
29.7 ms ± 1.03 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Kornia + Torchvision + Fastai

Here we use RandomResizedCrop from torchvision and keep the remaining augmentations the same as kornia's. This gives the best of both worlds: fast and diverse augmentations.

Also, Rotate from fastai is used for reflection padding.
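The mix-and-match idea amounts to composing transforms from different libraries into one callable pipeline. A simplified sketch of the composition (the real get_batch_augs also handles normalization and decode):

```python
class ComposedAugs:
    """Apply a list of batch transforms, from any library, in sequence."""
    def __init__(self, tfms):
        self.tfms = tfms

    def __call__(self, xb):
        for t in self.tfms:
            xb = t(xb)
        return xb

# hypothetical composition mirroring get_batch_augs:
# pipeline = ComposedAugs([tv_random_resized_crop, fastai_rotate, kornia_jitter])
```

This works because each library's batch transform is just a callable on a (N,C,H,W) tensor, so they chain freely.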

get_batch_augs[source]

get_batch_augs(size, rotate=True, jitter=True, bw=True, blur=True, resize_scale=(0.2, 1.0), resize_ratio=(0.75, 1.3333333333333333), rotate_deg=30, jitter_s=0.6, blur_s=(4, 32), same_on_batch=False, flip_p=0.5, rotate_p=0.3, jitter_p=0.3, bw_p=0.3, blur_p=0.3, stats=([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]), cuda=True, xtra_tfms=[])

Input batch augmentations implemented in tv+kornia+fastai

aug, n = get_batch_augs(336, resize_scale=(0.2, 1), stats=imagenet_stats,cuda=False), 5
fig,ax = plt.subplots(n,2,figsize=(8,n*4))
for i in range(n): 
    show_image(t1,ax=ax[i][0])
    show_image(aug.decode(aug(t1)).clamp(0,1)[0], ax=ax[i][1])

Torchvision is slightly faster than kornia with same_on_batch=True.

xb = (torch.stack([t1]*32))
aug = get_batch_augs(336, resize_scale=(0.75,1), stats=imagenet_stats, cuda=False)
%%timeit
out = aug(xb)
268 ms ± 67.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
if torch.cuda.is_available():
    xb = xb.to(default_device())
    aug = get_batch_augs(336, resize_scale=(0.75,1), stats=imagenet_stats)
%%timeit
if torch.cuda.is_available():
    out = aug(xb)
    torch.cuda.synchronize()
19.5 ms ± 1.07 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

TODO: RandAugment

TODO: Albumentations

Adding Extra tfms

You can add any batch transform by simply passing it in a list to xtra_tfms.

aug, n = get_batch_augs(336, resize_scale=(0.2, 1),  stats=imagenet_stats, cuda=False, xtra_tfms=[RandomErasing(p=1.)]), 5
fig,ax = plt.subplots(n,2,figsize=(8,n*4))
for i in range(n): 
    show_image(t1,ax=ax[i][0])
    show_image(aug.decode(aug(t1)).clamp(0,1)[0], ax=ax[i][1])

Utilities

get_multi_aug_pipelines[source]

get_multi_aug_pipelines(n, size, rotate=True, jitter=True, bw=True, blur=True, resize_scale=(0.2, 1.0), resize_ratio=(0.75, 1.3333333333333333), rotate_deg=30, jitter_s=0.6, blur_s=(4, 32), same_on_batch=False, flip_p=0.5, rotate_p=0.3, jitter_p=0.3, bw_p=0.3, blur_p=0.3, stats=([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]), cuda=True, xtra_tfms=[])

assert_aug_pipelines[source]

assert_aug_pipelines(aug_pipelines:List[Pipeline])

augs = get_multi_aug_pipelines(n=2,size=224)
assert_aug_pipelines(augs)