Pre-processing
Shuffle Data
1
2
3
4
5
6
7
8
9
shuffled_main_dataset = torch.utils.data.Subset(
main_dataset,
torch.randperm(dataset_size)
)
# train_dataset is a Subset object
# main_dataset becomes train_dataset.dataset
class_num = len(shuffled_main_dataset.dataset.classes)
Albumentations
Albumentations
is a library for pixel-wise image augmentations. It was developed by a few Kaggle experts, masters, and grandmasters.
For the full list of augmentations, please see here
pip install albumentations
1
2
3
4
5
6
7
8
9
10
11
12
13
14
import albumentations as A
# Define the augmentation pipeline, and add mask as an input arg during object initiation
transform = A.Compose([
A.HorizontalFlip(p=0.5),
A.RandomBrightnessContrast(p=0.2),
A.ElasticTransform(p=0.2),
# Add more augmentations as needed
], additional_targets={'mask': 'mask'})
augmented = transform(image=image, mask=mask)
augmented_image = augmented['image']
augmented_mask = augmented['mask']
- Standard practice in PyTorch is to augment in
Dataset.__getitem()__
.