Wandb
wandb is a visualization tool that records various deep learning experiment data. It does not disclose what databases it uses internally, but it is likely a combination of cloud-based, scalable databases: relational (PostgreSQL) and non-relational (MongoDB, DynamoDB). Specifically, it keeps track of:
- Metrics: loss, accuracy, precision, recall, etc.
- Model checkpoints: snapshots of model parameters during training, for later retrieval and comparison
- Gradients and weights: changes in model gradients and weights during training
- Media: images, audio, and other media types
How to get started? Their page has a good introduction.
One nice feature of wandb is that once you've set up your account and logged in on the training machine, you get a link to your project and can watch the run almost live (you need to refresh the page, though).
My boilerplate is:
```python
import logging

import wandb

# Config values (NUM_EPOCHS, BATCH_SIZE, ...) and objects (model, optimizer,
# train_dataset, train_loader, device, timer) are defined elsewhere in the script.
wandb_logger = wandb.init(
    project="Rico-mobilenetv2", resume="allow", anonymous="must"
)
wandb_logger.config.update(
    dict(
        epochs=NUM_EPOCHS,
        batch_size=BATCH_SIZE * ACCUMULATION_STEPS,
        learning_rate=LEARNING_RATE,
        weight_decay=WEIGHT_DECAY,
        training_size=len(train_dataset),
        amp=USE_AMP,
        optimizer=str(optimizer),
    )
)

logging.info(
    f"""🚀 Starting training 🚀:
    Epochs: {NUM_EPOCHS}
    Batch size: {BATCH_SIZE}
    Learning rate: {LEARNING_RATE}
    Weight decay: {WEIGHT_DECAY}
    Training size: {len(train_dataset)}
    Device: {device.type}
    Mixed Precision: {USE_AMP}
    Optimizer: {str(optimizer)}
    """
)

# Log gradient and weight histograms every 100 batches
wandb.watch(model, log_freq=100)

model.train()
for batch_idx, (data, target) in enumerate(train_loader):
    ...  # forward/backward pass, optimizer step, etc.
    wandb_logger.log(
        {
            "epoch loss": epoch_loss,
            "epoch": epoch,
            "learning rate": current_lr,
            "total_weight_norm": total_weight_norm,
            "elapsed_time": timer.lapse_time(),
        }
    )

# Images can be logged as wandb.Image objects
images_t = ...  # generate or load images as PyTorch Tensors
wandb.log({"examples": [wandb.Image(im) for im in images_t]})

wandb.finish()
```
- `wandb.watch(model, log_freq=100)` logs gradients and weights every 100 batches, when `log()` is called. For more, see here.
- Wandb can log images as well. For more, see here.
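To make the `watch()`/`log()` interplay concrete, here is a minimal, self-contained sketch; the tiny linear model, the random data, and the project name are mine, not part of the boilerplate above:

```python
import torch
import torch.nn as nn
import wandb

run = wandb.init(project="watch-demo", anonymous="must")  # hypothetical project name

model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

# Hook wandb into the model: gradient and weight histograms are captured
# on every 100th call to log()
wandb.watch(model, log_freq=100)

for step in range(500):
    x, y = torch.randn(32, 10), torch.randn(32, 1)
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    run.log({"loss": loss.item()})

run.finish()
```

Because `log_freq=100`, the gradient/weight histograms show up roughly every 100 logged steps on the run's page.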
tqdm
tqdm creates a progress bar for iterables. Here is an example:
```python
from tqdm import tqdm

with tqdm(total=image_num, desc=f'Epoch {epoch}/{epochs}', unit='img') as pbar:
    pbar.update(inputs.size(0))  # Increment progress bar by number of images in the batch
    ...
    pbar.set_postfix(**{'loss (batch)': loss.item()})  # Show the per-batch loss next to the bar
```
The unit is 'img', so the rate is shown as 'img/s' in the progress bar. You should see something like:
```
Epoch 1/10: |███████████-------| 600/1000 [00:30<00:15, 25.00img/s, loss (batch)=0.542]
```
Alternatively, we can use `tqdm(iterable)`, which itself returns an iterable, so we do not need to update the bar manually:
```python
import numpy as np
from tqdm import tqdm

# Normalize every embedding to unit length; wrapping .items() in tqdm
# shows progress over the vocabulary
word_to_vec_map_unit_vectors = {
    word: embedding / np.linalg.norm(embedding)
    for word, embedding in tqdm(word_to_vec_map.items())
}
```
- Binary bytes:
  - KiB: kibibyte = 1024 bytes
  - MiB: mebibyte = 1024 KiB
  - GiB: gibibyte = 1024 MiB
  - TiB: tebibyte = 1024 GiB
  - PiB: pebibyte = 1024 TiB
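A tiny helper (my own, not from any library) makes the 1024-based conversion concrete:

```python
def human_readable_bytes(num_bytes: int) -> str:
    """Convert a byte count into binary (1024-based) units."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    size = float(num_bytes)
    for unit in units:
        if size < 1024 or unit == units[-1]:
            return f"{size:.2f} {unit}"
        size /= 1024

print(human_readable_bytes(3 * 1024**3))  # "3.00 GiB"
```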
FiftyOne
- Running inference on GCP: $0.05/image
- Grounding DINO (object detection with language prompts): 2 im/s
- Segment Anything: 1 im/s
- Post-processing: non-maximum suppression, non-singular suppression
- FiftyOne supports vector DBs (a usage sketch follows this list)
- Data augmentation with night, snow, and rain
- How?
- Data is far more important than models: Faster R-CNN (2015), a toy model, trained on 100M images
- No temporal tracking
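For context, here is a minimal, hedged FiftyOne sketch; the quickstart zoo dataset, the default similarity backend, and the brain key are my illustrative choices, not something from the notes above:

```python
import fiftyone as fo
import fiftyone.zoo as foz
import fiftyone.brain as fob

# Load a small sample dataset (images + detections) from the FiftyOne zoo
dataset = foz.load_zoo_dataset("quickstart")

# Index the images for similarity search (default in-process backend)
fob.compute_similarity(dataset, brain_key="img_sim")

# Query: find the samples most similar to the first one
query_id = dataset.first().id
view = dataset.sort_by_similarity(query_id, k=10, brain_key="img_sim")

# Explore the results in the interactive app
session = fo.launch_app(view)
session.wait()
```

As far as I understand, configuring a vector-DB backend (e.g. Qdrant or Pinecone) for `compute_similarity` stores the embeddings in that external index rather than locally.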