Rico's Nerd Cluster

"Before leaving this world, everything is a process."

Deep Learning - Spatial Pyramid Pooling

SPP, Ablation Study

SPP (Spatial Pyramid Pooling) Layer In deep learning, "spatial dimensions" refers to height and width. Some networks require a fixed input size because of their fully connected layers. When does that happen? ...
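The idea the excerpt hints at can be sketched without a framework: SPP max-pools a feature map into a few fixed grids (e.g. 1×1 and 2×2) and concatenates the maxima, so the output length depends only on the pyramid levels, not on the input's height and width. A minimal sketch in plain Python (the function name `spp` and the level choices are illustrative, and H and W are assumed to be at least the largest level):

```python
import math

def spp(feature_map, levels=(1, 2)):
    """Max-pool a single-channel H x W feature map (list of lists) over an
    n x n grid for each pyramid level n, and concatenate the bin maxima.
    The result has sum(n*n for n in levels) entries regardless of H, W."""
    h, w = len(feature_map), len(feature_map[0])
    out = []
    for n in levels:
        bin_h, bin_w = math.ceil(h / n), math.ceil(w / n)
        for i in range(n):
            for j in range(n):
                rows = range(i * bin_h, min((i + 1) * bin_h, h))
                cols = range(j * bin_w, min((j + 1) * bin_w, w))
                out.append(max(feature_map[r][c] for r in rows for c in cols))
    return out
```

A 4×4 map and a 2×6 map both yield a 5-element vector with levels (1, 2), which is exactly what lets a fully connected layer accept variable-size inputs.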

Deep Learning - PyTorch Profiling

PyTorch Profiler

Sample Code #!/usr/bin/env python3 import torch import torchvision.models as models from torch.profiler import profile...

Deep Learning - Tools

wandb, tqdm

Wandb wandb is a visualization tool that records various deep learning experiment data. It does not disclose which databases it uses, but it might be a combination of cloud-based scalable databases s...

Deep Learning - Model Deployment

Model Deployment

Deployment On Edge Devices Inference Using C++ Embedded systems need streaming; ZeroMQ might be better than WebSocket. Need an article A serializable model mean...

Deep Learning - Knowledge Distillation

Knowledge Distillation

Introduction To Knowledge Distillation The goal of Knowledge Distillation is to train a small "student" network to mimic the output of a large "teacher" network. This is also known as "model compression" ...

Deep Learning - Inferencing

Autograd Profiler

Autograd Profiler PyTorch’s Autograd Profiler provides information on the resources (CPU and GPU) used by each operation in a model. import torch.autograd.profiler as profiler with profi...

Deep Learning - Mixed Floating Point Training

FP16, BF16, Mixed Precision Training

Refresher: Floating Point Calculation A floating point number is represented as sign bit | exponent | mantissa. The pattern 0 | 10000001 | 10000000000000000000000 represents 6 because: the sign bit 0 represents posi...
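The bit-pattern example above can be checked directly with Python's struct module: pack sign, exponent, and mantissa into a 32-bit word and reinterpret it as an IEEE-754 single-precision float.

```python
import struct

sign = 0b0                             # positive
exponent = 0b10000001                  # 129; unbiased: 129 - 127 = 2
mantissa = 0b10000000000000000000000   # with the implicit leading 1: 1.5

# Assemble sign | exponent | mantissa into one 32-bit word, then
# reinterpret the raw bytes as a big-endian float32.
bits = (sign << 31) | (exponent << 23) | mantissa
value = struct.unpack('>f', bits.to_bytes(4, 'big'))[0]
print(value)  # 1.5 * 2**2 = 6.0
```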

Deep Learning - Speedup Tricks

Torch Optimizer Tricks, Mixed Precision Training

General Speed-Up Tricks If you use albumentations for augmentation, sticking to the [batch, H, W, Channels] (channels-last) layout could make data loading faster tensor.contiguou...

Deep Learning - Common Oopsies

Underflow, Weight Manipulation

Underflow torch.softmax(X) can return zeros when entries of X are very negative, due to underflow. Sizing Be careful with the last batch if you want to initialize any tensor that’s specific to each batch’s sizes, because it could b...
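The underflow failure mode can be reproduced without torch: exp() of a large negative number rounds to 0.0, so a naive softmax divides by zero, while the standard max-subtraction trick keeps the largest exponent at 0. This is a framework-free sketch; the function names are illustrative:

```python
import math

def naive_softmax(xs):
    # exp() of very negative inputs underflows to 0.0, so sum(exps)
    # can be exactly 0 and the division blows up.
    exps = [math.exp(x) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def stable_softmax(xs):
    # Shift so the largest exponent is 0; this never underflows to an
    # all-zero sum because exp(0) == 1 is always in the denominator.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

xs = [-800.0, -801.0]
# naive_softmax(xs) raises ZeroDivisionError: exp(-800) underflows to 0.0
print(stable_softmax(xs))
```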

Deep Learning - Strategies Part 2: Training And Tuning

Bias And Variance, And Things To Try For Performance Improvement From My Experience

Orthogonalization Orthogonalization in ML means designing a machine learning system such that different aspects of the model can be adjusted independently. This is like "orthogonal vectors", so th...