Rico's Nerd Cluster

"Before leaving this world, everything is a process."

[Robotics] From Market-Ready ROVs to Low-Cost AUVs: Building Practical Underwater Autonomy

This is a summary of From Market-Ready ROVs to Low-Cost AUVs. BlueROV2 is a small 6-DoF ROV platform that can operate at depths up to 100 meters, weighs under 12 kg, and includes a tether of up...

DETR


Data Augmentation For Object Detection

TODO YOLOv8 employs a series of data augmentation techniques, including Mosaic, Mixup, random perspective transformation, and HSV augmentation. Additionally, the input images undergo meticulous pr...

Muon SGD

Muon means MomentUm Orthogonalized by Newton–Schulz. In practice, it takes the usual SGD momentum update for a 2D weight matrix, then orthogonalizes that update before applying it. NVIDIA’s Emergin...
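As a rough illustration of the "momentum, then orthogonalize, then apply" idea in this excerpt, here is a minimal dependency-free sketch. It is not the reference Muon implementation: it swaps in the plain cubic Newton-Schulz iteration (X ← 1.5·X − 0.5·X·Xᵀ·X), whereas published implementations tune the iteration differently. The function names, `lr`, `beta`, and `steps` are all hypothetical choices made for this example.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def orthogonalize(update, steps=10):
    """Approximate the nearest (semi-)orthogonal matrix to `update`.

    Uses the cubic Newton-Schulz iteration, an assumption made for
    simplicity rather than the tuned variant used in practice.
    """
    # Divide by the Frobenius norm so every singular value is at most 1,
    # which keeps the iteration inside its convergence region.
    norm = sum(v * v for row in update for v in row) ** 0.5 + 1e-12
    x = [[v / norm for v in row] for row in update]
    for _ in range(steps):
        # Each step maps a singular value s to 1.5*s - 0.5*s**3,
        # which pushes it toward 1 while leaving singular vectors unchanged.
        xxtx = matmul(matmul(x, transpose(x)), x)
        x = [[1.5 * p - 0.5 * q for p, q in zip(r1, r2)] for r1, r2 in zip(x, xxtx)]
    return x


def muon_like_step(weight, grad, momentum_buf, lr=0.02, beta=0.95):
    """One illustrative update for a 2D weight matrix (hypothetical helper)."""
    # Usual SGD momentum accumulation.
    momentum_buf = [[beta * m + g for m, g in zip(mr, gr)]
                    for mr, gr in zip(momentum_buf, grad)]
    # Orthogonalize the momentum update before applying it.
    direction = orthogonalize(momentum_buf)
    weight = [[w - lr * d for w, d in zip(wr, dr)]
              for wr, dr in zip(weight, direction)]
    return weight, momentum_buf
```

For a symmetric positive-definite input such as `[[3.0, 1.0], [1.0, 2.0]]`, the orthogonal factor is the identity, so `orthogonalize` should return something very close to it. That makes a convenient sanity check for the iteration.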

ProgLoss

ProgLoss is a training-time loss schedule for object detection. Instead of keeping classification and localization weights fixed for all epochs, it changes them over training progress: \[L_{\opera...

Small Target Label Assignment (STAL)

STAL = Small-Target-Aware Label Assignment. It is a training-time matching rule for object detectors: it decides which predicted boxes / anchors should be treated as positive samples for each...

Target Label Assignment (TAL)

TOOD

TAL usually means Task-Aligned Label Assignment or Task Alignment Learning in object detection. It is used during training, not inference. Its job is to decide: \[\text{Which anchors / predi...

Neural Processing Unit

An NPU is a Neural Processing Unit. It is a specialized chip designed to run neural network inference efficiently, especially on edge devices like phones, drones, cameras, robots, and embedded syst...

[BEV] DETR3D

DETR3D (CoRL 2021): 6 camera images + calibration → shared 2D backbone (ResNet / VoVNet) → FPN multi-scale 2D image feature...

[BEV] Feature Pyramid Network

FPN = Feature Pyramid Network