Rico's Nerd Cluster

"Before we leave this world, everything is a process."

[ML] FoundationPose (CVPR 2024)

Introduction Code, paper To reduce manual effort for large-scale training, FoundationPose introduces a synthetic data generation pipeline built on 3D model databases (G...

[Robotics] From Market-Ready ROVs to Low-Cost AUVs: Building Practical Underwater Autonomy

This is a summary of From Market-Ready ROVs to Low-Cost AUVs. BlueROV2 is a small 6-DoF ROV platform that can operate at depths up to 100 meters, weighs under 12 kg, and includes a tether of up...

DETR

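Breakthroughs of RF-DETR (2026) This is not a minor iteration; it is a qualitative leap. 60 mAP means it recognizes objects more accurately, with fewer missed detections and a lower false-positive rate. A latency of 6.8 ms means it can run in real time on an ordinary GPU. The industry waited a full five years for both metrics to be met at once. You may wonder: why did this breakthrough come from the DETR family rather than the YOLO series? Because the architectural routes differ. YOLO takes the CNN route and excels at speed...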

DETR

Introduction Here is the general architecture of DETR, which is quite straightforward: first, a CNN extracts features (256-vector); second, a transformer learns bounding boxes usin...

Data Augmentation For Object Detection

TODO YOLOv8 employs a series of data augmentation techniques, including Mosaic, Mixup, random perspective transformation, and HSV augmentation. Additionally, the input images undergo meticulous pr...

FEB-YOLOv8

FEB-YOLOv8 tries to make YOLOv8n small enough for underwater robots, then gives back the lost accuracy using attention and better small-object feature fusion. The three main changes are P-C2f, EMA ...

Muon SGD

Muon means MomentUm Orthogonalized by Newton–Schulz. In practice, it takes the usual SGD momentum update for a 2D weight matrix, then orthogonalizes that update before applying it. NVIDIA’s Emergin...

ProgLoss

ProgLoss is a training-time loss schedule for object detection. Instead of keeping classification and localization weights fixed for all epochs, it changes them over training progress: \[L_{\opera...

Small Target Label Assignment (STAL)

STAL STAL = Small-Target-Aware Label Assignment. It is a training-time matching rule for object detectors: it decides which predicted boxes / anchors should be treated as positive samples for each...

Target Label Assignment (STAL)

TOOD

TAL TAL usually means Task-Aligned Label Assignment or Task Alignment Learning in object detection. It is used during training, not inference. Its job is to decide: \[\text{Which anchors / predi...