Training
- Label a small but diverse calipers dataset.
- Fine-tune RF-DETR Large.
- Run it on new/unlabeled videos.
- Find where it fails.
- Add those failed frames back into the training set.
- Retrain.
- False negative: calipers are visible, but RF-DETR does not detect them. Add the frame to the training set with a correct calipers box.
- False positive: no calipers are present, but the model detects something as calipers. Add the frame as a negative example: an image with zero boxes.
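The failure triage above can be sketched in a few lines. This is a minimal illustration, not RF-DETR's own tooling: the function names, the `(x1, y1, x2, y2)` box format, and the 0.5 IoU threshold are all assumptions, and the matching is a simple "any overlap above threshold" test rather than the one-to-one matching used in COCO-style evaluation.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def triage_frame(gt_boxes, pred_boxes, iou_thr=0.5):
    """Return the set of failure types seen in one reviewed frame.

    gt_boxes: human-verified boxes; an empty list means no calipers in the frame.
    pred_boxes: model boxes that passed the confidence threshold.
    """
    failures = set()
    # False negative: a labeled caliper that no prediction overlaps enough.
    for gt in gt_boxes:
        if not any(iou(gt, p) >= iou_thr for p in pred_boxes):
            failures.add("false_negative")
    # False positive: a prediction that matches no labeled caliper.
    for p in pred_boxes:
        if not any(iou(gt, p) >= iou_thr for gt in gt_boxes):
            failures.add("false_positive")
    return failures


def to_training_record(frame_id, gt_boxes):
    """Hard example for retraining; a negative frame simply has zero boxes."""
    return {"frame": frame_id, "boxes": list(gt_boxes)}
```

Frames for which `triage_frame` returns a non-empty set are the ones worth adding back; the record always stores the corrected ground-truth boxes, never the model's predictions.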
Tools (5000 images):
- Blender / BlenderProc: Very common for research.
- NVIDIA Isaac Sim / Omniverse
- A simpler OpenGL / pyrender-style pipeline can also work.
- Render the CAD model over random backgrounds: composite the object render onto real background images.
Then you have: real_train/, real_val/, real_test/
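The compositing step (object render pasted over a real background) amounts to per-pixel alpha blending. The sketch below uses plain nested lists so it runs without dependencies; a real pipeline would more likely use NumPy or Pillow. The function name `composite`, the pixel layout, and the box convention are assumptions made for illustration.

```python
def composite(fg_rgba, bg_rgb, x0, y0):
    """Alpha-blend an RGBA render onto an RGB background at offset (x0, y0).

    Images are nested lists indexed [row][col] with 0-255 channel values.
    Returns (image, bbox), where bbox is (x1, y1, x2, y2) with exclusive
    x2/y2, or None if no object pixel landed inside the background.
    """
    out = [list(row) for row in bg_rgb]  # copy so the background can be reused
    h, w = len(bg_rgb), len(bg_rgb[0])
    xs, ys = [], []
    for fy, row in enumerate(fg_rgba):
        for fx, (r, g, b, a) in enumerate(row):
            x, y = x0 + fx, y0 + fy
            if a == 0 or not (0 <= x < w and 0 <= y < h):
                continue  # fully transparent, or pasted outside the frame
            alpha = a / 255.0
            bg_r, bg_g, bg_b = out[y][x]
            out[y][x] = (
                round(alpha * r + (1 - alpha) * bg_r),
                round(alpha * g + (1 - alpha) * bg_g),
                round(alpha * b + (1 - alpha) * bg_b),
            )
            xs.append(x)
            ys.append(y)
    bbox = (min(xs), min(ys), max(xs) + 1, max(ys) + 1) if xs else None
    return out, bbox
```

Because the box is computed from the pixels that actually landed in the frame, pasting the render partly outside the background yields a cropped view with a correspondingly cropped box.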
Data Augmentation
- Randomize:
- matte gray
- black plastic
- metallic gray
- slightly rough surface
- slightly glossy surface
- different brightness
- different background color
- Fix:
- camera distance
- 50-60%: full object visible, centered-ish
- 20-30%: object smaller, with table/background
- 10-20%: partial crop / occlusion / close-up
- object occupies 20-70% of image width
- object scale in image
- full vs cropped views
- background/table
- bbox correctness
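One way to turn the randomization list and the view mix into per-image render parameters is a small sampler. This is a sketch under assumptions: the bucket names, the brightness range, and the weights (midpoints of the percentage ranges in the notes) are illustrative choices, and the 20-70% width range is applied to every view here because the notes do not say which bucket it belongs to.

```python
import random

MATERIALS = ["matte gray", "black plastic", "metallic gray"]
FINISHES = ["slightly rough", "slightly glossy"]

# Midpoints of the 50-60% / 20-30% / 10-20% ranges; random.choices
# normalizes the weights, so they do not need to sum to 100.
VIEW_BUCKETS = [("full", 55), ("small_in_context", 25), ("partial", 15)]


def sample_scene(rng):
    """Draw one set of render parameters from the ranges above."""
    views, weights = zip(*VIEW_BUCKETS)
    return {
        "material": rng.choice(MATERIALS),
        "finish": rng.choice(FINISHES),
        "brightness": rng.uniform(0.5, 1.5),  # hypothetical range
        "background_rgb": tuple(rng.randint(0, 255) for _ in range(3)),
        "view": rng.choices(views, weights=weights, k=1)[0],
        "width_fraction": rng.uniform(0.20, 0.70),  # object width / image width
    }


rng = random.Random(0)  # fixed seed so a dataset can be regenerated
scenes = [sample_scene(rng) for _ in range(1000)]
```

Passing an explicit `random.Random` instance instead of using the module-level functions keeps the sampling reproducible and independent of any other code that touches the global random state.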
Get segmentation:
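# Assumes Omniverse Replicator's BasicWriter (Isaac Sim); OUTPUT_DIR is defined elsewhere.
import omni.replicator.core as rep

writer = rep.WriterRegistry.get("BasicWriter")
writer.initialize(
    output_dir=OUTPUT_DIR,
    rgb=True,                      # color frames
    bounding_box_2d_tight=True,    # tight 2D boxes around visible pixels
    semantic_segmentation=True,    # per-pixel class masks
)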
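Having both tight boxes and segmentation makes it possible to check bbox correctness: derive a box from the mask and compare it with the exported one. The sketch below assumes the mask has already been loaded as a nested list of integer class ids; how the writer stores masks and boxes on disk is not covered here. The function names and the one-pixel tolerance (to absorb inclusive vs. exclusive max conventions) are assumptions.

```python
def mask_to_bbox(mask, class_id):
    """Tight box (x_min, y_min, x_max, y_max), inclusive pixel indices.

    mask is indexed [row][col]; returns None if class_id is absent.
    """
    xs, ys = [], []
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value == class_id:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


def bbox_matches(mask, class_id, exported_bbox, tol=1):
    """True if the exported box agrees with the mask-derived box within tol pixels."""
    derived = mask_to_bbox(mask, class_id)
    if derived is None or exported_bbox is None:
        # Agreement only if both say the object is absent.
        return derived is None and exported_bbox is None
    return all(abs(d - e) <= tol for d, e in zip(derived, exported_bbox))
```

Frames where `bbox_matches` returns False are worth inspecting by eye before they go into the training set.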