Rico's Nerd Cluster

"Before we leave this world, everything is a process."

[ML] Data Formats

Pickle, pth, onnx

Pickle A Python serialization format for saving objects to disk and loading them back later. Common use cases include models, dictionaries, lists, pandas DataFrames, preprocessing objects, and int...
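
As a quick illustration of the save-and-load round trip described above, here is a minimal sketch using only the standard library. The helper names `save_object` and `load_object` and the sample `config` dictionary are made up for this example and are not taken from the post.

```python
import pickle
import tempfile
from pathlib import Path


def save_object(obj, path):
    """Serialize obj to disk (hypothetical helper name)."""
    with open(path, "wb") as f:  # pickle is a binary format, so open in "wb"
        pickle.dump(obj, f)


def load_object(path):
    """Load a previously pickled object back into memory."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Sample data, invented for illustration.
config = {"labels": ["car", "tree"], "scale": 0.5}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.pkl"
    save_object(config, path)
    restored = load_object(path)  # an equal but separate copy of config
```

Unpickling can execute arbitrary code, so `pickle.load` should only be called on files from a trusted source.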

[CUDA - 6] CUDA Functions

atomicAdd, pragma unroll

atomicAdd There’s no such thing as “atomic memory” in CUDA—atomicity is a property of an operation, not a memory type. atomicAdd is an instruction you apply to a memory location (typically in glob...

[CUDA - 5] SIMT in CUDA

SIMD, SIMT

SIMT and SIMD CUDA is a very good embodiment of SIMD (Single Instruction, Multiple Data). SIMD is great for addressing embarrassingly parallel problems, problems that are so “embarrassingly” simple, ...

[CUDA - 4] My First CUDA Kernel - Chamfer Distance

First CUDA Program, JIT Compile

On Chamfer Distance Chamfer distance measures how close two point clouds are by averaging nearest-neighbor distances in both directions. Given two point sets $P_1$ and $P_2$, for each point ...
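
To make the "average nearest-neighbor distance in both directions" idea concrete, here is a brute-force CPU sketch in plain Python. It is not the CUDA kernel from the post. It assumes unsquared Euclidean distances and that the two directional averages are summed; some definitions square the distances or halve the sum instead. The function name `chamfer_distance` and the sample points are illustrative.

```python
import math


def chamfer_distance(p1, p2):
    """Brute-force Chamfer distance between two non-empty point lists.

    Assumption: unsquared Euclidean distance, with the two directional
    means added together.
    """

    def one_way(src, dst):
        # For every point in src, find its nearest neighbor in dst,
        # then average those nearest-neighbor distances.
        return sum(min(math.dist(a, b) for b in dst) for a in src) / len(src)

    return one_way(p1, p2) + one_way(p2, p1)


# Two tiny 3D point clouds, invented for illustration.
p1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
p2 = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
distance = chamfer_distance(p1, p2)
```

Each direction costs O(|P1| * |P2|) distance evaluations, and every source point's search is independent of the others, which is what makes the computation a natural fit for a GPU kernel.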

[CUDA - 3] - Mixed Precision Training

AT_DISPATCH_FLOATING_TYPES_AND_HALF, gpuAtomicAdd

Run Time Type Dispatch AT_DISPATCH_FLOATING_TYPES_AND_HALF(dtype, name, lambda) expands into something like: switch (points_tensor.scalar_type()) { case...

[CUDA - 2] Introduction to CUDA Coding

CUDA Programming Hierarchy

What is CUDA CUDA is fundamentally the C++ language with extensions. Kernels, __global__, __device__, etc. are defined in the C++ space. If you want to expose C to CUDA, just follow the standard exte...

[CUDA - 2] CUDA Introduction

First CUDA Program, SIMD, SIMT

What is CUDA CUDA is fundamentally the C++ language with extensions. Kernels, __global__, __device__, etc. are defined in the C++ space. If you want to expose C to CUDA, just follow the standard ext...

[CUDA - 1] GPU Architecture

GPU Architecture, Tensor Cores, Pinned Memory

A great introduction video can be found here. GPU (GA102) Architecture A graphics card’s brain is its GPU. NVidia’s Ampere GPU architecture family has GA102 and GA104. GA102 is shared across NVidi...

[Point Cloud Compression] - Sampling

Furthest Point Sampling Idea: given a set of points P = {P1, P2, ...}, select S = {s1, s2, ...} such that S is maximally spread out. How? Add an initial point in P to S. Then iterativ...
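
The excerpt above is cut off, so the following is a sketch of the commonly used greedy form of furthest point sampling, not necessarily the exact procedure in the post: start from one point, then repeatedly add the point whose distance to the already-selected set is largest. The function name, the `start` parameter, and the sample points are assumptions made for illustration.

```python
import math


def farthest_point_sampling(points, k, start=0):
    """Greedy furthest point sampling; returns indices into points.

    Assumptions: the first sample is points[start], distances are
    Euclidean, and 1 <= k <= len(points).
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")

    selected = [start]
    # min_dist[i] = distance from points[i] to its nearest selected point.
    min_dist = [math.dist(p, points[start]) for p in points]

    while len(selected) < k:
        # Pick the point that is farthest from everything selected so far.
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        # Only the newly added point can shrink a nearest-selected distance.
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))

    return selected


# Four collinear 2D points, invented for illustration.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (5.0, 0.0)]
sample_indices = farthest_point_sampling(points, 3)
```

Keeping the running `min_dist` array means each iteration costs O(N) instead of recomputing distances to every selected point, for O(N * k) overall.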

[ML] Point Cloud Compression 1 - Basic Concepts

PSNR, Chamfer Distance

[Part 1] PSNR vs SNR In signal processing, the Signal-to-Noise Ratio (SNR) is defined as: \[\mathrm{SNR} = 10 \log_{10} \left( \frac{\mathrm{Var}(\text{signal})}{\mathrm{Var}(\text{noise})} \righ...
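
As a small numeric check of the variance-ratio SNR definition above, here is a standard-library sketch. It assumes population variance (`statistics.pvariance`), and the function name `snr_db` and the sample values are made up for this example.

```python
import math
from statistics import pvariance


def snr_db(signal, noise):
    """SNR in decibels: 10 * log10(Var(signal) / Var(noise)).

    Assumption: population variance is used for both sequences.
    """
    return 10.0 * math.log10(pvariance(signal) / pvariance(noise))


# Sample values, invented for illustration.
signal = [0.0, 10.0, 0.0, 10.0]  # population variance 25
noise = [-0.5, 0.5, -0.5, 0.5]   # population variance 0.25
value = snr_db(signal, noise)    # variance ratio of 100
```

A variance ratio of 100 corresponds to 20 dB, and each further factor of 10 in the ratio adds another 10 dB.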