PyTorch’s Autograd Profiler reports how much CPU and GPU time each operation in a model consumes.
import torch.autograd.profiler as profiler

with profiler.profile(use_cuda=True) as prof:
    # Your inference code here
    outputs = model(dummy_input)

print(prof.key_averages().table(sort_by="cuda_time_total"))
use_cuda=True enables CUDA event tracing so GPU kernel times are recorded alongside CPU times
prof.key_averages() aggregates the recorded events per operator, and .table(sort_by="cuda_time_total") prints them sorted by total GPU time
---------------------------------  ---------------  ---------------  -----------
Name                               CPU total %      CUDA total %     # of Calls
---------------------------------  ---------------  ---------------  -----------
aten::mm                           30.00%           45.00%           10
aten::relu                         10.00%           15.00%           10
aten::addmm                        5.00%            8.00%            10
...
---------------------------------  ---------------  ---------------  -----------
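Putting the pieces above together, here is a minimal self-contained sketch. The model and input sizes are illustrative (not from the original), and it falls back to CPU-only profiling when no GPU is present, since use_cuda=True requires CUDA.

```python
import torch
import torch.nn as nn
import torch.autograd.profiler as profiler

# Illustrative stand-ins for the article's `model` and `dummy_input`.
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10))
model.eval()
dummy_input = torch.randn(8, 64)

use_cuda = torch.cuda.is_available()
if use_cuda:
    model, dummy_input = model.cuda(), dummy_input.cuda()

with profiler.profile(use_cuda=use_cuda) as prof:
    with torch.no_grad():          # inference only, no autograd graph needed
        outputs = model(dummy_input)

# Sort by GPU time when available, otherwise by CPU time.
sort_key = "cuda_time_total" if use_cuda else "cpu_time_total"
print(prof.key_averages().table(sort_by=sort_key, row_limit=5))
```

Running this prints a table like the one above, with the matrix-multiply and ReLU kernels typically dominating.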