Update tensorrt inference time docs (#19338)

* Update tensorrt inference times

* Update hardware.md
Nicolas Mowen 2025-07-31 07:21:41 -06:00 committed by GitHub
parent 23b32cbacf
commit d18f2282c8


@@ -166,16 +166,12 @@ There are improved capabilities in newer GPU architectures that TensorRT can ben
 Inference speeds will vary greatly depending on the GPU and the model used.
 `tiny` variants are faster than the equivalent non-tiny model, some known examples are below:
-| Name            | YOLOv7 Inference Time | YOLO-NAS Inference Time   | RF-DETR Inference Time    |
+| Name            | YOLOv9 Inference Time | YOLO-NAS Inference Time   | RF-DETR Inference Time    |
 | --------------- | --------------------- | ------------------------- | ------------------------- |
-| GTX 1060 6GB    | ~ 7 ms                |                           |                           |
-| GTX 1070        | ~ 6 ms                |                           |                           |
-| GTX 1660 SUPER  | ~ 4 ms                |                           |                           |
-| RTX 3050        | 5 - 7 ms              | 320: ~ 10 ms 640: ~ 16 ms | 336: ~ 16 ms 560: ~ 40 ms |
-| RTX 3070 Mobile | ~ 5 ms                |                           |                           |
-| RTX 3070        | 4 - 6 ms              | 320: ~ 6 ms 640: ~ 12 ms  | 336: ~ 14 ms 560: ~ 36 ms |
-| Quadro P400 2GB | 20 - 25 ms            |                           |                           |
-| Quadro P2000    | ~ 12 ms               |                           |                           |
+| RTX 3050        | 320: 15 ms            | 320: ~ 10 ms 640: ~ 16 ms | 336: ~ 16 ms 560: ~ 40 ms |
+| RTX 3070        | 320: 11 ms            | 320: ~ 8 ms 640: ~ 14 ms  | 336: ~ 14 ms 560: ~ 36 ms |
 | RTX A4000       |                       | 320: ~ 15 ms              |                           |
 | Tesla P40       |                       | 320: ~ 105 ms             |                           |
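The figures in the table are per-inference averages. As a rough idea of how such numbers can be collected, here is a minimal timing sketch; the `infer` callable is a hypothetical stand-in for a real detector invocation, not Frigate's actual benchmarking code:

```python
import time

def average_inference_ms(infer, warmup=10, iters=100):
    """Return the mean wall-clock time of infer() in milliseconds.

    A few warmup calls run first so one-time setup costs (engine
    loading, CUDA context creation) do not skew the average.
    """
    for _ in range(warmup):
        infer()
    start = time.perf_counter()
    for _ in range(iters):
        infer()
    return (time.perf_counter() - start) / iters * 1000.0

if __name__ == "__main__":
    # Dummy workload standing in for a model forward pass.
    ms = average_inference_ms(lambda: sum(range(10000)))
    print(f"~ {ms:.2f} ms per inference")
```

Averaging over many iterations matters because single-call timings for GPU inference are noisy (scheduling, clock ramp-up), which is why the table rounds to whole milliseconds.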
### ROCm - AMD GPU