Update tensorrt inference time docs (#19338)

* Update tensorrt inference times

* Update hardware.md
Nicolas Mowen 2025-07-31 07:21:41 -06:00 committed by GitHub
parent 23b32cbacf
commit d18f2282c8

@@ -166,16 +166,12 @@ There are improved capabilities in newer GPU architectures that TensorRT can ben
 Inference speeds will vary greatly depending on the GPU and the model used.
 `tiny` variants are faster than the equivalent non-tiny model, some known examples are below:

-| Name | YOLOv7 Inference Time | YOLO-NAS Inference Time | RF-DETR Inference Time |
+| Name | YOLOv9 Inference Time | YOLO-NAS Inference Time | RF-DETR Inference Time |
 | --------------- | --------------------- | ------------------------- | ------------------------- |
-| GTX 1060 6GB | ~ 7 ms | | |
-| GTX 1070 | ~ 6 ms | | |
-| GTX 1660 SUPER | ~ 4 ms | | |
-| RTX 3050 | 5 - 7 ms | 320: ~ 10 ms 640: ~ 16 ms | 336: ~ 16 ms 560: ~ 40 ms |
-| RTX 3070 Mobile | ~ 5 ms | | |
-| RTX 3070 | 4 - 6 ms | 320: ~ 6 ms 640: ~ 12 ms | 336: ~ 14 ms 560: ~ 36 ms |
-| Quadro P400 2GB | 20 - 25 ms | | |
-| Quadro P2000 | ~ 12 ms | | |
+| RTX 3050 | 320: 15 ms | 320: ~ 10 ms 640: ~ 16 ms | 336: ~ 16 ms 560: ~ 40 ms |
+| RTX 3070 | 320: 11 ms | 320: ~ 8 ms 640: ~ 14 ms | 336: ~ 14 ms 560: ~ 36 ms |
+| RTX A4000 | | 320: ~ 15 ms | |
+| Tesla P40 | | 320: ~ 105 ms | |

 ### ROCm - AMD GPU