More inference speed updates (#19974)

Nicolas Mowen 2025-09-08 10:39:33 -06:00 committed by GitHub
parent 880902cdd7
commit f46f8a2160
GPG Key ID: B5690EEEBB952194

@@ -132,19 +132,19 @@ More information is available [in the detector docs](/configuration/object_detec
 Inference speeds vary greatly depending on the CPU or GPU used, some known examples of GPU inference times are below:

 | Name           | MobileNetV2 Inference Time | YOLOv9                                            | YOLO-NAS Inference Time   | RF-DETR Inference Time | Notes                              |
-| -------------- | -------------------------- | --------------------------------------- | ------------------------- | ---------------------- | ---------------------------------- |
+| -------------- | -------------------------- | ------------------------------------------------- | ------------------------- | ---------------------- | ---------------------------------- |
 | Intel HD 530   | 15 - 35 ms                 |                                                   |                           |                        | Can only run one detector instance |
 | Intel HD 620   | 15 - 25 ms                 |                                                   | 320: ~ 35 ms              |                        |                                    |
 | Intel HD 630   | ~ 15 ms                    |                                                   | 320: ~ 30 ms              |                        |                                    |
 | Intel UHD 730  | ~ 10 ms                    |                                                   | 320: ~ 19 ms 640: ~ 54 ms |                        |                                    |
 | Intel UHD 770  | ~ 15 ms                    | t-320: 24 ms s-320: 30 ms s-640: 45 ms            | 320: ~ 20 ms 640: ~ 46 ms |                        |                                    |
 | Intel N100     | ~ 15 ms                    | s-320: 30 ms                                      | 320: ~ 25 ms              |                        | Can only run one detector instance |
 | Intel N150     | ~ 15 ms                    | t-320: 16ms s-320: 24 ms                          |                           |                        |                                    |
 | Intel Iris XE  | ~ 10 ms                    | s-320: 12 ms s-640: 30 ms                         | 320: ~ 18 ms 640: ~ 50 ms |                        |                                    |
-| Intel Arc A310 |                            | s-320: 9 ms                                       |                           |                        |                                    |
+| Intel Arc A310 |                            | t-320: 7 ms t-640: 11 ms s-320: 8 ms s-640: 15 ms |                           |                        |                                    |
 | Intel Arc A380 | ~ 6 ms                     |                                                   | 320: ~ 10 ms 640: ~ 22 ms | 336: 20 ms 448: 27 ms  |                                    |
 | Intel Arc A750 | ~ 4 ms                     |                                                   | 320: ~ 8 ms               |                        |                                    |

 ### TensorRT - Nvidia GPU
@@ -169,13 +169,13 @@ There are improved capabilities in newer GPU architectures that TensorRT can ben
 Inference speeds will vary greatly depending on the GPU and the model used.
 `tiny` variants are faster than the equivalent non-tiny model, some known examples are below:

 | Name            | YOLOv9 Inference Time     | YOLO-NAS Inference Time   | RF-DETR Inference Time |
-| --------------- | --------------------- | ------------------------- | ---------------------- |
+| --------------- | ------------------------- | ------------------------- | ---------------------- |
 | GTX 1070        | s-320: 16 ms              | 320: 14 ms                |                        |
-| RTX 3050        | t-320: 15 ms              | 320: ~ 10 ms 640: ~ 16 ms | Nano-320: ~ 12 ms      |
+| RTX 3050        | t-320: 15 ms s-320: 17 ms | 320: ~ 10 ms 640: ~ 16 ms | Nano-320: ~ 12 ms      |
-| RTX 3070        | t-320: 11 ms              | 320: ~ 8 ms 640: ~ 14 ms  | Nano-320: ~ 9 ms       |
+| RTX 3070        | t-320: 11 ms s-320: 13 ms | 320: ~ 8 ms 640: ~ 14 ms  | Nano-320: ~ 9 ms       |
 | RTX A4000       |                           | 320: ~ 15 ms              |                        |
 | Tesla P40       |                           | 320: ~ 105 ms             |                        |

 ### ROCm - AMD GPU