Implement YOLOx for RKNN (#17788)

* Implement yolox rknn inference and post processing
* rework docs
2025-12-06 20:05:16 +01:00 · 2025-04-18 14:44:02 -06:00
commit 1cdc9b6097
parent 68382d89b4
4 changed files with 169 additions and 65 deletions
--- a/docs/docs/configuration/object_detectors.md
+++ b/docs/docs/configuration/object_detectors.md
@@ -815,62 +815,7 @@ This implementation uses the [Rockchip's RKNN-Toolkit2](https://github.com/airoc

 ### Prerequisites

-Make sure to follow the [Rockchip specific installation instrucitions](/frigate/installation#rockchip-platform).
-
-### Configuration
-
-This `config.yml` shows all relevant options to configure the detector and explains them. All values shown are the default values (except for two). Lines that are required at least to use the detector are labeled as required, all other lines are optional.
-
-```yaml
-detectors: # required
-  rknn: # required
-    type: rknn # required
-    # number of NPU cores to use
-    # 0 means choose automatically
-    # increase for better performance if you have a multicore NPU e.g. set to 3 on rk3588
-    num_cores: 0
-
-model: # required
-  # name of model (will be automatically downloaded) or path to your own .rknn model file
-  # possible values are:
-  # - deci-fp16-yolonas_s
-  # - deci-fp16-yolonas_m
-  # - deci-fp16-yolonas_l
-  # - /config/model_cache/your_custom_model.rknn
-  path: deci-fp16-yolonas_s
-  # width and height of detection frames
-  width: 320
-  height: 320
-  # pixel format of detection frame
-  # default value is rgb but yolo models usually use bgr format
-  input_pixel_format: bgr # required
-  # shape of detection frame
-  input_tensor: nhwc
-  # needs to be adjusted to model, see below
-  labelmap_path: /labelmap.txt # required
-```
-
-The correct labelmap must be loaded for each model. If you use a custom model (see notes below), you must make sure to provide the correct labelmap. The table below lists the correct paths for the bundled models:
-
-| `path`                | `labelmap_path`       |
-| --------------------- | --------------------- |
-| deci-fp16-yolonas\_\* | /labelmap/coco-80.txt |
-
-### Choosing a model
-
-:::warning
-
-The pre-trained YOLO-NAS weights from DeciAI are subject to their license and can't be used commercially. For more information, see: https://docs.deci.ai/super-gradients/latest/LICENSE.YOLONAS.html
-
-:::
-
-The inference time was determined on a rk3588 with 3 NPU cores.
-
-| Model               | Size in mb | Inference time in ms |
-| ------------------- | ---------- | -------------------- |
-| deci-fp16-yolonas_s | 24         | 25                   |
-| deci-fp16-yolonas_m | 62         | 35                   |
-| deci-fp16-yolonas_l | 81         | 45                   |
+Make sure to follow the [Rockchip specific installation instructions](/frigate/installation#rockchip-platform).

 :::tip

@@ -883,9 +828,71 @@ $ cat /sys/kernel/debug/rknpu/load

 :::

+### Supported Models
+
+This `config.yml` shows the options for configuring the detector and explains them. The values shown are the defaults. Lines labeled as required are the minimum needed to use the detector; all other lines are optional.
+
+```yaml
+detectors: # required
+  rknn: # required
+    type: rknn # required
+    # number of NPU cores to use
+    # 0 means choose automatically
+    # increase for better performance if you have a multicore NPU e.g. set to 3 on rk3588
+    num_cores: 0
+```
+
+The inference time was determined on a rk3588 with 3 NPU cores.
+
+| Model               | Size in mb | Inference time in ms |
+| ------------------- | ---------- | -------------------- |
+| deci-fp16-yolonas_s | 24         | 25                   |
+| deci-fp16-yolonas_m | 62         | 35                   |
+| deci-fp16-yolonas_l | 81         | 45                   |
+| yolox_nano          | 3          | 16                   |
+| yolox_tiny          | 6          | 20                   |
+
 - All models are automatically downloaded and stored in the folder `config/model_cache/rknn_cache`. After upgrading Frigate, you should remove older models to free up space.
 - You can also provide your own `.rknn` model. You should not save your own models in the `rknn_cache` folder, store them directly in the `model_cache` folder or another subfolder. To convert a model to `.rknn` format see the `rknn-toolkit2` (requires a x86 machine). Note, that there is only post-processing for the supported models.

+#### YOLO-NAS
+
+```yaml
+model: # required
+  # name of model (will be automatically downloaded) or path to your own .rknn model file
+  # possible values are:
+  # - deci-fp16-yolonas_s
+  # - deci-fp16-yolonas_m
+  # - deci-fp16-yolonas_l
+  path: deci-fp16-yolonas_s
+  width: 320
+  height: 320
+  input_pixel_format: bgr
+  input_tensor: nhwc
+  labelmap_path: /labelmap/coco-80.txt
+```
+
+:::warning
+
+The pre-trained YOLO-NAS weights from DeciAI are subject to their license and can't be used commercially. For more information, see: https://docs.deci.ai/super-gradients/latest/LICENSE.YOLONAS.html
+
+:::
+
+#### YOLOx
+
+```yaml
+model: # required
+  # name of model (will be automatically downloaded) or path to your own .rknn model file
+  # possible values are:
+  # - yolox_nano
+  # - yolox_tiny
+  path: yolox_tiny
+  width: 416
+  height: 416
+  input_tensor: nhwc
+  labelmap_path: /labelmap/coco-80.txt
+```
+
 ### Converting your own onnx model to rknn format

 To convert a onnx model to the rknn format using the [rknn-toolkit2](https://github.com/airockchip/rknn-toolkit2/) you have to:
--- a/frigate/detectors/detection_api.py
+++ b/frigate/detectors/detection_api.py
@@ -24,7 +24,7 @@ class DetectionApi(ABC):
    def detect_raw(self, tensor_input):
        pass

-    def calculate_grids_strides(self) -> None:
+    def calculate_grids_strides(self, expanded=True) -> None:
        grids = []
        expanded_strides = []

@@ -35,10 +35,23 @@ class DetectionApi(ABC):

        for hsize, wsize, stride in zip(hsizes, wsizes, strides):
            xv, yv = np.meshgrid(np.arange(wsize), np.arange(hsize))
-            grid = np.stack((xv, yv), 2).reshape(1, -1, 2)
-            grids.append(grid)
-            shape = grid.shape[:2]
-            expanded_strides.append(np.full((*shape, 1), stride))

-        self.grids = np.concatenate(grids, 1)
-        self.expanded_strides = np.concatenate(expanded_strides, 1)
+            if expanded:
+                grid = np.stack((xv, yv), 2).reshape(1, -1, 2)
+                grids.append(grid)
+                shape = grid.shape[:2]
+                expanded_strides.append(np.full((*shape, 1), stride))
+            else:
+                xv = xv.reshape(1, 1, hsize, wsize)
+                yv = yv.reshape(1, 1, hsize, wsize)
+                grids.extend(np.concatenate((xv, yv), axis=1).tolist())
+                expanded_strides.extend(
+                    np.array([stride, stride]).reshape(1, 2, 1, 1).tolist()
+                )
+
+        if expanded:
+            self.grids = np.concatenate(grids, 1)
+            self.expanded_strides = np.concatenate(expanded_strides, 1)
+        else:
+            self.grids = grids
+            self.expanded_strides = expanded_strides
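
The two layouts selected by the new `expanded` flag can be sketched in isolation. This is a minimal standalone illustration, not the Frigate code: the function name `build_grids`, the 416x416 input, and the strides 8, 16, 32 (the usual YOLOX values) are assumptions, since the hunk does not show where `hsizes`, `wsizes`, and `strides` come from. Unlike the diff, the per-head branch keeps NumPy arrays instead of calling `.tolist()`.

```python
import numpy as np


def build_grids(height, width, strides=(8, 16, 32), expanded=True):
    """Build per-stride cell grids in one of two layouts (hypothetical helper)."""
    grids, expanded_strides = [], []

    for stride in strides:
        hsize, wsize = height // stride, width // stride
        xv, yv = np.meshgrid(np.arange(wsize), np.arange(hsize))

        if expanded:
            # Flattened layout: one (x, y) row per grid cell.
            grid = np.stack((xv, yv), 2).reshape(1, -1, 2)
            grids.append(grid)
            expanded_strides.append(np.full((*grid.shape[:2], 1), stride))
        else:
            # Per-head layout: channels first, matching an NCHW output tensor.
            grids.append(np.stack((xv, yv), 0))  # (2, hsize, wsize)
            expanded_strides.append(np.full((2, 1, 1), stride))

    if expanded:
        # All heads concatenated along the anchor axis.
        return np.concatenate(grids, 1), np.concatenate(expanded_strides, 1)

    # One entry per detection head.
    return grids, expanded_strides
```

With these assumed values, the flattened layout holds 52*52 + 26*26 + 13*13 = 3549 cells in a single array, while the per-head layout keeps three separate grids that broadcast against each head's `(1, C, h, w)` output.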
--- a/frigate/detectors/plugins/rknn.py
+++ b/frigate/detectors/plugins/rknn.py
@@ -4,6 +4,7 @@ import re
 import urllib.request
 from typing import Literal

+import cv2
 import numpy as np
 from pydantic import Field

@@ -17,7 +18,10 @@ DETECTOR_KEY = "rknn"

 supported_socs = ["rk3562", "rk3566", "rk3568", "rk3576", "rk3588"]

-supported_models = {ModelTypeEnum.yolonas: "^deci-fp16-yolonas_[sml]$"}
+supported_models = {
+    ModelTypeEnum.yolonas: "^deci-fp16-yolonas_[sml]$",
+    ModelTypeEnum.yolox: None,
+}

 model_cache_dir = os.path.join(MODEL_CACHE_DIR, "rknn_cache/")

@@ -41,6 +45,9 @@ class Rknn(DetectionApi):

        model_props = self.parse_model_input(model_path, soc)

+        if self.detector_config.model.model_type == ModelTypeEnum.yolox:
+            self.calculate_grids_strides(expanded=False)
+
        if model_props["preset"]:
            config.model.model_type = model_props["model_type"]

@@ -199,9 +206,86 @@ class Rknn(DetectionApi):

        return np.resize(results, (20, 6))

+    def post_process_yolox(
+        self,
+        predictions: list[np.ndarray],
+        grids: np.ndarray,
+        expanded_strides: np.ndarray,
+    ) -> np.ndarray:
+        def sp_flatten(_in: np.ndarray):
+            ch = _in.shape[1]
+            _in = _in.transpose(0, 2, 3, 1)
+            return _in.reshape(-1, ch)
+
+        boxes, scores, classes_conf = [], [], []
+
+        input_data = [
+            _in.reshape([1, -1] + list(_in.shape[-2:])) for _in in predictions
+        ]
+
+        for i in range(len(input_data)):
+            unprocessed_box = input_data[i][:, :4, :, :]
+            box_xy = unprocessed_box[:, :2, :, :]
+            box_wh = np.exp(unprocessed_box[:, 2:4, :, :]) * expanded_strides[i]
+
+            box_xy += grids[i]
+            box_xy *= expanded_strides[i]
+            box = np.concatenate((box_xy, box_wh), axis=1)
+
+            # Convert [c_x, c_y, w, h] to [x1, y1, x2, y2]
+            xyxy = np.copy(box)
+            xyxy[:, 0, :, :] = box[:, 0, :, :] - box[:, 2, :, :] / 2  # top left x
+            xyxy[:, 1, :, :] = box[:, 1, :, :] - box[:, 3, :, :] / 2  # top left y
+            xyxy[:, 2, :, :] = box[:, 0, :, :] + box[:, 2, :, :] / 2  # bottom right x
+            xyxy[:, 3, :, :] = box[:, 1, :, :] + box[:, 3, :, :] / 2  # bottom right y
+
+            boxes.append(xyxy)
+            scores.append(input_data[i][:, 4:5, :, :])
+            classes_conf.append(input_data[i][:, 5:, :, :])
+
+        # flatten data
+        boxes = np.concatenate([sp_flatten(_v) for _v in boxes])
+        classes_conf = np.concatenate([sp_flatten(_v) for _v in classes_conf])
+        scores = np.concatenate([sp_flatten(_v) for _v in scores])
+
+        # reshape and filter boxes
+        box_confidences = scores.reshape(-1)
+        class_max_score = np.max(classes_conf, axis=-1)
+        classes = np.argmax(classes_conf, axis=-1)
+        _class_pos = np.where(class_max_score * box_confidences >= 0.4)
+        scores = (class_max_score * box_confidences)[_class_pos]
+        boxes = boxes[_class_pos]
+        classes = classes[_class_pos]
+
+        # run nms
+        indices = cv2.dnn.NMSBoxes(
+            bboxes=boxes,
+            scores=scores,
+            score_threshold=0.4,
+            nms_threshold=0.4,
+        )
+
+        results = np.zeros((20, 6), np.float32)
+
+        if len(indices) > 0:
+            for i, idx in enumerate(indices.flatten()[:20]):
+                box = boxes[idx]
+                results[i] = [
+                    classes[idx],
+                    scores[idx],
+                    box[1] / self.height,
+                    box[0] / self.width,
+                    box[3] / self.height,
+                    box[2] / self.width,
+                ]
+
+        return results
+
    def post_process(self, output):
        if self.detector_config.model.model_type == ModelTypeEnum.yolonas:
            return self.post_process_yolonas(output)
+        elif self.detector_config.model.model_type == ModelTypeEnum.yolox:
+            return self.post_process_yolox(output, self.grids, self.expanded_strides)
        else:
            raise ValueError(
                f'Model type "{self.detector_config.model.model_type}" is currently not supported.'
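
The per-head decoding in `post_process_yolox` can be sketched for a single output tensor as follows. This is a minimal standalone illustration, assuming a raw head of shape `(1, 5 + num_classes, h, w)`; `decode_head` and `xyxy_to_xywh` are hypothetical helper names, not part of Frigate. The second helper is included because the OpenCV documentation describes the `bboxes` argument of `cv2.dnn.NMSBoxes` as rectangles, i.e. `(x, y, width, height)`, whereas the diff passes corner coordinates; whether that affects results in practice has not been verified here.

```python
import numpy as np


def decode_head(raw, stride):
    """Decode one YOLOX head into corner boxes (hypothetical helper).

    raw: array of shape (1, 5 + num_classes, h, w).
    Returns an array of shape (1, 4, h, w) holding x1, y1, x2, y2 in pixels.
    """
    _, _, h, w = raw.shape
    xv, yv = np.meshgrid(np.arange(w), np.arange(h))
    grid = np.stack((xv, yv), 0)[None]  # (1, 2, h, w)

    # Centers are offsets from the cell origin, scaled by the stride.
    xy = (raw[:, :2] + grid) * stride
    # Sizes are predicted in log space, also scaled by the stride.
    wh = np.exp(raw[:, 2:4]) * stride

    # Convert [c_x, c_y, w, h] to [x1, y1, x2, y2].
    return np.concatenate((xy - wh / 2, xy + wh / 2), axis=1)


def xyxy_to_xywh(boxes):
    """Convert (N, 4) corner boxes to (N, 4) top-left plus width and height."""
    out = boxes.copy()
    out[:, 2:] -= out[:, :2]
    return out
```

For an all-zero head with stride 32, every cell decodes to a 32x32 box centered on the cell origin, which makes the grid offset easy to check by hand.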
--- a/frigate/util/model.py
+++ b/frigate/util/model.py
@@ -180,7 +180,7 @@ def __post_process_multipart_yolo(
                x2 / width,
            ]

-    return np.array(results, dtype=np.float32)
+    return results


 def __post_process_nms_yolo(predictions: np.ndarray, width, height) -> np.ndarray: