Merge ab72a16048 into d5ce0ba73e

2025-07-26 13:47:03 +02:00 · 2025-07-26 12:51:17 +08:00 · 2025-07-26 12:51:17 +08:00 · 5f2589b498
commit 5f2589b498
parent d5ce0ba73e ab72a16048
3 changed files with 233 additions and 0 deletions
--- a/docker/main/requirements-wheels.txt
+++ b/docker/main/requirements-wheels.txt
@@ -71,3 +71,5 @@ prometheus-client == 0.21.*
 # TFLite
 tflite_runtime @ https://github.com/frigate-nvr/TFlite-builds/releases/download/v2.17.1/tflite_runtime-2.17.1-cp311-cp311-linux_x86_64.whl; platform_machine == 'x86_64'
 tflite_runtime @ https://github.com/feranick/TFlite-builds/releases/download/v2.17.1/tflite_runtime-2.17.1-cp311-cp311-linux_aarch64.whl; platform_machine == 'aarch64'
+# DeGirum detector
+degirum == 0.16.*
--- a/docs/docs/configuration/object_detectors.md
+++ b/docs/docs/configuration/object_detectors.md
@@ -13,6 +13,7 @@ Frigate supports multiple different detectors that work on different types of ha

 - [Coral EdgeTPU](#edge-tpu-detector): The Google Coral EdgeTPU is available in USB and m.2 format allowing for a wide range of compatibility with devices.
 - [Hailo](#hailo-8): The Hailo8 and Hailo8L AI Acceleration module is available in m.2 format with a HAT for RPi devices, offering a wide range of compatibility with devices.
+- [DeGirum](#degirum): A service for running inference on hardware devices locally or in the cloud. Supported hardware and models are listed on the [DeGirum AI Hub](https://hub.degirum.com).

 **AMD**

@@ -950,6 +951,101 @@ Explanation of the paramters:
  - **example**: Specifying `output_name = "frigate-{quant}-{input_basename}-{soc}-v{tk_version}"` could result in a model called `frigate-i8-my_model-rk3588-v2.3.0.rknn`.
 - `config`: Configuration passed to `rknn-toolkit2` for model conversion. For an explanation of all available parameters have a look at section "2.2. Model configuration" of [this manual](https://github.com/MarcA711/rknn-toolkit2/releases/download/v2.3.2/03_Rockchip_RKNPU_API_Reference_RKNN_Toolkit2_V2.3.2_EN.pdf).

+## DeGirum
+
+DeGirum is a detector that can use any of the hardware listed on the [DeGirum AI Hub](https://hub.degirum.com). It can run on local hardware, either through a DeGirum AI Server or directly by setting `location` to `@local`. It can also connect to DeGirum's AI Hub to run inference in the cloud. **Please Note:** This detector *cannot* be used for commercial purposes.
+
+### Configuration
+
+#### AI Server Inference
+
+Before editing the config file for this setup, you must first launch an AI server. DeGirum provides an AI server as a ready-to-use Docker container. Add this to your `docker-compose.yml` to get started:
+```yaml
+degirum_detector:
+    container_name: degirum
+    image: degirum/aiserver:latest
+    privileged: true
+    ports:
+      - "8778:8778"
+```
+The AI server automatically detects all supported hardware on its host, as long as the relevant runtimes and drivers are properly installed. Refer to [DeGirum's docs site](https://docs.degirum.com/pysdk/runtimes-and-drivers) if you have any trouble.
+
+Once the AI server is running, add a DeGirum detector to your `config.yml` file:
+```yaml
+degirum_detector:
+    type: degirum
+    location: degirum # Set to service name (degirum_detector), container_name (degirum), or a host:port (192.168.29.4:8778)
+    zoo: degirum/public # DeGirum's public model zoo. Zoo name should be in format "workspace/zoo_name". degirum/public is available to everyone, so feel free to use it if you don't know where to start. If you aren't pulling a model from the AI Hub, leave this and 'token' blank.
+    token: dg_example_token # For authentication with the AI Hub. Get this token through the "tokens" section on the main page of the [AI Hub](https://hub.degirum.com). This can be left blank if you're pulling a model from the public zoo and running inferences on your local hardware using @local or a local DeGirum AI Server
+```
+Next, set the model `path` in the `config.yml` file.
+It can be set to any of the following:
+- A model listed on the [AI Hub](https://hub.degirum.com), provided that the zoo it belongs to is set in your detector's `zoo` field
+    - In this case, the model is downloaded onto your machine before it runs.
+- A local directory acting as a zoo. See DeGirum's docs site [for more information](https://docs.degirum.com/pysdk/user-guide-pysdk/organizing-models#model-zoo-directory-structure).
+- A path to a model's `model.json` file.
+```yaml
+model:
+    path: ./mobilenet_v2_ssd_coco--300x300_quant_n2x_orca1_1 # directory containing the model's .json file and model file
+    width: 300 # width is in the model name as the first number in the "int"x"int" section
+    height: 300 # height is in the model name as the second number in the "int"x"int" section
+    input_pixel_format: rgb # or bgr; look at the model.json to figure out which to put here
+```
+
+
+#### Local Inference
+
+It is also possible to skip the AI server and run inference on the hardware directly. The benefit of this approach is that it removes any bottleneck caused by transferring prediction results from the AI server Docker container to the Frigate container. However, the setup for local inference differs for every device and hardware combination, so it is usually more trouble than it's worth. The general steps are:
+1. Ensure that the Frigate Docker container has the runtime you want to use. For instance, running `@local` for Hailo means making sure the container you're using has the Hailo runtime installed.
+2. To verify that the DeGirum detector can see the runtime, check that the `degirum sys-info` command lists the runtimes you intend to use.
+3. Create a DeGirum detector in your `config.yml` file.
+
+```yaml
+degirum_detector:
+    type: degirum
+    location: "@local" # For running inference directly on hardware available inside the Frigate container
+    zoo: degirum/public # DeGirum's public model zoo. Zoo name should be in format "workspace/zoo_name". degirum/public is available to everyone, so feel free to use it if you don't know where to start.
+    token: dg_example_token # For authentication with the AI Hub. Get this token through the "tokens" section on the main page of the [AI Hub](https://hub.degirum.com). This can be left blank if you're pulling a model from the public zoo and running inferences on your local hardware using @local or a local DeGirum AI Server
+
+```
+
+Once `degirum_detector` is set up, you can choose a model in the `model` section of the `config.yml` file.
+
+```yaml
+model:
+    path: mobilenet_v2_ssd_coco--300x300_quant_n2x_orca1_1
+    width: 300 # width is in the model name as the first number in the "int"x"int" section
+    height: 300 # height is in the model name as the second number in the "int"x"int" section
+    input_pixel_format: rgb # or bgr; look at the model.json to figure out which to put here
+```
+
+
+#### AI Hub Cloud Inference
+
+If you do not have the hardware you want to run on, you can also run inference in the cloud. Note that you may need to lower your detection fps, because network latency significantly slows down this method of detection. For use with Frigate, we highly recommend using a local AI server as described above. To set up cloud inference:
+1. Sign up at [DeGirum's AI Hub](https://hub.degirum.com).
+2. Get an access token.
+3. Create a DeGirum detector in your `config.yml` file.
+
+```yaml
+degirum_detector:
+    type: degirum
+    location: "@cloud" # For accessing AI Hub devices and models
+    zoo: degirum/public # DeGirum's public model zoo. Zoo name should be in format "workspace/zoo_name". degirum/public is available to everyone, so feel free to use it if you don't know where to start.
+    token: dg_example_token # For authentication with the AI Hub. Get this token through the "tokens" section on the main page of the [AI Hub](https://hub.degirum.com).
+
+```
+
+Once `degirum_detector` is set up, you can choose a model in the `model` section of the `config.yml` file.
+
+```yaml
+model:
+    path: mobilenet_v2_ssd_coco--300x300_quant_n2x_orca1_1
+    width: 300 # width is in the model name as the first number in the "int"x"int" section
+    height: 300 # height is in the model name as the second number in the "int"x"int" section
+    input_pixel_format: rgb # or bgr; look at the model.json to figure out which to put here
+```
+
 # Models

 Some model types are not included in Frigate by default.
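
Note on the `width` and `height` comments in the docs hunk above: they say both values can be read from the `"int"x"int"` part of the model name. A minimal sketch of that lookup, assuming the name contains a single `<width>x<height>` group as in the example model; `dims_from_model_name` is a hypothetical helper and is not part of Frigate or the DeGirum PySDK:

```python
import re


def dims_from_model_name(name: str) -> tuple[int, int]:
    """Return (width, height) from the first '<int>x<int>' group in a model name.

    Assumes the naming convention described in the docs: width comes first,
    height second. Hypothetical helper for illustration only.
    """
    match = re.search(r"(\d+)x(\d+)", name)
    if match is None:
        raise ValueError(f"no '<int>x<int>' section found in model name: {name!r}")
    return int(match.group(1)), int(match.group(2))


width, height = dims_from_model_name(
    "mobilenet_v2_ssd_coco--300x300_quant_n2x_orca1_1"
)
```

For the example model this gives the same `300` and `300` used in the `model` section of the config.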
--- a/frigate/detectors/plugins/degirum.py
+++ b/frigate/detectors/plugins/degirum.py
@@ -0,0 +1,135 @@
+import logging
+import queue
+
+import degirum as dg
+import numpy as np
+from pydantic import Field
+from typing_extensions import Literal
+
+from frigate.detectors.detection_api import DetectionApi
+from frigate.detectors.detector_config import BaseDetectorConfig
+
+logger = logging.getLogger(__name__)
+DETECTOR_KEY = "degirum"
+
+
+### DETECTOR CONFIG ###
+class DGDetectorConfig(BaseDetectorConfig):
+    type: Literal[DETECTOR_KEY]
+    location: str | None = Field(default=None, title="Inference Location")
+    zoo: str | None = Field(default=None, title="Model Zoo")
+    token: str | None = Field(default=None, title="DeGirum Cloud Token")
+
+
+### ACTUAL DETECTOR  ###
+class DGDetector(DetectionApi):
+    type_key = DETECTOR_KEY
+
+    def __init__(self, detector_config: DGDetectorConfig):
+        self._queue = queue.Queue()
+        self._zoo = dg.connect(
+            detector_config.location, detector_config.zoo, detector_config.token
+        )
+
+        logger.debug(f"Models in zoo: {self._zoo.list_models()}")
+
+        self.dg_model = self._zoo.load_model(
+            detector_config.model.path,
+        )
+
+        # Setting input image format to raw reduces preprocessing time
+        self.dg_model.input_image_format = "RAW"
+
+        # Prioritize the most powerful hardware available
+        self.select_best_device_type()
+        # Frigate handles preprocessing as long as these are all set
+        input_shape = self.dg_model.input_shape[0]
+        self.model_height = input_shape[1]
+        self.model_width = input_shape[2]
+
+        # Passing in dummy frame so initial connection latency happens in
+        # init function and not during actual prediction
+        frame = np.zeros(
+            (detector_config.model.height, detector_config.model.width, 3),
+            dtype=np.uint8,
+        )
+        # Pass in frame to overcome first frame latency
+        self.dg_model(frame)
+        self.prediction = self.prediction_generator()
+
+    def select_best_device_type(self):
+        """
+        Helper function that selects fastest hardware available per model runtime
+        """
+        types = self.dg_model.supported_device_types
+
+        device_map = {
+            "OPENVINO": ["GPU", "NPU", "CPU"],
+            "HAILORT": ["HAILO8L", "HAILO8"],
+            "N2X": ["ORCA1", "CPU"],
+            "ONNX": ["VITIS_NPU", "CPU"],
+            "RKNN": ["RK3566", "RK3568", "RK3588"],
+            "TENSORRT": ["DLA", "GPU", "DLA_ONLY"],
+            "TFLITE": ["ARMNN", "EDGETPU", "CPU"],
+        }
+
+        runtime = types[0].split("/")[0]
+        # Just create an array of format {runtime}/{hardware} for every hardware
+        # in the value for appropriate key in device_map
+        self.dg_model.device_type = [
+            f"{runtime}/{hardware}" for hardware in device_map[runtime]
+        ]
+
+    def prediction_generator(self):
+        """
+        Generator for all incoming frames. By using this generator, we don't have to keep
+        reconnecting our websocket on every "predict" call.
+        """
+        logger.debug("Prediction generator was called")
+        with self.dg_model as model:
+            while True:
+                logger.debug(f"q size before calling get: {self._queue.qsize()}")
+                data = self._queue.get(block=True)
+                logger.debug(f"q size after calling get: {self._queue.qsize()}")
+                logger.debug(
+                    f"Data we're passing into model predict: {data}, shape of data: {data.shape}"
+                )
+                result = model.predict(data)
+                logger.debug(f"Prediction result: {result}")
+                yield result
+
+    def detect_raw(self, tensor_input):
+        # Reshaping tensor to work with pysdk
+        truncated_input = tensor_input.reshape(tensor_input.shape[1:])
+        logger.debug(f"Detect raw was called for tensor input: {tensor_input}")
+
+        # add tensor_input to input queue
+        self._queue.put(truncated_input)
+        logger.debug(f"Queue size after adding truncated input: {self._queue.qsize()}")
+
+        # define empty detection result
+        detections = np.zeros((20, 6), np.float32)
+        # grab prediction
+        res = next(self.prediction)
+
+        # If we have an empty prediction, return immediately
+        if len(res.results) == 0 or len(res.results[0]) == 0:
+            return detections
+
+        i = 0
+        for result in res.results:
+            if i >= 20:
+                break
+
+            detections[i] = [
+                result["category_id"],
+                float(result["score"]),
+                result["bbox"][1] / self.model_height,
+                result["bbox"][0] / self.model_width,
+                result["bbox"][3] / self.model_height,
+                result["bbox"][2] / self.model_width,
+            ]
+            i += 1
+
+        logger.debug(f"Detections output: {detections}")
+        return detections
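
Note on the conversion at the end of `detect_raw`: each result is written as a row of `[category_id, score, y_min, x_min, y_max, x_max]`, with the box scaled by the model's input size, and at most 20 rows are kept. The sketch below isolates that step in plain Python so it can be checked without a device. It assumes that `bbox` is ordered `[x_min, y_min, x_max, y_max]` in pixels, which is inferred from how the plugin divides by width and height. `to_frigate_rows` is a hypothetical helper, and it returns nested lists rather than the `np.zeros((20, 6), np.float32)` array the plugin uses.

```python
MAX_DETECTIONS = 20  # the plugin's output array has 20 rows


def to_frigate_rows(results, model_width, model_height, max_detections=MAX_DETECTIONS):
    """Convert DeGirum-style result dicts into Frigate-style detection rows.

    Each row is [category_id, score, y_min, x_min, y_max, x_max], with box
    coordinates normalized by the model input size. Unused rows stay zero-filled.
    Assumes each bbox is [x_min, y_min, x_max, y_max] in pixels.
    """
    rows = [[0.0] * 6 for _ in range(max_detections)]
    for i, result in enumerate(results[:max_detections]):
        x_min, y_min, x_max, y_max = result["bbox"]
        rows[i] = [
            float(result["category_id"]),
            float(result["score"]),
            y_min / model_height,
            x_min / model_width,
            y_max / model_height,
            x_max / model_width,
        ]
    return rows


# Made-up detection for illustration: one box on a 300x300 model input
example = [{"category_id": 1, "score": 0.5, "bbox": [30, 60, 150, 300]}]
rows = to_frigate_rows(example, model_width=300, model_height=300)
```

Slicing the results to `max_detections` does the same job as the plugin's manual counter and early `break`.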