Apple Silicon / ZMQ Detector (#19592)

* Add zmq detector
* Cleanup
* Logging
* Cleanup
* Cleanup
* Add to hardware docs
* Add apple silicon to docs
* Formatting
2025-08-27 13:47:50 +02:00 · 2025-08-18 08:51:12 -06:00 · 2025-08-18 08:51:12 -06:00 · 152d9ed4a0
commit 152d9ed4a0
parent 5a49d1f73c
3 changed files with 238 additions and 9 deletions
--- a/docs/docs/configuration/object_detectors.md
+++ b/docs/docs/configuration/object_detectors.md
@@ -19,6 +19,10 @@ Frigate supports multiple different detectors that work on different types of ha
 - [ROCm](#amdrocm-gpu-detector): ROCm can run on AMD Discrete GPUs to provide efficient object detection.
 - [ONNX](#onnx): ROCm will automatically be detected and used as a detector in the `-rocm` Frigate image when a supported ONNX model is configured.

+**Apple Silicon**
+
+- [Apple Silicon](#apple-silicon-detector): The Apple Silicon detector can run on M1 and newer Apple Silicon devices.
+
 **Intel**

 - [OpenVino](#openvino-detector): OpenVino can run on Intel Arc GPUs, Intel integrated GPUs, and Intel CPUs to provide efficient object detection.
@@ -264,7 +268,7 @@ detectors:

 :::

-### Supported Models
+### OpenVINO Supported Models

 #### SSDLite MobileNet v2

@@ -402,6 +406,59 @@ model:

 Note that the labelmap uses a subset of the complete COCO label set that has only 80 objects.

+## Apple Silicon detector
+
+The NPU in Apple Silicon can't be accessed from within a container, so the [Apple Silicon detector client](https://github.com/frigate-nvr/apple-silicon-detector) must first be set up. It is recommended to use the Frigate Docker image with the `-standard-arm64` suffix, for example `ghcr.io/blakeblackshear/frigate:stable-standard-arm64`.
+
+### Setup
+
+1. Set up the [Apple Silicon detector client](https://github.com/frigate-nvr/apple-silicon-detector) and run it
+2. Configure the detector in Frigate and start Frigate
+
+### Configuration
+
+Using the detector config below will connect to the client:
+
+```yaml
+detectors:
+  apple-silicon:
+    type: zmq
+    endpoint: tcp://host.docker.internal:5555
+```
+
+### Apple Silicon Supported Models
+
+No default model is provided. The following formats are supported:
+
+#### YOLO (v3, v4, v7, v9)
+
+YOLOv3, YOLOv4, YOLOv7, and [YOLOv9](https://github.com/WongKinYiu/yolov9) models are supported, but not included by default.
+
+:::tip
+
+The YOLO detector has been designed to support YOLOv3, YOLOv4, YOLOv7, and YOLOv9 models, but may support other YOLO model architectures as well. See [the models section](#downloading-yolo-models) for more information on downloading YOLO models for use in Frigate.
+
+:::
+
+After placing the downloaded ONNX model in your config folder, you can use the following configuration:
+
+```yaml
+detectors:
+  apple-silicon:
+    type: zmq
+    endpoint: tcp://host.docker.internal:5555
+model:
+  model_type: yolo-generic
+  width: 320 # <--- should match the imgsize set during model export
+  height: 320 # <--- should match the imgsize set during model export
+  input_tensor: nchw
+  input_dtype: float
+  path: /config/model_cache/yolo.onnx
+  labelmap_path: /labelmap/coco-80.txt
+```
+
+Note that the labelmap uses a subset of the complete COCO label set that has only 80 objects.
+
 ## AMD/ROCm GPU detector

 ### Setup
@@ -483,7 +540,7 @@ We unset the `HSA_OVERRIDE_GFX_VERSION` to prevent an existing override from mes
 $ docker exec -it frigate /bin/bash -c '(unset HSA_OVERRIDE_GFX_VERSION && /opt/rocm/bin/rocminfo |grep gfx)'
 ```

-### Supported Models
+### ROCm Supported Models

 See [ONNX supported models](#supported-models) for supported models, there are some caveats:

@@ -526,7 +583,7 @@ detectors:

 :::

-### Supported Models
+### ONNX Supported Models

 There is no default model provided, the following formats are supported:

@@ -824,7 +881,7 @@ $ cat /sys/kernel/debug/rknpu/load

 :::

-### Supported Models
+### RockChip Supported Models

 This `config.yml` shows all relevant options to configure the detector and explains them. All values shown are the default values (except for two). Lines that are required at least to use the detector are labeled as required, all other lines are optional.

--- a/docs/docs/frigate/hardware.md
+++ b/docs/docs/frigate/hardware.md
@@ -61,19 +61,26 @@ Frigate supports multiple different detectors that work on different types of ha
 **AMD**

 - [ROCm](#rocm---amd-gpu): ROCm can run on AMD Discrete GPUs to provide efficient object detection
-  - [Supports limited model architectures](../../configuration/object_detectors#supported-models-1)
+  - [Supports limited model architectures](../../configuration/object_detectors#rocm-supported-models)
  - Runs best on discrete AMD GPUs

+**Apple Silicon**
+
+- [Apple Silicon](#apple-silicon): Apple Silicon is usable on all M1 and newer Apple Silicon devices to provide efficient and fast object detection
+  - [Supports primarily ssdlite and mobilenet model architectures](../../configuration/object_detectors#apple-silicon-supported-models)
+  - Runs well with any size models including large
+  - Runs via a ZMQ proxy, which adds some latency; only recommended for local connections
+
 **Intel**

 - [OpenVino](#openvino---intel): OpenVino can run on Intel Arc GPUs, Intel integrated GPUs, and Intel CPUs to provide efficient object detection.
-  - [Supports majority of model architectures](../../configuration/object_detectors#supported-models)
+  - [Supports majority of model architectures](../../configuration/object_detectors#openvino-supported-models)
  - Runs best with tiny, small, or medium models

 **Nvidia**

 - [TensortRT](#tensorrt---nvidia-gpu): TensorRT can run on Nvidia GPUs and Jetson devices.
-  - [Supports majority of model architectures via ONNX](../../configuration/object_detectors#supported-models-2)
+  - [Supports majority of model architectures via ONNX](../../configuration/object_detectors#onnx-supported-models)
  - Runs well with any size models including large

 **Rockchip**
@@ -173,14 +180,28 @@ Inference speeds will vary greatly depending on the GPU and the model used.
 | RTX A4000       |                       | 320: ~ 15 ms              |                        |
 | Tesla P40       |                       | 320: ~ 105 ms             |                        |

+### Apple Silicon
+
+With the [Apple Silicon](../configuration/object_detectors.md#apple-silicon-detector) detector Frigate can take advantage of the NPU in M1 and newer Apple Silicon.
+
+:::warning
+
+The Apple Silicon NPU cannot be accessed from within a container, so a ZMQ proxy is used to communicate with [the Apple Silicon Frigate detector](https://github.com/frigate-nvr/apple-silicon-detector), which runs on the host. This should add minimal latency when both run on the same device.
+
+:::
+
+| Name      | YOLOv9 Inference Time   |
+| --------- | ----------------------- |
+| M3 Pro    | t-320: 6 ms s-320: 8 ms |
+| M1        | s-320: 9 ms             |
+
 ### ROCm - AMD GPU

-With the [rocm](../configuration/object_detectors.md#amdrocm-gpu-detector) detector Frigate can take advantage of many discrete AMD GPUs.
+With the [ROCm](../configuration/object_detectors.md#amdrocm-gpu-detector) detector Frigate can take advantage of many discrete AMD GPUs.

 | Name      | YOLOv9 Inference Time | YOLO-NAS Inference Time   |
 | --------- | --------------------- | ------------------------- |
 | AMD 780M  | ~ 14 ms               | 320: ~ 25 ms 640: ~ 50 ms |
-| AMD 8700G |                       | 320: ~ 20 ms 640: ~ 40 ms |

 ## Community Supported Detectors

--- a/frigate/detectors/plugins/zmq_ipc.py
+++ b/frigate/detectors/plugins/zmq_ipc.py
@@ -0,0 +1,151 @@
+import json
+import logging
+from typing import Any, List
+
+import numpy as np
+import zmq
+from pydantic import Field
+from typing_extensions import Literal
+
+from frigate.detectors.detection_api import DetectionApi
+from frigate.detectors.detector_config import BaseDetectorConfig
+
+logger = logging.getLogger(__name__)
+
+DETECTOR_KEY = "zmq"
+
+
+class ZmqDetectorConfig(BaseDetectorConfig):
+    type: Literal[DETECTOR_KEY]
+    endpoint: str = Field(
+        default="ipc:///tmp/cache/zmq_detector", title="ZMQ IPC endpoint"
+    )
+    request_timeout_ms: int = Field(
+        default=200, title="ZMQ request timeout in milliseconds"
+    )
+    linger_ms: int = Field(default=0, title="ZMQ socket linger in milliseconds")
+
+
+class ZmqIpcDetector(DetectionApi):
+    """
+    ZMQ-based detector plugin using a REQ/REP socket over an IPC endpoint.
+
+    Protocol:
+    - Request is sent as a multipart message:
+        [ header_json_bytes, tensor_bytes ]
+      where header is a JSON object containing:
+        {
+          "shape": List[int],
+          "dtype": str,  # numpy dtype string, e.g. "uint8", "float32"
+        }
+      tensor_bytes are the raw bytes of the numpy array in C-order.
+
+    - Response is expected to be either:
+        a) Multipart [ header_json_bytes, tensor_bytes ] with header specifying
+           shape [20,6] and dtype "float32"; or
+        b) Single frame tensor_bytes of length 20*6*4 bytes (float32).
+
+    On any error or timeout, this detector returns a zero array of shape (20, 6).
+    """
+
+    type_key = DETECTOR_KEY
+
+    def __init__(self, detector_config: ZmqDetectorConfig):
+        super().__init__(detector_config)
+
+        self._context = zmq.Context()
+        self._endpoint = detector_config.endpoint
+        self._request_timeout_ms = detector_config.request_timeout_ms
+        self._linger_ms = detector_config.linger_ms
+        self._socket = None
+        self._create_socket()
+
+        # Preallocate zero result for error paths
+        self._zero_result = np.zeros((20, 6), np.float32)
+
+    def _create_socket(self) -> None:
+        if self._socket is not None:
+            try:
+                self._socket.close(linger=self._linger_ms)
+            except Exception:
+                pass
+        self._socket = self._context.socket(zmq.REQ)
+        # Apply timeouts and linger so calls don't block indefinitely
+        self._socket.setsockopt(zmq.RCVTIMEO, self._request_timeout_ms)
+        self._socket.setsockopt(zmq.SNDTIMEO, self._request_timeout_ms)
+        self._socket.setsockopt(zmq.LINGER, self._linger_ms)
+
+        logger.debug(f"ZMQ detector connecting to {self._endpoint}")
+        self._socket.connect(self._endpoint)
+
+    def _build_header(self, tensor_input: np.ndarray) -> bytes:
+        header: dict[str, Any] = {
+            "shape": list(tensor_input.shape),
+            "dtype": str(tensor_input.dtype.name),
+        }
+        return json.dumps(header).encode("utf-8")
+
+    def _decode_response(self, frames: List[bytes]) -> np.ndarray:
+        try:
+            if len(frames) == 1:
+                # Single-frame raw float32 (20x6)
+                buf = frames[0]
+                if len(buf) != 20 * 6 * 4:
+                    logger.warning(
+                        f"ZMQ detector received unexpected payload size: {len(buf)}"
+                    )
+                    return self._zero_result
+                return np.frombuffer(buf, dtype=np.float32).reshape((20, 6))
+
+            if len(frames) >= 2:
+                header = json.loads(frames[0].decode("utf-8"))
+                shape = tuple(header.get("shape", []))
+                dtype = np.dtype(header.get("dtype", "float32"))
+                return np.frombuffer(frames[1], dtype=dtype).reshape(shape)
+
+            logger.warning("ZMQ detector received empty reply")
+            return self._zero_result
+        except Exception as exc:  # noqa: BLE001
+            logger.error(f"ZMQ detector failed to decode response: {exc}")
+            return self._zero_result
+
+    def detect_raw(self, tensor_input: np.ndarray) -> np.ndarray:
+        try:
+            header_bytes = self._build_header(tensor_input)
+            payload_bytes = memoryview(tensor_input.tobytes(order="C"))
+
+            # Send request
+            self._socket.send_multipart([header_bytes, payload_bytes])
+
+            # Receive reply
+            reply_frames = self._socket.recv_multipart()
+            detections = self._decode_response(reply_frames)
+
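+            # Fall back to zeros unless shape and dtype are exactly as expected
+            ok = detections.shape == (20, 6) and detections.dtype == np.float32
+            return detections if ok else self._zero_result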
+        except zmq.Again:
+            # Timeout
+            logger.debug("ZMQ detector request timed out; resetting socket")
+            try:
+                self._create_socket()
+            except Exception:
+                pass
+            return self._zero_result
+        except zmq.ZMQError as exc:
+            logger.error(f"ZMQ detector ZMQError: {exc}; resetting socket")
+            try:
+                self._create_socket()
+            except Exception:
+                pass
+            return self._zero_result
+        except Exception as exc:  # noqa: BLE001
+            logger.error(f"ZMQ detector unexpected error: {exc}")
+            return self._zero_result
+
+    def __del__(self) -> None:  # pragma: no cover - best-effort cleanup
+        try:
+            if self._socket is not None:
+                self._socket.close(linger=self.detector_config.linger_ms)
+        except Exception:
+            pass
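
For anyone writing a client against this plugin, the reply side of the protocol described in the `ZmqIpcDetector` docstring can be sketched roughly as below. This is an illustrative, standard-library-only sketch, not the code of the Apple Silicon detector client: the helper names `encode_reply` and `decode_request_header` are hypothetical, and little-endian byte order is an assumption (the plugin reads the buffer in native order with `np.frombuffer`).

```python
import json
import struct
from typing import List, Sequence, Tuple

ROWS, COLS = 20, 6  # fixed detection shape the plugin expects in a reply


def decode_request_header(frames: Sequence[bytes]) -> Tuple[Tuple[int, ...], str]:
    """Read the JSON header (first frame) of a request: tensor shape and dtype name."""
    header = json.loads(frames[0].decode("utf-8"))
    return tuple(header["shape"]), header["dtype"]


def encode_reply(detections: Sequence[Sequence[float]]) -> List[bytes]:
    """Build the multipart reply [header_json_bytes, tensor_bytes].

    At most 20 rows are kept; missing rows are zero-padded so the payload
    is always 20 * 6 float32 values.
    """
    rows = [list(d) for d in detections[:ROWS]]
    for row in rows:
        if len(row) != COLS:
            raise ValueError(f"each detection needs {COLS} values, got {len(row)}")
    rows += [[0.0] * COLS for _ in range(ROWS - len(rows))]
    flat = [value for row in rows for value in row]

    header = json.dumps({"shape": [ROWS, COLS], "dtype": "float32"}).encode("utf-8")
    # Assumption: little-endian float32, matching x86 and Apple Silicon hosts.
    payload = struct.pack(f"<{ROWS * COLS}f", *flat)
    return [header, payload]
```

With pyzmq, a server bound on a `zmq.REP` socket would receive the request with `recv_multipart()`, run inference on the second frame, and return the result with `send_multipart(encode_reply(...))`.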