mirror of
https://github.com/blakeblackshear/frigate.git
synced 2025-08-27 13:47:50 +02:00
Update GenAI docs for new review summaries feature (#19493)
* Remove old genai docs * Separate existing genai docs to separate sections * Add docs for genai features * Update reference config * Update link * Move to bottom
This commit is contained in:
parent
7740b08bd9
commit
6671984e5a
128
docs/docs/configuration/genai/config.md
Normal file
128
docs/docs/configuration/genai/config.md
Normal file
@ -0,0 +1,128 @@
|
||||
---
|
||||
id: genai_config
|
||||
title: Configuring Generative AI
|
||||
---
|
||||
|
||||
## Configuration
|
||||
|
||||
A Generative AI provider can be configured in the global config, which will make the Generative AI features available for use. There are currently 3 native providers available to integrate with Frigate. Other providers that support the OpenAI standard API can also be used. See the OpenAI section below.
|
||||
|
||||
To use Generative AI, you must define a single provider at the global level of your Frigate configuration. If the provider you choose requires an API key, you may either directly paste it in your configuration, or store it in an environment variable prefixed with `FRIGATE_`.
|
||||
|
||||
## Ollama
|
||||
|
||||
:::warning
|
||||
|
||||
Using Ollama on CPU is not recommended, high inference times make using Generative AI impractical.
|
||||
|
||||
:::
|
||||
|
||||
[Ollama](https://ollama.com/) allows you to self-host large language models and keep everything running locally. It provides a nice API over [llama.cpp](https://github.com/ggerganov/llama.cpp). It is highly recommended to host this server on a machine with an Nvidia graphics card, or on a Apple silicon Mac for best performance.
|
||||
|
||||
Most of the 7b parameter 4-bit vision models will fit inside 8GB of VRAM. There is also a [Docker container](https://hub.docker.com/r/ollama/ollama) available.
|
||||
|
||||
Parallel requests also come with some caveats. You will need to set `OLLAMA_NUM_PARALLEL=1` and choose a `OLLAMA_MAX_QUEUE` and `OLLAMA_MAX_LOADED_MODELS` values that are appropriate for your hardware and preferences. See the [Ollama documentation](https://github.com/ollama/ollama/blob/main/docs/faq.md#how-does-ollama-handle-concurrent-requests).
|
||||
|
||||
### Supported Models
|
||||
|
||||
You must use a vision capable model with Frigate. Current model variants can be found [in their model library](https://ollama.com/library). Note that Frigate will not automatically download the model you specify in your config, Ollama will try to download the model but it may take longer than the timeout, it is recommended to pull the model beforehand by running `ollama pull your_model` on your Ollama server/Docker container. Note that the model specified in Frigate's config must match the downloaded model tag.
|
||||
|
||||
The following models are recommended:
|
||||
|
||||
| Model | Size | Recommended Features |
|
||||
| ----------------- | ------ | -------------------- |
|
||||
| `minicpm-v:8b` | 5.5 GB | Review Summary |
|
||||
| `qwen2.5vl:3b` | 3.2 GB | Review Summary |
|
||||
| `gemma3:4b` | 3.3 GB | All Features |
|
||||
| `llava-phi3:3.8b` | 2.9 GB | All Features |
|
||||
|
||||
:::note
|
||||
|
||||
You should have at least 8 GB of RAM available (or VRAM if running on GPU) to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.
|
||||
|
||||
:::
|
||||
|
||||
### Configuration
|
||||
|
||||
```yaml
|
||||
genai:
|
||||
provider: ollama
|
||||
base_url: http://localhost:11434
|
||||
model: minicpm-v:8b
|
||||
provider_options: # other Ollama client options can be defined
|
||||
keep_alive: -1
|
||||
```
|
||||
|
||||
## Google Gemini
|
||||
|
||||
Google Gemini has a free tier allowing [15 queries per minute](https://ai.google.dev/pricing) to the API, which is more than sufficient for standard Frigate usage.
|
||||
|
||||
### Supported Models
|
||||
|
||||
You must use a vision capable model with Frigate. Current model variants can be found [in their documentation](https://ai.google.dev/gemini-api/docs/models/gemini). At the time of writing, this includes `gemini-1.5-pro` and `gemini-1.5-flash`.
|
||||
|
||||
### Get API Key
|
||||
|
||||
To start using Gemini, you must first get an API key from [Google AI Studio](https://aistudio.google.com).
|
||||
|
||||
1. Accept the Terms of Service
|
||||
2. Click "Get API Key" from the right hand navigation
|
||||
3. Click "Create API key in new project"
|
||||
4. Copy the API key for use in your config
|
||||
|
||||
### Configuration
|
||||
|
||||
```yaml
|
||||
genai:
|
||||
provider: gemini
|
||||
api_key: "{FRIGATE_GEMINI_API_KEY}"
|
||||
model: gemini-1.5-flash
|
||||
```
|
||||
|
||||
## OpenAI
|
||||
|
||||
OpenAI does not have a free tier for their API. With the release of gpt-4o, pricing has been reduced and each generation should cost fractions of a cent if you choose to go this route.
|
||||
|
||||
### Supported Models
|
||||
|
||||
You must use a vision capable model with Frigate. Current model variants can be found [in their documentation](https://platform.openai.com/docs/models). At the time of writing, this includes `gpt-4o` and `gpt-4-turbo`.
|
||||
|
||||
### Get API Key
|
||||
|
||||
To start using OpenAI, you must first [create an API key](https://platform.openai.com/api-keys) and [configure billing](https://platform.openai.com/settings/organization/billing/overview).
|
||||
|
||||
### Configuration
|
||||
|
||||
```yaml
|
||||
genai:
|
||||
provider: openai
|
||||
api_key: "{FRIGATE_OPENAI_API_KEY}"
|
||||
model: gpt-4o
|
||||
```
|
||||
|
||||
:::note
|
||||
|
||||
To use a different OpenAI-compatible API endpoint, set the `OPENAI_BASE_URL` environment variable to your provider's API URL.
|
||||
|
||||
:::
|
||||
|
||||
## Azure OpenAI
|
||||
|
||||
Microsoft offers several vision models through Azure OpenAI. A subscription is required.
|
||||
|
||||
### Supported Models
|
||||
|
||||
You must use a vision capable model with Frigate. Current model variants can be found [in their documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models). At the time of writing, this includes `gpt-4o` and `gpt-4-turbo`.
|
||||
|
||||
### Create Resource and Get API Key
|
||||
|
||||
To start using Azure OpenAI, you must first [create a resource](https://learn.microsoft.com/azure/cognitive-services/openai/how-to/create-resource?pivots=web-portal#create-a-resource). You'll need your API key and resource URL, which must include the `api-version` parameter (see the example below). The model field is not required in your configuration as the model is part of the deployment name you chose when deploying the resource.
|
||||
|
||||
### Configuration
|
||||
|
||||
```yaml
|
||||
genai:
|
||||
provider: azure_openai
|
||||
base_url: https://example-endpoint.openai.azure.com/openai/deployments/gpt-4o/chat/completions?api-version=2023-03-15-preview
|
||||
api_key: "{FRIGATE_OPENAI_API_KEY}"
|
||||
```
|
77
docs/docs/configuration/genai/objects.md
Normal file
77
docs/docs/configuration/genai/objects.md
Normal file
@ -0,0 +1,77 @@
|
||||
---
|
||||
id: genai_objects
|
||||
title: Object Descriptions
|
||||
---
|
||||
|
||||
Generative AI can be used to automatically generate descriptive text based on the thumbnails of your tracked objects. This helps with [Semantic Search](/configuration/semantic_search) in Frigate to provide more context about your tracked objects. Descriptions are accessed via the _Explore_ view in the Frigate UI by clicking on a tracked object's thumbnail.
|
||||
|
||||
Requests for a description are sent off automatically to your AI provider at the end of the tracked object's lifecycle, or can optionally be sent earlier after a number of significantly changed frames, for example in use in more real-time notifications. Descriptions can also be regenerated manually via the Frigate UI. Note that if you are manually entering a description for tracked objects prior to its end, this will be overwritten by the generated response.
|
||||
|
||||
By default, descriptions will be generated for all tracked objects and all zones. But you can also optionally specify `objects` and `required_zones` to only generate descriptions for certain tracked objects or zones.
|
||||
|
||||
Optionally, you can generate the description using a snapshot (if enabled) by setting `use_snapshot` to `True`. By default, this is set to `False`, which sends the uncompressed images from the `detect` stream collected over the object's lifetime to the model. Once the object lifecycle ends, only a single compressed and cropped thumbnail is saved with the tracked object. Using a snapshot might be useful when you want to _regenerate_ a tracked object's description as it will provide the AI with a higher-quality image (typically downscaled by the AI itself) than the cropped/compressed thumbnail. Using a snapshot otherwise has a trade-off in that only a single image is sent to your provider, which will limit the model's ability to determine object movement or direction.
|
||||
|
||||
Generative AI object descriptions can also be toggled dynamically for a camera via MQTT with the topic `frigate/<camera_name>/object_descriptions/set`. See the [MQTT documentation](/integrations/mqtt/#frigatecamera_nameobjectdescriptionsset).
|
||||
|
||||
## Usage and Best Practices
|
||||
|
||||
Frigate's thumbnail search excels at identifying specific details about tracked objects – for example, using an "image caption" approach to find a "person wearing a yellow vest," "a white dog running across the lawn," or "a red car on a residential street." To enhance this further, Frigate’s default prompts are designed to ask your AI provider about the intent behind the object's actions, rather than just describing its appearance.
|
||||
|
||||
While generating simple descriptions of detected objects is useful, understanding intent provides a deeper layer of insight. Instead of just recognizing "what" is in a scene, Frigate’s default prompts aim to infer "why" it might be there or "what" it could do next. Descriptions tell you what’s happening, but intent gives context. For instance, a person walking toward a door might seem like a visitor, but if they’re moving quickly after hours, you can infer a potential break-in attempt. Detecting a person loitering near a door at night can trigger an alert sooner than simply noting "a person standing by the door," helping you respond based on the situation’s context.
|
||||
|
||||
## Custom Prompts
|
||||
|
||||
Frigate sends multiple frames from the tracked object along with a prompt to your Generative AI provider asking it to generate a description. The default prompt is as follows:
|
||||
|
||||
```
|
||||
Analyze the sequence of images containing the {label}. Focus on the likely intent or behavior of the {label} based on its actions and movement, rather than describing its appearance or the surroundings. Consider what the {label} is doing, why, and what it might do next.
|
||||
```
|
||||
|
||||
:::tip
|
||||
|
||||
Prompts can use variable replacements `{label}`, `{sub_label}`, and `{camera}` to substitute information from the tracked object as part of the prompt.
|
||||
|
||||
:::
|
||||
|
||||
You are also able to define custom prompts in your configuration.
|
||||
|
||||
```yaml
|
||||
genai:
|
||||
provider: ollama
|
||||
base_url: http://localhost:11434
|
||||
model: llava
|
||||
|
||||
objects:
|
||||
prompt: "Analyze the {label} in these images from the {camera} security camera. Focus on the actions, behavior, and potential intent of the {label}, rather than just describing its appearance."
|
||||
object_prompts:
|
||||
person: "Examine the main person in these images. What are they doing and what might their actions suggest about their intent (e.g., approaching a door, leaving an area, standing still)? Do not describe the surroundings or static details."
|
||||
car: "Observe the primary vehicle in these images. Focus on its movement, direction, or purpose (e.g., parking, approaching, circling). If it's a delivery vehicle, mention the company."
|
||||
```
|
||||
|
||||
Prompts can also be overridden at the camera level to provide a more detailed prompt to the model about your specific camera, if you desire.
|
||||
|
||||
```yaml
|
||||
cameras:
|
||||
front_door:
|
||||
objects:
|
||||
genai:
|
||||
enabled: True
|
||||
use_snapshot: True
|
||||
prompt: "Analyze the {label} in these images from the {camera} security camera at the front door. Focus on the actions and potential intent of the {label}."
|
||||
object_prompts:
|
||||
person: "Examine the person in these images. What are they doing, and how might their actions suggest their purpose (e.g., delivering something, approaching, leaving)? If they are carrying or interacting with a package, include details about its source or destination."
|
||||
cat: "Observe the cat in these images. Focus on its movement and intent (e.g., wandering, hunting, interacting with objects). If the cat is near the flower pots or engaging in any specific actions, mention it."
|
||||
objects:
|
||||
- person
|
||||
- cat
|
||||
required_zones:
|
||||
- steps
|
||||
```
|
||||
|
||||
### Experiment with prompts
|
||||
|
||||
Many providers also have a public facing chat interface for their models. Download a couple of different thumbnails or snapshots from Frigate and try new things in the playground to get descriptions to your liking before updating the prompt in Frigate.
|
||||
|
||||
- OpenAI - [ChatGPT](https://chatgpt.com)
|
||||
- Gemini - [Google AI Studio](https://aistudio.google.com)
|
||||
- Ollama - [Open WebUI](https://docs.openwebui.com/)
|
44
docs/docs/configuration/genai/review_summaries.md
Normal file
44
docs/docs/configuration/genai/review_summaries.md
Normal file
@ -0,0 +1,44 @@
|
||||
---
|
||||
id: genai_review
|
||||
title: Review Summaries
|
||||
---
|
||||
|
||||
Generative AI can be used to automatically generate structured summaries of review items. These summaries will show up in Frigate's native notifications as well as in the UI. Generative AI can also be used to take a collection of summaries over a period of time and provide a report, which may be useful to get a quick report of everything that happened while out for some amount of time.
|
||||
|
||||
Requests for a summary are requested automatically to your AI provider for alert review items when the activity has ended, they can also be optionally enabled for detections as well.
|
||||
|
||||
Generative AI review summaries can also be toggled dynamically for a camera via MQTT with the topic `frigate/<camera_name>/review_descriptions/set`. See the [MQTT documentation](/integrations/mqtt/#frigatecamera_namereviewdescriptionsset).
|
||||
|
||||
## Review Summary Usage and Best Practices
|
||||
|
||||
Review summaries provide structured JSON responses that are saved for each review item:
|
||||
|
||||
```
|
||||
- `scene` (string): A full description including setting, entities, actions, and any plausible supported inferences.
|
||||
- `confidence` (float): 0-1 confidence in the analysis.
|
||||
- `other_concerns` (list): List of user-defined concerns that may need additional investigation.
|
||||
- `potential_threat_level` (integer): 0, 1, or 2 as defined below.
|
||||
|
||||
Threat-level definitions:
|
||||
- 0 — Typical or expected activity for this location/time (includes residents, guests, or known animals engaged in normal activities, even if they glance around or scan surroundings).
|
||||
- 1 — Unusual or suspicious activity: At least one security-relevant behavior is present **and not explainable by a normal residential activity**.
|
||||
- 2 — Active or immediate threat: Breaking in, vandalism, aggression, weapon display.
|
||||
```
|
||||
|
||||
This will show in the UI as a list of concerns that each review item has along with the general description.
|
||||
|
||||
### Additional Concerns
|
||||
|
||||
Along with the concern of suspicious activity or immediate threat, you may have concerns such as animals in your garden or a gate being left open. These concerns can be configured so that the review summaries will make note of them if the activity requires additional review. For example:
|
||||
|
||||
```yaml
|
||||
review:
|
||||
genai:
|
||||
enabled: true
|
||||
additional_concerns:
|
||||
- animals in the garden
|
||||
```
|
||||
|
||||
## Review Reports
|
||||
|
||||
Along with individual review item summaries, Generative AI provides the ability to request a report of a given time period. For example, you can get a daily report while on a vacation of any suspicious activity or other concerns that may require review.
|
@ -398,6 +398,19 @@ review:
|
||||
# should be configured at the camera level.
|
||||
required_zones:
|
||||
- driveway
|
||||
# Optional: GenAI Review Summary Configuration
|
||||
genai:
|
||||
# Optional: Enable the GenAI review summary feature (default: shown below)
|
||||
enabled: False
|
||||
# Optional: Enable GenAI review summaries for alerts (default: shown below)
|
||||
alerts: True
|
||||
# Optional: Enable GenAI review summaries for detections (default: shown below)
|
||||
detections: False
|
||||
# Optional: Additional concerns that the GenAI should make note of (default: None)
|
||||
additional_concerns:
|
||||
- Animals in the garden
|
||||
# Optional: Preferred response language (default: English)
|
||||
preferred_language: English
|
||||
|
||||
# Optional: Motion configuration
|
||||
# NOTE: Can be overridden at the camera level
|
||||
@ -639,6 +652,9 @@ genai:
|
||||
base_url: http://localhost::11434
|
||||
# Required if gemini or openai
|
||||
api_key: "{FRIGATE_GENAI_API_KEY}"
|
||||
# Optional additional args to pass to the GenAI Provider (default: None)
|
||||
provider_options:
|
||||
keep_alive: -1
|
||||
|
||||
# Optional: Configuration for audio transcription
|
||||
# NOTE: only the enabled option can be overridden at the camera level
|
||||
|
@ -39,7 +39,7 @@ If you are enabling Semantic Search for the first time, be advised that Frigate
|
||||
|
||||
The [V1 model from Jina](https://huggingface.co/jinaai/jina-clip-v1) has a vision model which is able to embed both images and text into the same vector space, which allows `image -> image` and `text -> image` similarity searches. Frigate uses this model on tracked objects to encode the thumbnail image and store it in the database. When searching for tracked objects via text in the search box, Frigate will perform a `text -> image` similarity search against this embedding. When clicking "Find Similar" in the tracked object detail pane, Frigate will perform an `image -> image` similarity search to retrieve the closest matching thumbnails.
|
||||
|
||||
The V1 text model is used to embed tracked object descriptions and perform searches against them. Descriptions can be created, viewed, and modified on the Explore page when clicking on thumbnail of a tracked object. See [the Generative AI docs](/configuration/genai.md) for more information on how to automatically generate tracked object descriptions.
|
||||
The V1 text model is used to embed tracked object descriptions and perform searches against them. Descriptions can be created, viewed, and modified on the Explore page when clicking on thumbnail of a tracked object. See [the object description docs](/configuration/genai/objects.md) for more information on how to automatically generate tracked object descriptions.
|
||||
|
||||
Differently weighted versions of the Jina models are available and can be selected by setting the `model_size` config option as `small` or `large`:
|
||||
|
||||
|
@ -37,10 +37,23 @@ const sidebars: SidebarsConfig = {
|
||||
],
|
||||
Enrichments: [
|
||||
"configuration/semantic_search",
|
||||
"configuration/genai",
|
||||
"configuration/face_recognition",
|
||||
"configuration/license_plate_recognition",
|
||||
"configuration/bird_classification",
|
||||
{
|
||||
type: "category",
|
||||
label: "Generative AI",
|
||||
link: {
|
||||
type: "generated-index",
|
||||
title: "Generative AI",
|
||||
description: "Generative AI Features",
|
||||
},
|
||||
items: [
|
||||
"configuration/genai/genai_config",
|
||||
"configuration/genai/genai_review",
|
||||
"configuration/genai/genai_objects",
|
||||
],
|
||||
},
|
||||
],
|
||||
Cameras: [
|
||||
"configuration/cameras",
|
||||
|
Loading…
Reference in New Issue
Block a user