Miscellaneous Fixes (#21024)

* fix wording in reference config
* spacing tweaks
* make live view settings drawer scrollable
* clarify audio transcription docs
* change audio transcription icon to activity indicator when transcription is in progress

  the backend doesn't implement any kind of queueing for speech event transcription
* tracking details tweaks
  - Add attribute box overlay and area
  - Add score
  - Throttle swr revalidation during video component rerendering
* add mse codecs to console debug on errors
* add camera name
2026-04-28 23:06:13 +02:00 · 2025-11-24 07:34:56 -06:00
parent 2d8b6c8301
commit aa8b423b68
25 changed files with 592 additions and 390 deletions
--- a/docs/docs/configuration/audio_detectors.md
+++ b/docs/docs/configuration/audio_detectors.md
@@ -144,4 +144,10 @@ In order to use transcription and translation for past events, you must enable a

 The transcribed/translated speech will appear in the description box in the Tracked Object Details pane. If Semantic Search is enabled, embeddings are generated for the transcription text and are fully searchable using the description search type.

-Recorded `speech` events will always use a `whisper` model, regardless of the `model_size` config setting. Without a GPU, generating transcriptions for longer `speech` events may take a fair amount of time, so be patient.
+:::note
+
+Only one `speech` event may be transcribed at a time. Frigate does not automatically transcribe `speech` events or implement a queue for long-running transcription model inference.
+
+:::
+
+Recorded `speech` events will always use a `whisper` model, regardless of the `model_size` config setting. Without a supported Nvidia GPU, generating transcriptions for longer `speech` events may take a fair amount of time, so be patient.
--- a/docs/docs/configuration/reference.md
+++ b/docs/docs/configuration/reference.md
@@ -700,11 +700,11 @@ genai:
 # Optional: Configuration for audio transcription
 # NOTE: only the enabled option can be overridden at the camera level
 audio_transcription:
-  # Optional: Enable license plate recognition (default: shown below)
+  # Optional: Enable live and speech event audio transcription (default: shown below)
  enabled: False
-  # Optional: The device to run the models on (default: shown below)
+  # Optional: The device to run the models on for live transcription. (default: shown below)
  device: CPU
-  # Optional: Set the model size used for transcription. (default: shown below)
+  # Optional: Set the model size used for live transcription. (default: shown below)
  model_size: small
  # Optional: Set the language used for transcription translation. (default: shown below)
  # List of language codes: https://github.com/openai/whisper/blob/main/whisper/tokenizer.py#L10