5 Commits

Author SHA1 Message Date
Nicolas Mowen
68f806bb61
Cleanup onnx detector (#20128)
* Cleanup onnx detector

* Fix

* Fix classification cropping

* Deprioritize openvino

* Send model type

* Use model type to decide if model can use full optimization

* Clenanup

* Cleanup
2025-09-18 15:12:09 -06:00
Nicolas Mowen
81d7c47129
Optimize OpenVINO and ONNX Model Runners (#20063)
* Use re-usable inference request to reduce CPU usage

* Share tensor

* Don't count performance

* Create openvino runner class

* Break apart onnx runner

* Add specific note about inability to use CUDA graphs for some models

* Adjust rknn to use RKNNRunner

* Use optimized runner

* Add support for non-complex models for CudaExecutionProvider

* Use core mask for rknn

* Correctly handle cuda input

* Cleanup

* Sort imports
2025-09-14 06:22:22 -06:00
baudneo
33f3ea3b59
Enrichments: Allow targeting a specific GPU ID (#19342)
2025-08-18 17:43:53 -06:00
idxlics
976863518b
Use HF_ENDPOINT env instead of hardcoding https://huggingface.co (#18036)
* Update jina_v1_embedding.py

* Update jina_v2_embedding.py
2025-05-04 19:38:17 -05:00
Josh Hawkins
d0e9bcbfdc
Add ability to use Jina CLIP V2 for semantic search (#16826)
* add wheels

* move extra index url to bottom

* config model option

* add postprocess

* fix config

* jina v2 embedding class

* use jina v2 in embeddings

* fix ov inference

* frontend

* update reference config

* revert device

* fix truncation

* return np tensors

* use correct embeddings from inference

* manual preprocess

* clean up

* docs

* lower batch size for v2 only

* docs clarity

* wording
2025-02-26 07:58:25 -07:00