Commit Graph

4 Commits

Author SHA1 Message Date
Nicolas Mowen
81d7c47129
Optimize OpenVINO and ONNX Model Runners (#20063)
* Use re-usable inference request to reduce CPU usage

* Share tensor

* Don't count performance

* Create openvino runner class

* Break apart onnx runner

* Add specific note about inability to use CUDA graphs for some models

* Adjust rknn to use RKNNRunner

* Use optimized runner

* Add support for non-complex models for CudaExecutionProvider

* Use core mask for rknn

* Correctly handle cuda input

* Cleanup

* Sort imports
2025-09-14 06:22:22 -06:00
baudneo
33f3ea3b59
Enrichments: Allow targeting a specific GPU ID (#19342)
2025-08-18 17:43:53 -06:00
idxlics
976863518b
Use HF_ENDPOINT env instead of hardcoding https://huggingface.co (#18036)
* Update jina_v1_embedding.py

* Update jina_v2_embedding.py
2025-05-04 19:38:17 -05:00
Nicolas Mowen
c736b1dae5
Refactor ONNX embedding class to use a base class and type-specific classes (#16703)
* Move onnx runner

* Build out base embedding

* Convert text embedding to separate class

* Move image embedding to separate

* Move LPR to separate class

* Remove mono embedding

* Simplify model downloading

* Reorganize jina v1 embeddings

* Cleanup

* Cleanup for review
2025-02-20 10:17:07 -06:00