* Use a reusable inference request to reduce CPU usage (see the OpenVINO runner sketch below)
* Share tensor
* Don't count performance
* Create openvino runner class
* Break apart onnx runner
* Add specific note about inability to use CUDA graphs for some models (see the CUDA graph sketch below)
* Adjust rknn to use RKNNRunner
* Use optimized runner
* Add support for non-complex models with CUDAExecutionProvider
* Use core mask for rknn (see the RKNNRunner sketch below)
* Correctly handle cuda input (see the IO binding sketch below)
* Cleanup
* Sort imports
* Move onnx runner
* Build out base embedding (see the embedding class sketch below)
* Convert text embedding to separate class
* Move image embedding to separate class
* Move LPR to separate class
* Remove mono embedding
* Simplify model downloading
* Reorganize jina v1 embeddings
* Cleanup
* Cleanup for review
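
A minimal sketch of what the reusable-request and shared-tensor commits can look like with the OpenVINO Python API; the `OpenVINORunner` name and its interface here are illustrative assumptions, not the project's actual class.

```python
import numpy as np
import openvino as ov


class OpenVINORunner:
    def __init__(self, model_path: str, device: str = "CPU") -> None:
        core = ov.Core()
        self.compiled = core.compile_model(model_path, device)
        # Create the inference request once and reuse it for every call,
        # instead of allocating a new one per inference (lower CPU usage).
        self.request = self.compiled.create_infer_request()

    def run(self, inputs: dict[str, np.ndarray]) -> list[np.ndarray]:
        # shared_memory=True wraps the existing numpy buffer in an ov.Tensor
        # rather than copying it; buffers must be C-contiguous for this.
        tensors = {
            name: ov.Tensor(np.ascontiguousarray(arr), shared_memory=True)
            for name, arr in inputs.items()
        }
        self.request.infer(tensors)
        return [
            self.request.get_output_tensor(i).data
            for i in range(len(self.compiled.outputs))
        ]
```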
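
For the CUDA graph and CUDAExecutionProvider commits, a hedged sketch of gating ONNX Runtime's `enable_cuda_graph` provider option on model complexity; `model_is_complex` is a hypothetical stand-in for whatever check the project actually uses. CUDA graphs require static input shapes and a fully GPU-resident graph, which is why some models cannot use them.

```python
import onnxruntime as ort


def make_cuda_session(model_path: str, model_is_complex: bool) -> ort.InferenceSession:
    cuda_options: dict[str, str] = {"device_id": "0"}
    if not model_is_complex:
        # CUDA graphs replay a captured kernel-launch sequence, so they are
        # only valid for models with static shapes and no CPU-fallback nodes.
        cuda_options["enable_cuda_graph"] = "1"
    return ort.InferenceSession(
        model_path,
        providers=[
            ("CUDAExecutionProvider", cuda_options),
            "CPUExecutionProvider",
        ],
    )
```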
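
One way to handle cuda input in ONNX Runtime terms is to stage the input in device memory as an `OrtValue` and run through IO binding, so inputs and outputs stay on the GPU until explicitly copied back. This is a sketch under that assumption, not the project's confirmed code.

```python
import numpy as np
import onnxruntime as ort


def run_with_cuda_input(
    session: ort.InferenceSession, input_name: str, array: np.ndarray
) -> list[np.ndarray]:
    # Copy the host array into CUDA device memory once, then bind it so the
    # session reads device memory directly instead of re-staging per call.
    device_value = ort.OrtValue.ortvalue_from_numpy(array, "cuda", 0)
    binding = session.io_binding()
    binding.bind_ortvalue_input(input_name, device_value)
    for output in session.get_outputs():
        # Let ONNX Runtime allocate the outputs on the GPU as well.
        binding.bind_output(output.name, "cuda")
    session.run_with_iobinding(binding)
    # Copy results back to host numpy arrays.
    return [value.numpy() for value in binding.get_outputs()]
```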
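
For the rknn commits, a minimal sketch of an `RKNNRunner` that pins inference to specific NPU cores via rknn-toolkit-lite2's `core_mask` argument; the class shape is an illustrative assumption.

```python
from rknnlite.api import RKNNLite


class RKNNRunner:
    def __init__(self, model_path: str, core_mask: int = RKNNLite.NPU_CORE_AUTO):
        self.rknn = RKNNLite()
        if self.rknn.load_rknn(model_path) != 0:
            raise RuntimeError(f"failed to load {model_path}")
        # core_mask selects which NPU core(s) run this model, e.g.
        # RKNNLite.NPU_CORE_0 or RKNNLite.NPU_CORE_0_1_2 on RK3588, letting
        # several models share the NPU without contending for the same core.
        if self.rknn.init_runtime(core_mask=core_mask) != 0:
            raise RuntimeError("failed to init RKNN runtime")

    def run(self, inputs: list) -> list:
        return self.rknn.inference(inputs=inputs)
```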
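
Finally, a hypothetical outline of the embedding split described by the base-embedding commits: one shared base class with text, image, and LPR variants as separate subclasses. All names and preprocessing details here are assumptions for illustration only.

```python
from abc import ABC, abstractmethod

import numpy as np


class BaseEmbedding(ABC):
    """Shared inference scaffolding; subclasses only define preprocessing."""

    def __init__(self, runner) -> None:
        self.runner = runner  # e.g. any of the runner classes sketched above

    @abstractmethod
    def preprocess(self, raw) -> dict[str, np.ndarray]:
        """Convert raw input (text or image) into named model inputs."""

    def __call__(self, raw) -> np.ndarray:
        return self.runner.run(self.preprocess(raw))[0]


class TextEmbedding(BaseEmbedding):
    def preprocess(self, raw: str) -> dict[str, np.ndarray]:
        ...  # tokenize into input_ids / attention_mask


class ImageEmbedding(BaseEmbedding):
    def preprocess(self, raw: np.ndarray) -> dict[str, np.ndarray]:
        ...  # resize/normalize into pixel_values


class LicensePlateRecognition(BaseEmbedding):
    def preprocess(self, raw: np.ndarray) -> dict[str, np.ndarray]:
        ...  # crop/normalize the plate region
```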