Contact Details (optional)
No response
What feature are you requesting?
We already support GPU inference and embedding in the Kirin project, so we should also support GPU in this project. Furthermore, please keep in mind what I mentioned in the last meeting: we want combined CPU/GPU offload, not separate CPU-only and GPU-only modes.
https://medium.com/@aisuko/quantization-tech-of-llms-gguf-0342a08f082c
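To make the CPU/GPU offload request concrete, here is a minimal sketch of how a layer split could be planned. It is not taken from Kirin or from this project: `OffloadPlan`, `plan_offload`, and the equal-size-layer assumption are all hypothetical, chosen only to illustrate the idea of putting as many layers as fit on the GPU and leaving the rest on the CPU (similar in spirit to the `--n-gpu-layers` option in llama.cpp).

```python
from dataclasses import dataclass


@dataclass
class OffloadPlan:
    gpu_layers: int  # layers placed on the GPU
    cpu_layers: int  # layers that stay on the CPU


def plan_offload(total_layers: int, layer_bytes: int, vram_budget_bytes: int) -> OffloadPlan:
    """Hypothetical helper: fill the VRAM budget with layers, keep the rest on the CPU.

    Assumes every layer has the same size, which is a simplification.
    """
    if total_layers < 0 or layer_bytes <= 0 or vram_budget_bytes < 0:
        raise ValueError("layer counts and sizes must be non-negative, layer_bytes > 0")
    # Number of whole layers that fit in the VRAM budget.
    fit = vram_budget_bytes // layer_bytes
    gpu_layers = min(total_layers, fit)
    return OffloadPlan(gpu_layers=gpu_layers, cpu_layers=total_layers - gpu_layers)


MIB = 1024 * 1024
# Example: 32 layers of 200 MiB each with a 4 GiB VRAM budget.
plan = plan_offload(total_layers=32, layer_bytes=200 * MIB, vram_budget_bytes=4096 * MIB)
print(plan)
```

With this approach a single code path covers all three cases: a budget of zero gives CPU-only, a large enough budget gives GPU-only, and anything in between gives the mixed offload requested here.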