Get started
-
Download the latest release from https://github.com/ggerganov/llama.cpp/releases. For example, for Windows with an NVIDIA GPU (CUDA 12.4), download llama-b4458-bin-win-cuda-cu12.4-x64.zip and cudart-llama-bin-win-cu12.4-x64.zip.
-
Extract the files. If you downloaded the cudart archive, place its DLL files in the llama.cpp folder, alongside the executables.
-
Find and download the GGUF file(s) of the LLM(s) you want to run: https://huggingface.co/models?search=gguf.
For example: https://huggingface.co/lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF/tree/main
-
Open a command prompt in the llama.cpp folder and run the following command, replacing model.gguf with the path to the GGUF file you downloaded:
llama-cli -m model.gguf -p "You are a helpful assistant" -cnv
-
Enjoy!
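
The steps above can also be scripted. The following is a minimal Python sketch, not part of llama.cpp itself. The helper names (`extract_archives`, `is_gguf`, `build_command`, `run_chat`) are hypothetical, and it assumes that `llama-cli.exe` ends up directly in the extraction folder, which may differ between releases. The only format detail it relies on is that a GGUF file begins with the four ASCII bytes `GGUF`, so the check catches obvious mistakes such as a saved HTML page, but it does not validate the rest of the file.

```python
import subprocess
import zipfile
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # a GGUF file starts with these four bytes


def extract_archives(archives, target_dir):
    """Unpack every zip into one folder so the cudart DLLs sit next to the binaries."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    for archive in archives:
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
    return target


def is_gguf(model_path):
    """Return True if the file starts with the GGUF magic bytes."""
    with open(model_path, "rb") as f:
        return f.read(4) == GGUF_MAGIC


def build_command(exe, model_path, prompt="You are a helpful assistant"):
    """Build the command from the last step as an argument list."""
    return [str(exe), "-m", str(model_path), "-p", prompt, "-cnv"]


def run_chat(archives, target_dir, model_path):
    """Extract the release, sanity-check the model file, then start llama-cli."""
    folder = extract_archives(archives, target_dir)
    if not is_gguf(model_path):
        raise ValueError(f"{model_path} does not look like a GGUF file")
    # Assumption: the executable is at the top level of the extracted folder.
    exe = folder / "llama-cli.exe"
    return subprocess.run(build_command(exe, model_path), check=True)
```

For example, `run_chat(["llama-b4458-bin-win-cuda-cu12.4-x64.zip", "cudart-llama-bin-win-cu12.4-x64.zip"], "llama.cpp", "model.gguf")` would unpack both archives into a `llama.cpp` folder and start the same interactive session as the command shown earlier. Passing the arguments as a list avoids shell quoting problems with the prompt text.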