llama.cpp
Open-source software library, written in C/C++, that performs inference on various large language models such as Llama.
The library uses the GGUF (GGML Universal File) binary file format to store tensors and model metadata.
https://github.com/ggerganov/llama.cpp
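To illustrate what GGUF stores, here is a minimal sketch of reading the fixed GGUF file header (magic bytes, format version, tensor count, and metadata key-value count, all little-endian, per the GGUF specification in the ggml repository). The synthetic header built at the end is for demonstration only, not a real model file.

```python
import struct

def read_gguf_header(data: bytes):
    """Parse the fixed-size GGUF header from the start of a file.

    Layout per the GGUF spec:
      magic              4 bytes   b"GGUF"
      version            uint32    little-endian (v3 at time of writing)
      tensor_count       uint64
      metadata_kv_count  uint64
    """
    magic, version, n_tensors, n_kv = struct.unpack_from("<4sIQQ", data, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return version, n_tensors, n_kv

# Build a small synthetic header to demonstrate the layout
# (291 tensors, 19 metadata key-value pairs are arbitrary example values).
header = struct.pack("<4sIQQ", b"GGUF", 3, 291, 19)
print(read_gguf_header(header))  # (3, 291, 19)
```

After the header, a real GGUF file continues with the metadata key-value pairs and tensor info blocks, followed by the tensor data itself.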
📄️ Get started
Simple steps to get started with llama.cpp.
📄️ llama-cpp-python
Python bindings for llama.cpp.
📄️ llama-server
Lightweight HTTP server that serves local models through an OpenAI-compatible API.
📄️ Troubleshooting
Frequently occurring problems and solutions.