llama.cpp
Open-source software library, written in C/C++, that performs inference on various large language models such as Llama.
The library uses the GGUF (GGML Universal File) binary file format to store tensors and model metadata.
https://github.com/ggerganov/llama.cpp
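To illustrate what GGUF stores, here is a minimal sketch of reading the fixed GGUF file header (magic bytes, format version, tensor count, and metadata key-value count, all little-endian, per the GGUF specification in the ggml repository). The synthetic header built at the end is for demonstration only, not a real model file.

```python
import struct

def read_gguf_header(data: bytes):
    """Parse the fixed-size GGUF header from the start of a file.

    Layout per the GGUF spec:
      magic              4 bytes   b"GGUF"
      version            uint32    little-endian (v3 at time of writing)
      tensor_count       uint64
      metadata_kv_count  uint64
    """
    magic, version, n_tensors, n_kv = struct.unpack_from("<4sIQQ", data, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return version, n_tensors, n_kv

# Build a small synthetic header to demonstrate the layout
# (291 tensors, 19 metadata key-value pairs are arbitrary example values).
header = struct.pack("<4sIQQ", b"GGUF", 3, 291, 19)
print(read_gguf_header(header))  # (3, 291, 19)
```

After the header, a real GGUF file continues with the metadata key-value pairs and tensor info blocks, followed by the tensor data itself.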
📄️ Get started
Simple steps to get started with llama.cpp.
📄️ llama-cpp-python
Python bindings for llama.cpp.
📄️ llama-server
Lightweight HTTP server that serves local models through an OpenAI-compatible API.
📄️ Troubleshooting
Frequently occurring problems and solutions.