llama.cpp

Warning: This document was machine-translated and has not been reviewed by a human.

llama.cpp is an open-source library for running inference on large language models (LLMs), such as Llama.

The library uses the GGUF (GGML Universal File) binary file format to store model tensors and metadata.
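To make the file format more concrete, here is a minimal sketch of reading the fixed-size header at the start of a GGUF file. It assumes the little-endian layout used by GGUF version 2 and later (a 4-byte `GGUF` magic, a 32-bit version, then 64-bit tensor and metadata key-value counts). The function name `parse_gguf_header` and the dictionary keys are illustrative and not part of llama.cpp.

```python
import struct

GGUF_MAGIC = b"GGUF"   # first four bytes of every GGUF file
HEADER_SIZE = 24       # 4 (magic) + 4 (version) + 8 (tensors) + 8 (metadata)


def parse_gguf_header(data: bytes) -> dict:
    """Parse the fixed GGUF header from the first 24 bytes of a file.

    Assumption: little-endian file, GGUF version >= 2 (64-bit counts).
    """
    if len(data) < HEADER_SIZE:
        raise ValueError("need at least 24 bytes for a GGUF header")
    if data[:4] != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file: magic={data[:4]!r}")
    # "<" = little-endian without padding, I = uint32, Q = uint64
    version, tensor_count, kv_count = struct.unpack_from("<IQQ", data, 4)
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }


# Usage (the path is a placeholder for a model file you already have):
#   with open("model.gguf", "rb") as f:
#       print(parse_gguf_header(f.read(HEADER_SIZE)))
```

The metadata key-value pairs and tensor descriptions follow this header. Parsing them requires handling the typed values defined by the format, which this sketch leaves out.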

llama.cpp: Quick Start

Simple steps to get started with llama.cpp.

Python bindings for llama.cpp.

CLI for spinning up LLM servers.

Common problems and solutions.