Releases · ggml-org/llama.cpp
b5361
b5360
opencl: remove unnecessary assert for `add` (#13257)
b5359
clip : cap max image size at 1024 for Qwen VL models (#13478)
b5358
llama/ggml: add LLM training support (#10544)
* llama/ggml: add LLM training support
  - more compact progress bar
  - llama_save_model_to_file
  - llama_opt_param_filter
  - ggml_graph_dup force_grads
  - refactor ggml_opt, fix test-opt
* remove logits_all
* refactor CUDA implementation for ACC
* reset graph at beginning of opt period
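The commit message names `llama_save_model_to_file` and `llama_opt_param_filter`; everything else in the sketch below (the opt-params struct, `llama_opt_init`, `llama_opt_epoch`) is an assumption about what a training loop on this API might look like, not the confirmed interface of #10544. The load/free calls are the existing public llama.h API.

```cpp
// Sketch only: llama_save_model_to_file and llama_opt_param_filter are named
// in the commit message; the commented-out training calls are ASSUMED and may
// not match the final API of #10544.
#include "llama.h"

int main(int argc, char ** argv) {
    if (argc < 2) return 1;
    llama_backend_init();

    llama_model   * model = llama_model_load_from_file(argv[1], llama_model_default_params());
    llama_context * ctx   = llama_init_from_model(model, llama_context_default_params());

    // ASSUMED shape of the training setup: a per-tensor filter selects which
    // weights receive gradients (llama_opt_param_filter), then one optimizer
    // epoch runs per "opt period"; the commit notes the graph is reset at the
    // beginning of each period.
    //
    //   llama_opt_params oparams = /* defaults */;
    //   oparams.param_filter     = my_filter; // e.g. skip embeddings
    //   llama_opt_init(ctx, model, oparams);
    //   llama_opt_epoch(ctx, /* dataset, callbacks, ... */);

    // Named in the commit message: persist the trained weights back to disk.
    //   llama_save_model_to_file(model, "trained.gguf");

    llama_free(ctx);
    llama_model_free(model);
    llama_backend_free();
    return 0;
}
```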
b5357
context : fix state io for memory-less contexts (#13470)
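For context, this touches the session/state round-trip in llama.h; a minimal sketch using the public `llama_state_get_size` / `llama_state_get_data` / `llama_state_set_data` entry points (the memory-less-context case, where there is no KV cache to serialize, is what the fix covers):

```cpp
// Minimal state I/O round-trip; assumes `ctx` is a valid llama_context and
// elides error handling.
#include "llama.h"
#include <cstdint>
#include <vector>

static std::vector<uint8_t> save_state(llama_context * ctx) {
    std::vector<uint8_t> buf(llama_state_get_size(ctx));
    const size_t written = llama_state_get_data(ctx, buf.data(), buf.size());
    buf.resize(written); // a memory-less context produces a small, KV-free blob
    return buf;
}

static void restore_state(llama_context * ctx, const std::vector<uint8_t> & buf) {
    llama_state_set_data(ctx, buf.data(), buf.size());
}
```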
b5356
server : allow content to be null in oaicompat_completion_params_pars…
b5355
llama-bench : accept ranges for integer parameters (#13410)
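In practice this means integer flags that previously took only comma-separated lists can also take ranges, e.g. something like `llama-bench -n 16-64+16` expanding to `-n 16,32,48,64`; the exact range grammar (separator and step syntax) is defined in #13410, so treat this spelling as illustrative.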
b5354
ggml-cpu: Integrate fp32=bf16xbf16 SME KleidiAI kernel (#13053)
* ggml-cpu: Integrate fp32=bf16xbf16 SME KleidiAI kernel
* code review fixes
* add a comment that clarifies barrier usage

Signed-off-by: Dan Johansson <dan.johansson@arm.com>
Co-authored-by: Charles Xu <charles.xu@arm.com>
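The KleidiAI path is a build-time opt-in; assuming the existing `GGML_CPU_KLEIDIAI` CMake toggle, a build along the lines of `cmake -B build -DGGML_CPU_KLEIDIAI=ON` enables these kernels on Arm CPUs with SME support and falls back to the generic path elsewhere.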
b5353
CUDA: fix misaligned synchronization in FA (#13469)
b5352
ggml : add mrope kernel for metal (#13457)
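Here "mrope" is the multi-section rotary position embedding (M-RoPE) used by Qwen2-VL-style models, which splits the rotary dimensions across temporal, height, and width position components; this entry ports the existing kernel to Metal.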