Releases: ggml-org/llama.cpp

b5361

12 May 22:45
cf0a43b
llama-bench : add defrag-thold, check for invalid ranges (#13487)

b5360

12 May 20:49
f0d46ef
opencl: remove unnecessary assert for `add` (#13257)

b5359

12 May 15:27
de4c07f
clip : cap max image size 1024 for qwen vl model (#13478)

b5358

12 May 15:01
10d2af0
llama/ggml: add LLM training support (#10544)

* llama/ggml: add LLM training support

more compact progress bar

llama_save_model_to_file

llama_opt_param_filter

ggml_graph_dup force_grads

refactor ggml_opt, fix test-opt

* remove logits_all

* refactor CUDA implementation for ACC

* reset graph at beginning of opt period

b5357

12 May 14:14
064cc59
context : fix state io for memory-less contexts (#13470)

ggml-ci

b5356

12 May 14:05
91159ee
server : allow content to be null in oaicompat_completion_params_pars…

b5355

12 May 12:50
22cdab3
llama-bench : accept ranges for integer parameters (#13410)

b5354

12 May 12:24
a71a407
ggml-cpu: Integrate fp32=bf16xbf16 SME KleidiAI kernel (#13053)

* ggml-cpu: Integrate fp32=bf16xbf16 SME KleidiAI kernel

Signed-off-by: Dan Johansson <dan.johansson@arm.com>

* code review fixes

Signed-off-by: Dan Johansson <dan.johansson@arm.com>

* adds a comment that clarifies barrier usage

Signed-off-by: Dan Johansson <dan.johansson@arm.com>

---------

Signed-off-by: Dan Johansson <dan.johansson@arm.com>
Co-authored-by: Charles Xu <charles.xu@arm.com>

b5353

12 May 10:10
95e1888
CUDA: fix misaligned synchronization in FA (#13469)

b5352

12 May 10:07
df84919
ggml : add mrope kernel for metal (#13457)