Releases · ggml-org/llama.cpp
b5361
b5360
opencl: remove unnecessary assert for `add` (#13257)
b5359
clip : cap max image size at 1024 for Qwen VL models (#13478)
b5358
llama/ggml: add LLM training support (#10544)
* llama/ggml: add LLM training support
  - more compact progress bar
  - llama_save_model_to_file
  - llama_opt_param_filter
  - ggml_graph_dup force_grads
  - refactor ggml_opt, fix test-opt
* remove logits_all
* refactor CUDA implementation for ACC
* reset graph at beginning of opt period
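The commit message names `llama_save_model_to_file` and `llama_opt_param_filter`; everything else in the sketch below (the opt-params struct, `llama_opt_init`, `llama_opt_epoch`) is an assumption about what a training loop on this API might look like, not the confirmed interface of #10544. The load/free calls are the existing public llama.h API.

```cpp
// Sketch only: llama_save_model_to_file and llama_opt_param_filter are named
// in the commit message; the commented-out training calls are ASSUMED and may
// not match the final API of #10544.
#include "llama.h"

int main(int argc, char ** argv) {
    if (argc < 2) return 1;
    llama_backend_init();

    llama_model   * model = llama_model_load_from_file(argv[1], llama_model_default_params());
    llama_context * ctx   = llama_init_from_model(model, llama_context_default_params());

    // ASSUMED shape of the training setup: a per-tensor filter selects which
    // weights receive gradients (llama_opt_param_filter), then one optimizer
    // epoch runs per "opt period"; the commit notes the graph is reset at the
    // beginning of each period.
    //
    //   llama_opt_params oparams = /* defaults */;
    //   oparams.param_filter     = my_filter; // e.g. skip embeddings
    //   llama_opt_init(ctx, model, oparams);
    //   llama_opt_epoch(ctx, /* dataset, callbacks, ... */);

    // Named in the commit message: persist the trained weights back to disk.
    //   llama_save_model_to_file(model, "trained.gguf");

    llama_free(ctx);
    llama_model_free(model);
    llama_backend_free();
    return 0;
}
```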
b5357
context : fix state io for memory-less contexts (#13470)
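For context, this touches the session/state round-trip in llama.h; a minimal sketch using the public `llama_state_get_size` / `llama_state_get_data` / `llama_state_set_data` entry points (the memory-less-context case, where there is no KV cache to serialize, is what the fix covers):

```cpp
// Minimal state I/O round-trip; assumes `ctx` is a valid llama_context and
// elides error handling.
#include "llama.h"
#include <cstdint>
#include <vector>

static std::vector<uint8_t> save_state(llama_context * ctx) {
    std::vector<uint8_t> buf(llama_state_get_size(ctx));
    const size_t written = llama_state_get_data(ctx, buf.data(), buf.size());
    buf.resize(written); // a memory-less context produces a small, KV-free blob
    return buf;
}

static void restore_state(llama_context * ctx, const std::vector<uint8_t> & buf) {
    llama_state_set_data(ctx, buf.data(), buf.size());
}
```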
b5356
server : allow content to be null in oaicompat_completion_params_pars…
b5355
llama-bench : accept ranges for integer parameters (#13410)
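In practice this means integer flags that previously took only comma-separated lists can also take ranges, e.g. something like `llama-bench -n 16-64+16` expanding to `-n 16,32,48,64`; the exact range grammar (separator and step syntax) is defined in #13410, so treat this spelling as illustrative.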
b5354
ggml-cpu: Integrate fp32=bf16xbf16 SME KleidiAI kernel (#13053)
* ggml-cpu: Integrate fp32=bf16xbf16 SME KleidiAI kernel
* code review fixes
* add a comment that clarifies barrier usage

Signed-off-by: Dan Johansson <dan.johansson@arm.com>
Co-authored-by: Charles Xu <charles.xu@arm.com>
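The KleidiAI path is a build-time opt-in; assuming the existing `GGML_CPU_KLEIDIAI` CMake toggle, a build along the lines of `cmake -B build -DGGML_CPU_KLEIDIAI=ON` enables these kernels on Arm CPUs with SME support and falls back to the generic path elsewhere.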
b5353
CUDA: fix misaligned synchronization in FA (#13469)
b5352
ggml : add mrope kernel for metal (#13457)
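Here "mrope" is the multi-section rotary position embedding (M-RoPE) used by Qwen2-VL-style models, which splits the rotary dimensions across temporal, height, and width position components; this entry ports the existing kernel to Metal.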