Mac X64 broken (not really surprised) #3116

Open
peardox opened this issue May 4, 2025 · 0 comments

peardox commented May 4, 2025

Not that bothered really but FYI

Really old MacBook Pro (Late 2013)

Built as

cmake -B build -DBINDINGS_FLAT=ON -DBUILD_SHARED_LIBS=ON -DGGML_METAL=ON -DGGML_BACKEND_DL=ON -DGGML_CPU_ALL_VARIANTS=ON

Note: using an unmerged PR

./build/bin/whisper-bench -m ~/models/ggml-base.en.bin
+++ BINDINGS_FLAT +++
load_backend: loaded BLAS backend from /Users/simon/src/whisper.cpp/./build/bin/libggml-blas.so
load_backend: loaded Metal backend from /Users/simon/src/whisper.cpp/./build/bin/libggml-metal.so
load_backend: loaded CPU backend from /Users/simon/src/whisper.cpp/./build/bin/libggml-cpu-haswell.so
whisper_init_from_file_with_params_no_state: loading model from '/Users/simon/models/ggml-base.en.bin'
whisper_init_with_params_no_state: use gpu = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw = 0
whisper_init_with_params_no_state: devices = 3
whisper_init_with_params_no_state: backends = 3
whisper_model_load: loading model
whisper_model_load: n_vocab = 51864
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 512
whisper_model_load: n_text_head = 8
whisper_model_load: n_text_layer = 6
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 1
whisper_model_load: qntvr = 0
whisper_model_load: type = 2 (base)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: n_langs = 99
whisper_model_load: Metal total size = 147.37 MB
whisper_model_load: model size = 147.37 MB
whisper_backend_init_gpu: using Metal backend
ggml_metal_init: allocating
ggml_metal_init: found device: Intel Iris Graphics
ggml_metal_init: picking default device: Intel Iris Graphics
ggml_metal_load_library: using embedded metal library
ggml_metal_init: GPU name: Intel Iris Graphics
ggml_metal_init: GPU family: MTLGPUFamilyCommon2 (3002)
ggml_metal_init: simdgroup reduction = false
ggml_metal_init: simdgroup matrix mul. = false
ggml_metal_init: has residency sets = false
ggml_metal_init: has bfloat = false
ggml_metal_init: use bfloat = false
ggml_metal_init: hasUnifiedMemory = true
ggml_metal_init: recommendedMaxWorkingSetSize = 1610.61 MB
ggml_metal_init: skipping kernel_soft_max_f16 (not supported)
ggml_metal_init: skipping kernel_soft_max_f16_4 (not supported)
ggml_metal_init: skipping kernel_soft_max_f32 (not supported)
ggml_metal_init: skipping kernel_soft_max_f32_4 (not supported)
ggml_metal_init: skipping kernel_get_rows_bf16 (not supported)
ggml_metal_init: skipping kernel_rms_norm (not supported)
ggml_metal_init: skipping kernel_l2_norm (not supported)
ggml_metal_init: skipping kernel_group_norm (not supported)
ggml_metal_init: error: load pipeline error: Error Domain=CompilerError Code=2 "Compiler encountered an internal error" UserInfo={NSLocalizedDescription=Compiler encountered an internal error}
ggml_backend_metal_device_init: error: failed to allocate context
whisper_backend_init_gpu: failed to initialize Metal backend
whisper_backend_init: using BLAS backend
whisper_init_state: kv self size = 6.29 MB
whisper_init_state: kv cross size = 18.87 MB
whisper_init_state: kv pad size = 3.15 MB
whisper_init_state: compute buffer (conv) = 16.26 MB
whisper_init_state: compute buffer (encode) = 85.86 MB
whisper_init_state: compute buffer (cross) = 4.65 MB
whisper_init_state: compute buffer (decode) = 96.35 MB

system_info: n_threads = 4 / 4 | WHISPER : COREML = 0 | OPENVINO = 0 | Metal : EMBED_LIBRARY = 1 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | BMI2 = 1 | ACCELERATE = 1 | AARCH64_REPACK = 1 |

whisper_print_timings: load time = 263.02 ms
whisper_print_timings: fallbacks = 0 p / 0 h
whisper_print_timings: mel time = 0.00 ms
whisper_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)
whisper_print_timings: encode time = 2357.60 ms / 1 runs ( 2357.60 ms per run)
whisper_print_timings: decode time = 1939.51 ms / 256 runs ( 7.58 ms per run)
whisper_print_timings: batchd time = 1191.79 ms / 320 runs ( 3.72 ms per run)
whisper_print_timings: prompt time = 10549.91 ms / 4096 runs ( 2.58 ms per run)
whisper_print_timings: total time = 16040.46 ms
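
The important part of the log above is easy to miss: Metal is picked, pipeline compilation fails, and whisper falls back to BLAS. Here is a minimal Python sketch for pulling that sequence out of a saved log. The function name `summarize_backends` and both regexes are illustrative assumptions derived only from the lines in this log; they are not part of whisper.cpp.

```python
import re

# Matches lines such as "whisper_backend_init: using BLAS backend"
# and "whisper_backend_init_gpu: using Metal backend".
USING_RE = re.compile(r"^(whisper_backend_init\w*): using (\w+) backend")
# Matches lines such as
# "whisper_backend_init_gpu: failed to initialize Metal backend".
FAILED_RE = re.compile(r"failed to initialize (\w+) backend")


def summarize_backends(log_text):
    """Return (attempted, failed, final) backend names found in a log."""
    attempted, failed = [], []
    for line in log_text.splitlines():
        line = line.strip()
        m = USING_RE.match(line)
        if m:
            attempted.append(m.group(2))
            continue
        m = FAILED_RE.search(line)
        if m:
            failed.append(m.group(1))
    # The backend actually in use is the last one attempted that did not fail.
    final = next((b for b in reversed(attempted) if b not in failed), None)
    return attempted, failed, final


# Four lines copied from the log in this issue.
SAMPLE = """\
whisper_backend_init_gpu: using Metal backend
ggml_backend_metal_device_init: error: failed to allocate context
whisper_backend_init_gpu: failed to initialize Metal backend
whisper_backend_init: using BLAS backend
"""

if __name__ == "__main__":
    attempted, failed, final = summarize_backends(SAMPLE)
    print("attempted:", attempted)
    print("failed:   ", failed)
    print("final:    ", final)
```

On the sample lines this reports Metal as attempted and failed, with BLAS as the backend that ends up being used.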

I guess building without Metal will help

...

Nope - can't seem to disable Metal - ah well, remove libggml-metal?
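
A rough sketch of that workaround in Python, moving the Metal library out of the way rather than deleting it so it can be put back later. It assumes the loader only picks up backend libraries sitting in `build/bin`, which is what the `load_backend` lines in the log suggest but is not confirmed here. The helper name `set_aside_backend` and the stash directory are hypothetical.

```python
import shutil
import sys
from pathlib import Path


def set_aside_backend(bin_dir, stash_dir, pattern="libggml-metal*"):
    """Move matching backend libraries out of bin_dir; return their new paths."""
    bin_dir, stash_dir = Path(bin_dir), Path(stash_dir)
    stash_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for lib in sorted(bin_dir.glob(pattern)):
        target = stash_dir / lib.name
        shutil.move(str(lib), str(target))
        moved.append(target)
    return moved


if __name__ == "__main__":
    # Example: python set_aside.py build/bin build/disabled-backends
    if len(sys.argv) == 3:
        for path in set_aside_backend(sys.argv[1], sys.argv[2]):
            print("moved to", path)
    else:
        print("usage: set_aside.py <bin_dir> <stash_dir>")
```

Moving the file back into `build/bin` undoes the change.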

Yep...

whisper_backend_init: using BLAS backend
whisper_init_state: kv self size = 6.29 MB
whisper_init_state: kv cross size = 18.87 MB
whisper_init_state: kv pad size = 3.15 MB
whisper_init_state: compute buffer (conv) = 16.26 MB
whisper_init_state: compute buffer (encode) = 85.86 MB
whisper_init_state: compute buffer (cross) = 4.65 MB
whisper_init_state: compute buffer (decode) = 96.35 MB

system_info: n_threads = 4 / 4 | WHISPER : COREML = 0 | OPENVINO = 0 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | BMI2 = 1 | ACCELERATE = 1 | AARCH64_REPACK = 1 |

whisper_print_timings: load time = 155.85 ms
whisper_print_timings: fallbacks = 0 p / 0 h
whisper_print_timings: mel time = 0.00 ms
whisper_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)
whisper_print_timings: encode time = 2270.11 ms / 1 runs ( 2270.11 ms per run)
whisper_print_timings: decode time = 1889.95 ms / 256 runs ( 7.38 ms per run)
whisper_print_timings: batchd time = 1188.71 ms / 320 runs ( 3.71 ms per run)
whisper_print_timings: prompt time = 10139.99 ms / 4096 runs ( 2.48 ms per run)
whisper_print_timings: total time = 15490.28 ms
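
Both runs ended up on the BLAS backend (the first after Metal failed to initialize, the second with the Metal library removed), so the timings should be close. A small Python sketch for comparing them stage by stage follows. The timing lines are copied from the two logs in this issue; the names `parse_timings` and `compare` are illustrative, not part of whisper.cpp. These are single runs, so small differences may just be noise.

```python
import re

# Matches e.g. "whisper_print_timings: encode time = 2357.60 ms / 1 runs (...)".
# The "fallbacks" line has no "time" field and is skipped.
TIMING_RE = re.compile(r"whisper_print_timings:\s+(\w+)\s+time\s*=\s*([\d.]+) ms")


def parse_timings(log_text):
    """Map each stage name (load, encode, ...) to its time in milliseconds."""
    return {m.group(1): float(m.group(2)) for m in TIMING_RE.finditer(log_text)}


def compare(before, after):
    """Return (stage, before_ms, after_ms, delta_ms) for stages in both runs."""
    return [
        (stage, before[stage], after[stage], after[stage] - before[stage])
        for stage in before
        if stage in after
    ]


# First run: Metal library present, init failed, fell back to BLAS.
WITH_METAL_LIB = """\
whisper_print_timings: load time = 263.02 ms
whisper_print_timings: fallbacks = 0 p / 0 h
whisper_print_timings: mel time = 0.00 ms
whisper_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)
whisper_print_timings: encode time = 2357.60 ms / 1 runs ( 2357.60 ms per run)
whisper_print_timings: decode time = 1939.51 ms / 256 runs ( 7.58 ms per run)
whisper_print_timings: batchd time = 1191.79 ms / 320 runs ( 3.72 ms per run)
whisper_print_timings: prompt time = 10549.91 ms / 4096 runs ( 2.58 ms per run)
whisper_print_timings: total time = 16040.46 ms
"""

# Second run: Metal library removed, BLAS used directly.
WITHOUT_METAL_LIB = """\
whisper_print_timings: load time = 155.85 ms
whisper_print_timings: fallbacks = 0 p / 0 h
whisper_print_timings: mel time = 0.00 ms
whisper_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)
whisper_print_timings: encode time = 2270.11 ms / 1 runs ( 2270.11 ms per run)
whisper_print_timings: decode time = 1889.95 ms / 256 runs ( 7.38 ms per run)
whisper_print_timings: batchd time = 1188.71 ms / 320 runs ( 3.71 ms per run)
whisper_print_timings: prompt time = 10139.99 ms / 4096 runs ( 2.48 ms per run)
whisper_print_timings: total time = 15490.28 ms
"""

if __name__ == "__main__":
    rows = compare(parse_timings(WITH_METAL_LIB), parse_timings(WITHOUT_METAL_LIB))
    for stage, before_ms, after_ms, delta_ms in rows:
        print(f"{stage:>7} {before_ms:10.2f} {after_ms:10.2f} {delta_ms:+9.2f}")
```

The total differs by about 550 ms (roughly 3%), part of which is the shorter load time when the Metal library is not there to be probed.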
