Commit 50218b9

build : Add Moore Threads GPU support and update GitHub workflow for MUSA build (#3069)
* Update PATH for main/main-cuda container
* Add Dockerfile for musa, .dockerignore and update CI
* Add Moore Threads GPU Support in README.md and replace ./main with whisper-cli
* Forward GGML_CUDA/GGML_MUSA to cmake in Makefile
* Minor updates for PATH ENV in Dockerfiles
* Address comments

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
1 parent f9b2dfd commit 50218b9

File tree

7 files changed (+62, -7 lines changed)


.devops/main-cuda.Dockerfile (+3, -3)

@@ -13,8 +13,6 @@ WORKDIR /app
 ARG CUDA_DOCKER_ARCH=all
 # Set nvcc architecture
 ENV CUDA_DOCKER_ARCH=${CUDA_DOCKER_ARCH}
-# Enable cuBLAS
-ENV GGML_CUDA=1
 
 RUN apt-get update && \
     apt-get install -y build-essential libsdl2-dev wget cmake git \
@@ -25,7 +23,8 @@ ENV CUDA_MAIN_VERSION=12.3
 ENV LD_LIBRARY_PATH /usr/local/cuda-${CUDA_MAIN_VERSION}/compat:$LD_LIBRARY_PATH
 
 COPY .. .
-RUN make base.en
+# Enable cuBLAS
+RUN make base.en CMAKE_ARGS="-DGGML_CUDA=1"
 
 FROM ${BASE_CUDA_RUN_CONTAINER} AS runtime
 ENV CUDA_MAIN_VERSION=12.3
@@ -37,4 +36,5 @@ RUN apt-get update && \
     && rm -rf /var/lib/apt/lists/* /var/cache/apt/archives/*
 
 COPY --from=build /app /app
+ENV PATH=/app/build/bin:$PATH
 ENTRYPOINT [ "bash", "-c" ]
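The `ENV PATH=/app/build/bin:$PATH` line added to each image is what lets containers invoke `whisper-cli` by bare name instead of `./main`. A minimal sketch of the mechanism, using a throwaway directory and a stub binary (hypothetical paths, not the real image layout):

```shell
# Sketch: prepending a build output dir to PATH makes its binaries
# resolvable by bare name, which the updated Dockerfiles rely on.
demo=$(mktemp -d)
mkdir -p "$demo/build/bin"
printf '#!/bin/sh\necho stub-cli\n' > "$demo/build/bin/whisper-cli"
chmod +x "$demo/build/bin/whisper-cli"
# With build/bin on PATH, the bare name resolves:
out=$(PATH="$demo/build/bin:$PATH" whisper-cli)
echo "$out"   # prints: stub-cli
```

The same resolution happens inside the container, which is why the README examples can drop the `./main` prefix.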

.devops/main-musa.Dockerfile (+29, new file)

@@ -0,0 +1,29 @@
+ARG UBUNTU_VERSION=22.04
+# This needs to generally match the container host's environment.
+ARG MUSA_VERSION=rc3.1.1
+# Target the MUSA build image
+ARG BASE_MUSA_DEV_CONTAINER=mthreads/musa:${MUSA_VERSION}-devel-ubuntu${UBUNTU_VERSION}
+# Target the MUSA runtime image
+ARG BASE_MUSA_RUN_CONTAINER=mthreads/musa:${MUSA_VERSION}-runtime-ubuntu${UBUNTU_VERSION}
+
+FROM ${BASE_MUSA_DEV_CONTAINER} AS build
+WORKDIR /app
+
+RUN apt-get update && \
+    apt-get install -y build-essential libsdl2-dev wget cmake git \
+    && rm -rf /var/lib/apt/lists/* /var/cache/apt/archives/*
+
+COPY .. .
+# Enable muBLAS
+RUN make base.en CMAKE_ARGS="-DGGML_MUSA=1"
+
+FROM ${BASE_MUSA_RUN_CONTAINER} AS runtime
+WORKDIR /app
+
+RUN apt-get update && \
+    apt-get install -y curl ffmpeg wget cmake git \
+    && rm -rf /var/lib/apt/lists/* /var/cache/apt/archives/*
+
+COPY --from=build /app /app
+ENV PATH=/app/build/bin:$PATH
+ENTRYPOINT [ "bash", "-c" ]

.devops/main.Dockerfile (+1)

@@ -16,4 +16,5 @@ RUN apt-get update && \
     && rm -rf /var/lib/apt/lists/* /var/cache/apt/archives/*
 
 COPY --from=build /app /app
+ENV PATH=/app/build/bin:$PATH
 ENTRYPOINT [ "bash", "-c" ]

.dockerignore (+3, new file)

@@ -0,0 +1,3 @@
+build*/
+.github/
+.devops/

.github/workflows/docker.yml (+1)

@@ -18,6 +18,7 @@ jobs:
       matrix:
         config:
           - { tag: "main", dockerfile: ".devops/main.Dockerfile", platform: "linux/amd64" }
+          - { tag: "main-musa", dockerfile: ".devops/main-musa.Dockerfile", platform: "linux/amd64" }
           #TODO: the cuda image keeps failing - disable for now
           # https://github.com/ggerganov/whisper.cpp/actions/runs/11019444428/job/30602020339
           #- { tag: "main-cuda", dockerfile: ".devops/main-cuda.Dockerfile", platform: "linux/amd64" }

Makefile (+2, -2)

@@ -4,7 +4,7 @@
 
 .PHONY: build
 build:
-	cmake -B build
+	cmake -B build $(CMAKE_ARGS)
 	cmake --build build --config Release
 
 # download a few audio samples into folder "./samples":
@@ -41,7 +41,7 @@ samples:
 
 tiny.en tiny base.en base small.en small medium.en medium large-v1 large-v2 large-v3 large-v3-turbo:
 	bash ./models/download-ggml-model.sh $@
-	cmake -B build
+	cmake -B build $(CMAKE_ARGS)
 	cmake --build build --config Release
 	@echo ""
 	@echo "==============================================="
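The `$(CMAKE_ARGS)` forwarding above is plain GNU make variable expansion: a `CMAKE_ARGS=...` argument on the `make` command line is spliced into the `cmake -B build` recipe, which is how `RUN make base.en CMAKE_ARGS="-DGGML_MUSA=1"` in the Dockerfiles enables the GPU backend. A minimal sketch with a throwaway Makefile that echoes instead of invoking the real cmake (hypothetical target, not the project Makefile):

```shell
# Sketch: command-line make variables flow into the recipe unchanged.
tmp=$(mktemp -d)
printf 'build:\n\t@echo cmake -B build $(CMAKE_ARGS)\n' > "$tmp/Makefile"
# CMAKE_ARGS given on the command line is expanded inside the recipe:
make -s -C "$tmp" build CMAKE_ARGS="-DGGML_MUSA=1"   # prints: cmake -B build -DGGML_MUSA=1
```

With `CMAKE_ARGS` unset the expansion is empty and the recipe degrades to the old `cmake -B build`, so existing CPU builds are unaffected.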

README.md (+23, -2)

@@ -23,6 +23,7 @@ High-performance inference of [OpenAI's Whisper](https://github.com/openai/whisper)
 - [Efficient GPU support for NVIDIA](#nvidia-gpu-support)
 - [OpenVINO Support](#openvino-support)
 - [Ascend NPU Support](#ascend-npu-support)
+- [Moore Threads GPU Support](#moore-threads-gpu-support)
 - [C-style API](https://github.com/ggml-org/whisper.cpp/blob/master/include/whisper.h)
 
 Supported platforms:
@@ -381,6 +382,25 @@ Run the inference examples as usual, for example:
 - If you have trouble with Ascend NPU device, please create a issue with **[CANN]** prefix/tag.
 - If you run successfully with your Ascend NPU device, please help update the table `Verified devices`.
 
+## Moore Threads GPU support
+
+With Moore Threads cards the processing of the models is done efficiently on the GPU via muBLAS and custom MUSA kernels.
+First, make sure you have installed `MUSA SDK rc3.1.1`: https://developer.mthreads.com/sdk/download/musa?equipment=&os=&driverVersion=&version=rc3.1.1
+
+Now build `whisper.cpp` with MUSA support:
+
+```
+cmake -B build -DGGML_MUSA=1
+cmake --build build -j --config Release
+```
+
+or specify the architecture for your Moore Threads GPU. For example, if you have a MTT S80 GPU, you can specify the architecture as follows:
+
+```
+cmake -B build -DGGML_MUSA=1 -DMUSA_ARCHITECTURES="21"
+cmake --build build -j --config Release
+```
+
 ## FFmpeg support (Linux only)
 
 If you want to support more audio formats (such as Opus and AAC), you can turn on the `WHISPER_FFMPEG` build flag to enable FFmpeg integration.
@@ -425,6 +445,7 @@ We have two Docker images available for this project:
 
 1. `ghcr.io/ggml-org/whisper.cpp:main`: This image includes the main executable file as well as `curl` and `ffmpeg`. (platforms: `linux/amd64`, `linux/arm64`)
 2. `ghcr.io/ggml-org/whisper.cpp:main-cuda`: Same as `main` but compiled with CUDA support. (platforms: `linux/amd64`)
+3. `ghcr.io/ggml-org/whisper.cpp:main-musa`: Same as `main` but compiled with MUSA support. (platforms: `linux/amd64`)
 
 ### Usage
 
@@ -437,11 +458,11 @@ docker run -it --rm \
 docker run -it --rm \
   -v path/to/models:/models \
   -v path/to/audios:/audios \
-  whisper.cpp:main "./main -m /models/ggml-base.bin -f /audios/jfk.wav"
+  whisper.cpp:main "whisper-cli -m /models/ggml-base.bin -f /audios/jfk.wav"
 # transcribe an audio file in samples folder
 docker run -it --rm \
   -v path/to/models:/models \
-  whisper.cpp:main "./main -m /models/ggml-base.bin -f ./samples/jfk.wav"
+  whisper.cpp:main "whisper-cli -m /models/ggml-base.bin -f ./samples/jfk.wav"
 ```
 
 ## Installing with Conan
