docs/getting_started/installation/cpu.s390x.inc.md
--8<-- [start:installation]
vLLM has experimental support for the s390x architecture on the IBM Z platform. For now, users must build from source to run natively on IBM Z.
Currently, the CPU implementation for the s390x architecture supports FP32, BF16, and FP16 data types.
--8<-- [end:installation] --8<-- [start:requirements]
- OS: Linux
- Compiler: gcc/g++ >= 14.0.0 or later with Command Line Tools
- Packages that must be built from source: torchvision, llvmlite, numba, pyarrow (for testing), opencv-headless

--8<-- [end:requirements] --8<-- [start:set-up-using-python]
--8<-- [end:set-up-using-python] --8<-- [start:pre-built-wheels]
Currently, there are no pre-built IBM Z CPU wheels.
--8<-- [end:pre-built-wheels] --8<-- [start:build-wheel-from-source]
Install the following packages from the package manager before building vLLM. For example, on RHEL 9.6:
```bash
dnf install -y \
    which procps findutils tar vim git gcc-toolset-14 gcc-toolset-14-binutils gcc-toolset-14-libatomic-devel zlib-devel \
    libjpeg-turbo-devel libtiff-devel libpng-devel libwebp-devel freetype-devel harfbuzz-devel \
    openssl-devel openblas openblas-devel autoconf automake libtool cmake numpy libsndfile \
    clang llvm-devel llvm-static clang-devel
```
Install Rust >= 1.80, which is needed to install the outlines-core and uvloop Python packages:
```bash
curl https://sh.rustup.rs -sSf | sh -s -- -y && \
    . "$HOME/.cargo/env"
```
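After installation, you can confirm the toolchain meets the >= 1.80 requirement by checking `rustc --version`. As a sketch, `version_ge` below is a hypothetical POSIX helper (not part of vLLM or rustup) that compares version strings with `sort -V`:

```bash
# version_ge A B succeeds when version A >= version B (uses GNU sort -V).
version_ge() { [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]; }

# In practice, feed it the version reported by: rustc --version | awk '{print $2}'
version_ge "1.82.0" "1.80" && echo "ok: 1.82.0 >= 1.80"
version_ge "1.79.0" "1.80" || echo "too old: 1.79.0 < 1.80"
```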
Execute the following commands to build and install vLLM from source.
!!! tip
    Build the following dependencies from source before building vLLM: torchvision, llvmlite, numba, llguidance, pyarrow, opencv-headless.
```bash
uv pip install -v \
    --extra-index-url https://download.pytorch.org/whl/cpu \
    --torch-backend auto \
    -r requirements/build/cpu.txt \
    -r requirements/cpu.txt && \
VLLM_TARGET_DEVICE=cpu python setup.py bdist_wheel && \
uv pip install dist/*.whl
```
??? console "pip"

    ```bash
    pip install -v \
        --extra-index-url https://download.pytorch.org/whl/cpu \
        -r requirements/build/cpu.txt \
        -r requirements/cpu.txt && \
    VLLM_TARGET_DEVICE=cpu python setup.py bdist_wheel && \
    pip install dist/*.whl
    ```
--8<-- [end:build-wheel-from-source] --8<-- [start:pre-built-images]
Currently, there are no pre-built IBM Z CPU images.
--8<-- [end:pre-built-images] --8<-- [start:build-image-from-source]
```bash
docker build -f docker/Dockerfile.s390x \
    --tag vllm-cpu-env .

# Launch OpenAI server
docker run --rm \
    --privileged=true \
    --shm-size=4g \
    -p 8000:8000 \
    -e VLLM_CPU_KVCACHE_SPACE=<KV cache space> \
    -e VLLM_CPU_OMP_THREADS_BIND=<CPU cores for inference> \
    vllm-cpu-env \
    --model meta-llama/Llama-3.2-1B-Instruct \
    --dtype float \
    <other vLLM OpenAI server arguments>
```
!!! tip
    An alternative to `--privileged=true` is `--cap-add SYS_NICE --security-opt seccomp=unconfined`.
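The two `-e` placeholders in the `docker run` command take concrete values. As a sketch (the numbers below are illustrative, not recommendations): `VLLM_CPU_KVCACHE_SPACE` is the KV cache size in GiB, and `VLLM_CPU_OMP_THREADS_BIND` is the list of CPU cores the inference threads are pinned to:

```bash
# Illustrative values only: 40 GiB of KV cache, inference threads bound
# to cores 0-31. Tune both to your machine.
export VLLM_CPU_KVCACHE_SPACE=40
export VLLM_CPU_OMP_THREADS_BIND=0-31
echo "KV cache: ${VLLM_CPU_KVCACHE_SPACE} GiB, cores: ${VLLM_CPU_OMP_THREADS_BIND}"
```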
--8<-- [end:build-image-from-source] --8<-- [start:extra-information] --8<-- [end:extra-information]