INSTALL.md
The supported way to install Faiss is through conda. Stable releases are pushed regularly to the pytorch conda channel, as well as pre-release nightly builds.
To install the latest stable release:
```shell
# CPU-only version
$ conda install -c pytorch -c conda-forge faiss-cpu=1.14.1

# GPU(+CPU) version
$ conda install -c pytorch -c nvidia -c conda-forge faiss-gpu=1.14.1

# GPU(+CPU) version with NVIDIA cuVS
$ conda install -c pytorch -c nvidia -c rapidsai -c conda-forge libnvjitlink faiss-gpu-cuvs=1.14.1

# GPU(+CPU) version using AMD ROCm not yet available
```
The conda-forge channel is required for up-to-date dependencies (MKL on x86-64, OpenBLAS on ARM), which are not regularly updated in the default Anaconda channel.
For faiss-gpu, the nvidia channel is additionally required for CUDA, which is not published in the main anaconda channel.
For faiss-gpu-cuvs, the rapidsai, conda-forge and nvidia channels are required.
Nightly pre-release packages can be installed as follows:
```shell
# CPU-only version
$ conda install -c pytorch/label/nightly -c conda-forge faiss-cpu

# GPU(+CPU) version
$ conda install -c pytorch/label/nightly -c nvidia -c conda-forge faiss-gpu=1.14.1

# GPU(+CPU) version with NVIDIA cuVS (package built with CUDA 12.6)
$ conda install -c pytorch -c rapidsai -c rapidsai-nightly -c conda-forge -c nvidia pytorch/label/nightly::faiss-gpu-cuvs 'cuda-version=12.6'

# GPU(+CPU) version using AMD ROCm not yet available
```
In the above commands, adding pytorch-cuda=11 or pytorch-cuda=12 selects a specific CUDA version, if one is required.
A combination of versions that installs GPU Faiss with CUDA and Pytorch (as of 2024-05-15):
```shell
$ conda create --name faiss_1.8.0
$ conda activate faiss_1.8.0
$ conda install -c pytorch -c nvidia faiss-gpu=1.8.0 pytorch=*=*cuda* pytorch-cuda=11 numpy
```
Faiss can be built from source using CMake.
Faiss is supported on x86-64 machines on Linux, OSX, and Windows. It has been found to run on other platforms as well, see other platforms.
The basic requirements are:

- a C++17 compiler (with support for OpenMP version 2 or higher),
- a BLAS implementation (on Intel machines, Intel MKL is strongly recommended for best performance).

The optional requirements are:

- for GPU indices: nvcc and the CUDA toolkit,
- for AMD GPUs: AMD ROCm,
- for the python bindings: python 3, numpy, and swig.
Indications for specific configurations are available in the troubleshooting section of the wiki.
cuVS contains state-of-the-art implementations of several algorithms for running approximate nearest neighbors and clustering on the GPU. It is built on top of the RAPIDS RAFT library of high performance machine learning primitives. Building Faiss with cuVS enabled allows a user to choose between regular GPU implementations in Faiss and cuVS implementations for specific algorithms.
The libcuvs dependency should be installed via conda:
```shell
$ conda install -c rapidsai -c conda-forge -c nvidia libcuvs=26.02 'cuda-version=12.6'
```
For more ways to install cuVS 26.02, refer to the RAPIDS Installation Guide.
Intel(R) Scalable Vector Search (SVS) is a library for high-performance vector search. Building Faiss with SVS enabled allows using SVS implementations of graph-based indices (e.g., Vamana).
The SVS library will be automatically fetched and built by CMake if FAISS_ENABLE_SVS is set to ON.
```shell
$ cmake -B build .
```

This generates the system-dependent configuration/build files in the build/ subdirectory.
Several options can be passed to CMake, among which:

- `-DFAISS_ENABLE_GPU=OFF` in order to disable building GPU indices (possible values are `ON` and `OFF`),
- `-DFAISS_ENABLE_PYTHON=OFF` in order to disable building python bindings (possible values are `ON` and `OFF`),
- `-DFAISS_ENABLE_CUVS=ON` in order to use the NVIDIA cuVS implementations of the IVF-Flat, IVF-PQ and CAGRA GPU-accelerated indices (default is `OFF`, possible values are `ON` and `OFF`). Note: `-DFAISS_ENABLE_GPU` must be set to `ON` when enabling this option,
- `-DBUILD_TESTING=OFF` in order to disable building C++ tests,
- `-DBUILD_SHARED_LIBS=ON` in order to build a shared library (possible values are `ON` and `OFF`),
- `-DFAISS_ENABLE_C_API=ON` in order to enable building the C API (possible values are `ON` and `OFF`),
- `-DFAISS_ENABLE_SVS=ON` in order to enable the Intel(R) Scalable Vector Search (SVS) integration (default is `OFF`, possible values are `ON` and `OFF`). Note: this will download and build the SVS runtime library (`libsvs_runtime.so`). When installing the python package, this library is copied into the package directory; for C++ usage, ensure this library is in your library path,
- `-DCMAKE_BUILD_TYPE=Release` in order to enable generic compiler optimization options (enables `-O3` on gcc for instance),
- `-DFAISS_OPT_LEVEL=avx2` in order to enable the required compiler flags to generate code using optimized SIMD/vector instructions. Possible values, by increasing order of optimization, are: on x86-64, `generic`, `avx2`, `avx512`, and `avx512_spr` (for AVX-512 features available since Intel(R) Sapphire Rapids); on ARM, `generic` and `sve`,
- `-DFAISS_USE_LTO=ON` in order to enable Link-Time Optimization (default is `OFF`, possible values are `ON` and `OFF`),
- `-DBLA_VENDOR=Intel10_64_dyn -DMKL_LIBRARIES=/path/to/mkl/libs` to use the Intel MKL BLAS implementation, which is significantly faster than OpenBLAS (more information about the values for the `BLA_VENDOR` option can be found in the CMake docs),
- `-DCUDAToolkit_ROOT=/path/to/cuda-10.1` in order to hint at the path of the CUDA toolkit (for more information, see the CMake docs),
- `-DCMAKE_CUDA_ARCHITECTURES="75;72"` for specifying which GPU architectures to build against (see the CUDA docs to determine which architecture(s) you should pick),
- `-DFAISS_ENABLE_ROCM=ON` in order to enable building GPU indices for AMD GPUs; `-DFAISS_ENABLE_GPU` must be `ON` when using this option (possible values are `ON` and `OFF`),
- `-DPython_EXECUTABLE=/path/to/python3.7` in order to build a python interface for a different python than the default one (see the CMake docs).

```shell
$ make -C build -j faiss
```
This builds the C++ library (libfaiss.a by default, and libfaiss.so if
-DBUILD_SHARED_LIBS=ON was passed to CMake).
The -j option enables parallel compilation of multiple units, leading to a
faster build, but increasing the chances of running out of memory, in which case
it is recommended to set the -j option to a fixed value (such as -j4).
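Putting the configure and build steps together, a CPU-only build with AVX2 optimization might look like the following (the option values here are illustrative choices, not requirements):

```shell
# configure: disable GPU indices and tests, enable AVX2-optimized code
$ cmake -B build -DFAISS_ENABLE_GPU=OFF -DBUILD_TESTING=OFF \
      -DFAISS_OPT_LEVEL=avx2 -DCMAKE_BUILD_TYPE=Release .
# build the AVX2 target with a bounded number of parallel jobs
$ make -C build -j4 faiss_avx2
```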
If making use of optimization options, build the correct target before swigfaiss.

For AVX2:

```shell
$ make -C build -j faiss_avx2
```

For AVX512:

```shell
$ make -C build -j faiss_avx512
```

For AVX512 features available since Intel(R) Sapphire Rapids:

```shell
$ make -C build -j faiss_avx512_spr
```

This will ensure the creation of necessary files when building and installing the python package.
```shell
$ make -C build -j swigfaiss
$ (cd build/faiss/python && python setup.py install)
```
The first command builds the python bindings for Faiss, while the second one generates and installs the python package.
```shell
$ make -C build install
```
This will make the compiled library (either libfaiss.a or libfaiss.so on
Linux) available system-wide, as well as the C++ headers. This step is not
needed to install the python package only.
To run the whole test suite, make sure that cmake was invoked with
-DBUILD_TESTING=ON, and run:
```shell
$ make -C build test
```

To run the python test suite, run:

```shell
$ (cd build/faiss/python && python setup.py build)
$ PYTHONPATH="$(ls -d ./build/faiss/python/build/lib*/)" pytest tests/test_*.py
```
A basic usage example is available in
demos/demo_ivfpq_indexing.cpp.
It creates a small index, stores it and performs some searches. A normal runtime is around 20s. With a fast machine and Intel MKL's BLAS it runs in 2.5s.
It can be built with

```shell
$ make -C build demo_ivfpq_indexing
```

and subsequently run with

```shell
$ ./build/demos/demo_ivfpq_indexing
```
```shell
$ make -C build demo_ivfpq_indexing_gpu
$ ./build/demos/demo_ivfpq_indexing_gpu
```

This produces the GPU equivalent of the CPU demo_ivfpq_indexing. It also shows how to translate indexes from/to a GPU.
A longer example runs and evaluates Faiss on the SIFT1M dataset. To run it,
please download the ANN_SIFT1M dataset from http://corpus-texmex.irisa.fr/
and unzip it to the subdirectory sift1M at the root of the source
directory for this repository.
Then compile and run the following (after ensuring you have installed faiss):
```shell
$ make -C build demo_sift1M
$ ./build/demos/demo_sift1M
```
This is a demonstration of the high-level auto-tuning API. You can try setting a different index_key to find the indexing structure that gives the best performance.
The following script extends the demo_sift1M test to several types of indexes. This must be run from the root of the source directory for this repository:
```shell
$ mkdir tmp  # graphs of the output will be written here
$ python demos/demo_auto_tune.py
```
It will cycle through a few types of indexes and find optimal operating points. You can play around with the types of indexes.
The example above also runs on GPU. Edit demos/demo_auto_tune.py at line 100
with the values
```python
keys_to_test = keys_gpu
use_gpu = True
```
and you can run
```shell
$ python demos/demo_auto_tune.py
```
to test the GPU code.