python/1_GettingStarted/numpyVsCupy/README.md
This sample compares the performance of NumPy (CPU) and CuPy (GPU) for matrix multiplication. It times matrix dot products on both the CPU and the GPU to show the performance benefit of GPU acceleration for numerical computations.
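A minimal sketch of this kind of comparison is given here; it is not the sample's actual code. The helper names `time_numpy` and `time_cupy`, the small matrix size, and the warm-up run are assumptions made for illustration. The GPU path is skipped when CuPy or a CUDA device is unavailable.

```python
import time

import numpy as np

# CuPy is optional here: fall back to CPU-only if it cannot be used.
try:
    import cupy as cp

    if cp.cuda.runtime.getDeviceCount() < 1:
        cp = None
except Exception:
    cp = None


def time_numpy(a, b):
    """Multiply two matrices on the CPU and return (result, seconds)."""
    start = time.perf_counter()
    c = a @ b
    return c, time.perf_counter() - start


def time_cupy(a, b):
    """Multiply two matrices on the GPU and return (host result, seconds)."""
    a_gpu = cp.asarray(a)  # copy host arrays to the device
    b_gpu = cp.asarray(b)
    _ = a_gpu @ b_gpu  # warm-up run (assumption: excludes one-time setup cost)
    cp.cuda.get_current_stream().synchronize()

    start = time.perf_counter()
    c_gpu = a_gpu @ b_gpu
    # GPU work is asynchronous, so wait for it before stopping the clock.
    cp.cuda.get_current_stream().synchronize()
    elapsed = time.perf_counter() - start
    return cp.asnumpy(c_gpu), elapsed


n = 256  # hypothetical small size so the sketch runs quickly
rng = np.random.default_rng(0)
a = rng.random((n, n), dtype=np.float32)
b = rng.random((n, n), dtype=np.float32)

c_cpu, t_cpu = time_numpy(a, b)
print(f"NumPy: {t_cpu:.6f} s")

if cp is not None:
    c_gpu, t_gpu = time_cupy(a, b)
    print(f"CuPy:  {t_gpu:.6f} s")
else:
    print("CuPy or a CUDA device is not available; skipping the GPU run")
```

At small sizes such as this one, transfer and launch overhead can outweigh the GPU's advantage, so larger matrices are usually needed to see a speedup.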
Key calls used in the sample:
- cp.asarray()
- np.testing.assert_allclose()
- device.create_stream()
- stream.close() in try/finally blocks

Required packages:
- numpy
- cupy
- cuda-core

From cuda.core:
- Device() – Get the CUDA device object for a specific GPU
- device.create_stream() – Create an explicit CUDA stream
- stream.close() – Close the stream and clean up its resources

Install packages:
pip install -r requirements.txt
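The stream lifecycle from the cuda.core calls listed earlier (create with `device.create_stream()`, release with `stream.close()` in a try/finally block) could be wrapped as a context manager. This is a sketch, not the sample's code: `managed_stream` is a hypothetical helper, and the commented usage assumes the `Device` import path and `set_current()` call of a recent cuda-core release.

```python
from contextlib import contextmanager


@contextmanager
def managed_stream(device):
    """Yield a stream created on `device` and always close it afterwards.

    `device` is any object with a create_stream() method, such as a
    cuda.core Device.
    """
    stream = device.create_stream()
    try:
        yield stream
    finally:
        # Runs even if the body raises, so the stream is never leaked.
        stream.close()


# Assumed usage on a machine with a CUDA GPU (not executed here):
#
#   from cuda.core import Device  # older releases: cuda.core.experimental
#   device = Device(0)
#   device.set_current()
#   with managed_stream(device) as stream:
#       ...  # enqueue GPU work on `stream`
```

Taking the device as a parameter keeps the cleanup logic independent of the GPU, so it can be exercised with a stand-in object on machines without CUDA.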
Basic usage:
# Pre-steps:
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
# Run from the Python directory:
cd /path/to/numpyVsCupy/Python
python -m 1_GettingStarted.numpyVsCupy.numpyVsCupy
With custom parameters:
python -m 1_GettingStarted.numpyVsCupy.numpyVsCupy --n_size 5000
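A command-line option like the one passed above could be declared with `argparse` roughly as follows. This is a sketch under assumptions: `build_parser` and the description string are hypothetical, and the sample's real parser may differ.

```python
import argparse


def build_parser():
    """Build a parser with the matrix-size option (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="NumPy vs CuPy matrix multiplication benchmark"
    )
    parser.add_argument(
        "--n_size",
        "-n",
        type=int,
        default=4096,
        help="Size of the matrix (n * n) for benchmarking",
    )
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--n_size", "5000"])
print(args.n_size)
```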
--n_size, -n: Size of the matrix (n * n) for benchmarking (default: 4096)

Expected output:
Validation PASSED: NumPy and CuPy results match within tolerance
Demo completed successfully!
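The "Validation PASSED" message suggests a tolerance-based comparison of the two results. One way to sketch it with `np.testing.assert_allclose()` is shown here. The `validate` helper, the tolerance values, and the failure message are assumptions; both inputs are expected to be host (NumPy) arrays, for example a GPU result copied back with `cp.asnumpy()`.

```python
import numpy as np


def validate(cpu_result, gpu_result, rtol=1e-4, atol=1e-5):
    """Return True if the two results agree within the given tolerances.

    The rtol/atol defaults are illustrative; float32 matrix products on
    CPU and GPU rarely match bit for bit, so exact equality is too strict.
    """
    try:
        np.testing.assert_allclose(gpu_result, cpu_result, rtol=rtol, atol=atol)
    except AssertionError:
        # Hypothetical failure message, not taken from the sample.
        print("Validation FAILED: NumPy and CuPy results differ")
        return False
    print("Validation PASSED: NumPy and CuPy results match within tolerance")
    return True


# Stand-in data: a reference array and two perturbed copies.
reference = np.full((4, 4), 2.0)
ok = validate(reference, reference + 1e-6)   # tiny rounding-sized difference
bad = validate(reference, reference + 1.0)   # clearly different result
```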
- numpyVsCupy.py – Python implementation
- README.md – This file
- requirements.txt – Required packages