Jupyter notebooks and Docker are two convenient ways to run Pathway code. Jupyter notebooks are useful for exploration and interactive development while Docker is used for deployment.
In notebooks, bash commands, code, and explanations are intertwined.
To run a notebook with Docker, you need to extract the shell commands, such as pip install, and have Docker run them separately.
⚠️ In Jupyter notebooks, the exclamation mark (!) runs a shell command from inside a code cell. These commands must be removed from the notebook and added to the Dockerfile so that the notebook can run as a regular Python file.
This tutorial walks through the steps needed to convert your Jupyter notebook so that it works with Docker.
The Docker deployment will be done using a Dockerfile.
This file contains the instructions used by Docker to build a container image.
Pathway comes with its own Docker image.
To use it, create a file called Dockerfile and reference the Pathway image with FROM:
FROM pathwaycom/pathway:latest
COPY . .
CMD [ "python", "./your-script.py" ]
You can also use a regular Python image; the dedicated article explains how.
In Jupyter notebooks, dependencies are installed from a code cell, using the exclamation mark (!) to run pip install as a shell command.
For example, suppose you want to install langchain, langchain_community, and langchain_openai.
In a Jupyter notebook, you would create a code cell like this:
!pip install langchain
!pip install langchain_community
!pip install langchain_openai
This cell would not work in a regular Python file. You need to remove those lines and install the dependencies via the Dockerfile.
There are two main approaches to managing dependencies in a Dockerfile:
- Using pip install commands directly.
- Using a requirements.txt file.

Using pip install commands
If you have a small number of dependencies, you can list the installation commands directly in the Dockerfile:
FROM pathwaycom/pathway:latest
RUN pip install langchain
RUN pip install langchain_community
RUN pip install langchain_openai
COPY . .
CMD [ "python", "./your-script.py" ]
Replace langchain, langchain_community, and langchain_openai with the actual libraries your code uses.

Using a requirements.txt file
For a larger number of dependencies, consider creating a requirements.txt file that lists them:
langchain
langchain_community
langchain_openai
Then, update your Dockerfile to install dependencies from this file:
FROM pathwaycom/pathway:latest
# Copy requirements file and install dependencies
COPY requirements.txt ./
RUN pip install -r ./requirements.txt
COPY . .
CMD [ "python", "./your-script.py" ]
Choose the method that best suits your project's complexity.
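If the notebook contains many pip install cells, collecting the package names by hand is error-prone. The following is a minimal sketch, not part of Pathway, that reads the notebook's JSON and lists the packages named in its pip install lines. The function name `pip_packages_from_notebook` is hypothetical, and the parsing assumes simple commands: tokens starting with `-` are skipped, so options that take a value (such as `-r file.txt`) are not handled.

```python
import json
import shlex


def pip_packages_from_notebook(notebook_path):
    """Return the packages named in `!pip install` lines, in order of first appearance."""
    with open(notebook_path, encoding="utf-8") as f:
        notebook = json.load(f)

    packages = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", [])
        # The notebook format stores a cell's source as a list of lines or as one string
        text = source if isinstance(source, str) else "".join(source)
        for line in text.splitlines():
            stripped = line.strip()
            if not stripped.startswith(("!pip install", "%pip install")):
                continue
            # Drop the leading "!" or "%", then the words "pip" and "install"
            for token in shlex.split(stripped[1:])[2:]:
                if token.startswith("-"):
                    continue  # simple flags such as -q or --upgrade
                if token not in packages:
                    packages.append(token)
    return packages
```

Writing the returned names to requirements.txt, one per line, gives a starting point that should still be reviewed by hand, for example to pin versions.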
As with dependencies, your notebook may contain other shell commands that need to be executed.
For example, suppose that your Jupyter notebook downloads data using the following cell:
!wget -nc https://your-data-add.com/data
This will not work if your Jupyter notebook is executed as a regular file. You need to remove this line and add the command to the Dockerfile:
FROM pathwaycom/pathway:latest
# Copy requirements file and install dependencies
COPY requirements.txt ./
RUN pip install -r ./requirements.txt
RUN wget -nc https://your-data-add.com/data
COPY . .
CMD [ "python", "./your-script.py" ]
⚠️ Note that the command does not have the exclamation mark (!) anymore.
You should do this step for each shell command in your Jupyter notebook.
You need to convert your .ipynb file to a regular .py Python file.
You can do this directly from JupyterLab via File -> Save and Export Notebook as... -> Executable Script.
Then remove any code that is specific to the interactive notebook environment (e.g., displaying visualizations), along with all the shell commands.
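When JupyterLab is not at hand, the conversion can be approximated with a short script. The sketch below is an illustration under stated assumptions, not an official tool: it copies the code cells of a notebook into a .py file and skips every line starting with `!` or `%`. The function name `notebook_to_script` is hypothetical, and the line-based filter would misfire on a `%` or `!` that starts a line inside a multi-line string, so the result should be reviewed.

```python
import json


def notebook_to_script(notebook_path, script_path):
    """Write the notebook's code cells to a .py file, skipping `!` and `%` lines.

    Returns the skipped lines so that they can be moved to the Dockerfile.
    """
    with open(notebook_path, encoding="utf-8") as f:
        notebook = json.load(f)

    kept, skipped = [], []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells are not part of the script
        source = cell.get("source", [])
        text = source if isinstance(source, str) else "".join(source)
        for line in text.splitlines():
            if line.lstrip().startswith(("!", "%")):
                skipped.append(line.strip())  # shell command or notebook magic
            else:
                kept.append(line)
        kept.append("")  # blank line between cells

    with open(script_path, "w", encoding="utf-8") as f:
        f.write("\n".join(kept))
    return skipped
```

The returned list contains the commands that were left out, which makes it easier to check that each one has a counterpart in the Dockerfile.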
To switch your example from static to streaming, you need to:
- Remove the pw.debug references.
- Set your input connectors to streaming mode (mode="streaming").
- Add pw.run().

To learn more about how to switch from batch to streaming, read our dedicated tutorial.
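As a rough illustration of these three changes, here is a minimal sketch of what a converted script might look like. The function name `run_streaming`, the schema with a single `value` column, and the paths passed in are assumptions made for the example, not part of the original notebook. Pathway is imported inside the function only so that the file can be loaded where Pathway is not installed; in the Docker image a top-level import works just as well.

```python
def run_streaming(input_dir, output_path):
    """Sum the `value` column of the CSV files in input_dir, in streaming mode."""
    import pathway as pw

    # Hypothetical schema: one integer column named "value"
    class InputSchema(pw.Schema):
        value: int

    # The connector reads in streaming mode instead of static mode
    table = pw.io.csv.read(input_dir, schema=InputSchema, mode="streaming")

    result = table.reduce(total=pw.reducers.sum(pw.this.value))

    # In the notebook, this step might have been pw.debug.compute_and_print(result);
    # an output connector replaces it
    pw.io.csv.write(result, output_path)

    # Start the computation; in streaming mode this call keeps running
    pw.run()
```

Calling `run_streaming("./data/", "./output.csv")` at the end of the script would start the pipeline; both paths are placeholders.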
Now that your Dockerfile and your Python files are ready, you can build and run the Docker image:
docker build -t my-pathway-app .
docker run -it --rm --name my-pathway-app my-pathway-app