airflow-core/docs/installation/index.rst
.. Licensed to the Apache Software Foundation (ASF) under one
   or more contributor license agreements.  See the NOTICE file
   distributed with this work for additional information
   regarding copyright ownership.  The ASF licenses this file
   to you under the Apache License, Version 2.0 (the
   "License"); you may not use this file except in compliance
   with the License.  You may obtain a copy of the License at

..   http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
   software distributed under the License is distributed on an
   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
   KIND, either express or implied.  See the License for the
   specific language governing permissions and limitations
   under the License.
.. contents::
  :local:
.. toctree::
    :maxdepth: 1
    :caption: Installation
    :hidden:

    Prerequisites <prerequisites>
    Dependencies <dependencies>
    Supported versions <supported-versions>
    Installing from sources <installing-from-sources>
    Installing from PyPI <installing-from-pypi>
    Setting up the database <setting-up-the-database>
    Upgrading <upgrading>
    Upgrading to Airflow 3 <upgrading_to_airflow3>
This page describes installation options that you might use when considering how to install Airflow®. Airflow consists of many components, often distributed among many physical or virtual machines, so installing Airflow can be quite complex, depending on the options you choose.
You should also check out the :doc:`Prerequisites <prerequisites>` that must be fulfilled when installing Airflow,
as well as :doc:`Supported versions <supported-versions>` to learn about the policies for supporting
Airflow, Python and Kubernetes.
Airflow requires additional :doc:`Dependencies <dependencies>` to be installed, which can be done
via extras and providers.
When you install Airflow, you need to :doc:`set up the database <setting-up-the-database>`, which must
also be kept updated when Airflow is upgraded.
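For instance, once the database connection is configured, the schema is created and kept up to date with the ``airflow db`` commands. A minimal sketch (command names can differ between Airflow versions; ``airflow db migrate`` superseded the older ``db init``/``db upgrade`` commands in Airflow 2.7):

.. code-block:: bash

    # Check that Airflow can reach the configured metadata database
    airflow db check

    # Create the schema, or migrate it to the current Airflow version
    airflow db migrate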
Local start for development and testing
'''''''''''''''''''''''''''''''''''''''
Do you just want to try Apache Airflow without all the production complexity? If you have ``pipx`` installed,
you can run Airflow directly from PyPI with the command below:
.. code-block:: bash

    pipx run apache-airflow standalone
Alternatively, you can do the same with Astral's ``uv``:
.. code-block:: bash

    uvx apache-airflow standalone
This starts a minimal setup with an auto-generated admin password and a SQLite database, so you can start using Airflow right away. It is a great way to get familiar with Airflow and try it out without setting up a complex environment.

Note that standalone mode is not intended for production use; it is simply an easy way to get started with Airflow for local development.
Using released sources
''''''''''''''''''''''
More details: :doc:`installing-from-sources`
**When this option works best**

* Apache Airflow is one of the projects that belong to the `Apache Software Foundation <https://www.apache.org/>`__.
  It is a requirement for all ASF projects that they can be installed using official sources released via
  `Official Apache Downloads <https://dlcdn.apache.org/>`__.
* This is the best choice if you need to `verify the integrity and provenance of the software <https://www.apache.org/dyn/closer.cgi#verify>`__, as sketched below.
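For illustration, verifying an ASF source release typically follows the standard Apache pattern: import the release managers' public keys, then check the GPG signature and SHA512 checksum of the download. A minimal sketch, assuming the standard Apache download layout (``<version>`` is a placeholder for the release you downloaded):

.. code-block:: bash

    # Import the release managers' public keys published for the project
    curl -fsSL https://downloads.apache.org/airflow/KEYS | gpg --import

    # Verify the GPG signature of the downloaded source tarball
    gpg --verify apache-airflow-<version>-source.tar.gz.asc apache-airflow-<version>-source.tar.gz

    # Verify the SHA512 checksum
    sha512sum --check apache-airflow-<version>-source.tar.gz.sha512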
**Intended users**

**What are you expected to handle**

* You are responsible for setting up the database, creating and managing the database schema with ``airflow db`` commands,
  automated startup and recovery, maintenance, cleanup and upgrades of Airflow and the Airflow Providers.

**What Apache Airflow Community provides for that method**
* You have `instructions <https://github.com/apache/airflow/blob/main/INSTALL>`__ on how to build the software, but due to the various environments
  and tools you might want to use, you should expect problems specific to your deployment and environment
  that you will have to diagnose and solve.
**Where to ask for help**

* The ``#user-troubleshooting`` channel on Slack can be used for quick general troubleshooting questions. Use
  `GitHub discussions <https://github.com/apache/airflow/discussions>`__ if you are looking for a longer discussion and have more information to share.
* The ``#user-best-practices`` channel on Slack can be used to ask for and share best practices on using and deploying Airflow.
* If you can describe a reproducible problem with the Airflow software, you can open an issue at `GitHub issues <https://github.com/apache/airflow/issues>`__.
* If you want to contribute back to Airflow, use the ``#contributors`` Slack channel, which is dedicated to building Airflow itself.
Using PyPI
''''''''''
More details: :doc:`/installation/installing-from-pypi`
**When this option works best**
This installation method is useful when you are not familiar with containers and Docker, want to install Apache Airflow on physical or virtual machines, and are used to installing and running software using custom deployment mechanisms.
The only officially supported mechanism of installation is via pip using constraint mechanisms. The constraint
files are managed by Apache Airflow release managers to make sure that you can repeatably install Airflow from PyPI with all Providers and
required dependencies.
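For example, a typical constrained installation pins both the Airflow version and the Python version used to select the constraint file. A sketch, assuming Airflow 3.0.2 on Python 3.9 (substitute the versions you actually use):

.. code-block:: bash

    AIRFLOW_VERSION=3.0.2
    PYTHON_VERSION=3.9

    # Install Airflow pinned against the matching constraint file
    pip install "apache-airflow==${AIRFLOW_VERSION}" \
      --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"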
In the case of a PyPI installation, you can also verify the integrity and provenance of the packages downloaded from PyPI, as described on the installation page. However, the software you download from PyPI is pre-built, so you can install it without building it from sources yourself.
**Intended users**
**What are you expected to handle**

* You are responsible for setting up the database, creating and managing the database schema with ``airflow db`` commands,
  automated startup and recovery, maintenance, cleanup and upgrades of Airflow and Airflow Providers.

**What Apache Airflow Community provides for that method**
* You have :doc:`/installation/installing-from-pypi` on how to install the software, but due to the various environments
  and tools you might want to use, you should expect problems specific to your deployment and environment
  that you will have to diagnose and solve.
* You have :doc:`/start`, where you can see an example of a Quick Start with running Airflow
  locally, which you can use to start Airflow quickly for local testing and development.
  However, this is just for inspiration. Do not expect :doc:`/start` to be ready for a production installation;
  you need to build your own production-ready deployment if you follow this approach.
**Where to ask for help**

* The ``#user-troubleshooting`` channel on Airflow Slack for quick general
  troubleshooting questions. The `GitHub discussions <https://github.com/apache/airflow/discussions>`__
  if you are looking for a longer discussion and have more information to share.
* The ``#user-best-practices`` channel on Slack can be used to ask for and share best
  practices on using and deploying Airflow.
* If you can describe a reproducible problem with the Airflow software, you can open an issue at `GitHub issues <https://github.com/apache/airflow/issues>`__.

Using Production Docker Images
''''''''''''''''''''''''''''''
More details: :doc:`docker-stack:index`
**When this option works best**
This installation method is useful when you are familiar with the Container/Docker stack. It provides the capability of running Airflow components in isolation from other software running on the same physical or virtual machines, with easy maintenance of dependencies.
The images are built by Apache Airflow release managers and use officially released packages from PyPI and official constraint files - the same ones that are used for installing Airflow from PyPI.
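For instance, the reference images are published on Docker Hub as ``apache/airflow``. A minimal sketch of pulling a released image and confirming it works (the tag is an example; pick the release you need):

.. code-block:: bash

    # Pull a released reference image (pick the tag matching the release you need)
    docker pull apache/airflow:3.0.2

    # Run a one-off command in a throwaway container to confirm the image works
    docker run --rm apache/airflow:3.0.2 airflow version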
**Intended users**
**What are you expected to handle**

* You are expected to put together a deployment built of several containers (for example using
  ``docker-compose``) and to make sure that they are linked together.
* You are responsible for setting up the database, creating and managing the database schema with ``airflow db`` commands,
  automated startup and recovery, maintenance, cleanup and upgrades of Airflow and the Airflow Providers.

**What Apache Airflow Community provides for that method**
* You have the instructions in :doc:`docker-stack:build` on how to build and customize your image.
* You have :doc:`/howto/docker-compose/index`, where you can see an example of a Quick Start which
  you can use to start Airflow quickly for local testing and development. However, this is just for inspiration.
  Do not expect to use this ``docker-compose.yml`` file for a production installation; you need to get familiar
  with Docker Compose and its capabilities and build your own production-ready deployment with it if
  you choose Docker Compose for your deployment. A sketch of that Quick Start flow follows below.
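As a concrete illustration, the Quick Start boils down to fetching the community-provided compose file and bootstrapping the environment. A sketch, assuming Docker Compose v2 and Airflow 3.0.2 (substitute the version in the URL for the release you want, and treat this as local testing only):

.. code-block:: bash

    # Fetch the community-provided example compose file (for local testing only)
    curl -LfO 'https://airflow.apache.org/docs/apache-airflow/3.0.2/docker-compose.yaml'

    # Initialize the database and create the first user account
    docker compose up airflow-init

    # Start all services
    docker compose up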
**Where to ask for help**

* The ``#production-docker-image`` channel in Airflow Slack.
* The ``#user-troubleshooting`` channel on Airflow Slack for quick general
  troubleshooting questions. The `GitHub discussions <https://github.com/apache/airflow/discussions>`__
  if you are looking for a longer discussion and have more information to share.
* The ``#user-best-practices`` channel on Slack can be used to ask for and share best
  practices on using and deploying Airflow.
* If you can describe a reproducible problem with the Airflow software, you can open an issue at `GitHub issues <https://github.com/apache/airflow/issues>`__.

Using Official Airflow Helm Chart
'''''''''''''''''''''''''''''''''
More details: :doc:`helm-chart:index`
**When this option works best**

**Intended users**

**What are you expected to handle**

**What Apache Airflow Community provides for that method**
* You have the instructions in :doc:`docker-stack:build` on how to build and customize your image.
* :doc:`helm-chart:index` - full documentation on how to configure and install the Helm Chart (a minimal installation sketch follows below).
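As an illustration, a default installation of the official chart typically looks like the sketch below (release name and namespace are yours to choose; consult the chart documentation before using it in production):

.. code-block:: bash

    # Add the official Apache Airflow Helm repository
    helm repo add apache-airflow https://airflow.apache.org
    helm repo update

    # Install (or upgrade) the chart into its own namespace with default values
    helm upgrade --install airflow apache-airflow/airflow \
      --namespace airflow --create-namespace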
**Where to ask for help**

* The ``#production-docker-image`` channel in Airflow Slack.
* The ``#helm-chart-official`` channel in Slack.
* The ``#user-troubleshooting`` channel on Airflow Slack for quick general
  troubleshooting questions. The `GitHub discussions <https://github.com/apache/airflow/discussions>`__
  if you are looking for a longer discussion and have more information to share.
* The ``#user-best-practices`` channel on Slack can be used to ask for and share best
  practices on using and deploying Airflow.
* If you can describe a reproducible problem with the Airflow software, you can open an issue at `GitHub issues <https://github.com/apache/airflow/issues>`__.

Using Managed Airflow Services
''''''''''''''''''''''''''''''
Follow the `Ecosystem <https://airflow.apache.org/ecosystem/>`__ page to find all Managed Services for Airflow.
**When this option works best**

**Intended users**

**What are you expected to handle**

**What Apache Airflow Community provides for that method**

**Where to ask for help**
Using 3rd-party images, charts, deployments
'''''''''''''''''''''''''''''''''''''''''''
Follow the `Ecosystem <https://airflow.apache.org/ecosystem/>`__ page to find all 3rd-party deployment options.
**When this option works best**

**Intended users**

**What are you expected to handle**

**What Apache Airflow Community provides for that method**

**Where to ask for help**
Notes about minimum requirements
''''''''''''''''''''''''''''''''
There are often questions about the minimum requirements for running Airflow in production, but it is not possible to give a simple answer to that question.
The requirements that Airflow might need depend on many factors, including (but not limited to) the characteristics of your Dags and deployment. These Dag characteristics will change over time, and may even vary with the time of day or week, so you have to be prepared to continuously monitor the system and adjust the parameters to make it work smoothly.
While we can provide some specific minimum requirements for a development "quick start" - such as
in the case of our :ref:`running-airflow-in-docker` quick-start guide - it is not possible to provide any minimum
requirements for production systems.
The best way to think of resource allocation for an Airflow instance is to think of it in terms of process control theory, where there are two types of systems:

1. Fully predictable systems, with few knobs and variables, where you can reliably set the values for the knobs and have an easy way to determine the behaviour of the system.
2. Complex systems with multiple variables, which are hard to predict and where you need to monitor the system and adjust the knobs continuously to make sure the system is running smoothly.
Airflow (like any modern system that usually runs on cloud services, with multiple layers responsible for resources as well as multiple parameters to control their behaviour) is a complex system, and it falls much more into the second category. If you decide to run Airflow in production on your own, you should be prepared for the monitor/observe/adjust feedback loop required to keep the system running smoothly.
Having a good monitoring system that allows you to observe the system and adjust the parameters is a must to put that into practice.
There are also a few guidelines you can use to optimize your resource usage. The
:ref:`fine-tuning-scheduler` section is a good starting point for fine-tuning your scheduler; you can also follow
the :ref:`best_practice` guide to make sure you are using Airflow in the most efficient way.
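For example, most of the knobs referenced in those guides are ordinary Airflow configuration options, which can be set in ``airflow.cfg`` or through the ``AIRFLOW__{SECTION}__{KEY}`` environment-variable convention. A sketch with two illustrative options (the values are arbitrary examples, not recommendations):

.. code-block:: bash

    # Maximum number of task instances that can run concurrently per scheduler
    export AIRFLOW__CORE__PARALLELISM=32

    # How often (in seconds) the scheduler heartbeats to signal it is alive
    export AIRFLOW__SCHEDULER__SCHEDULER_HEARTBEAT_SEC=5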
Also, one of the important things that Managed Services for Airflow provide is that they make a lot of opinionated choices and fine-tune the system for you, so you don't have to worry about it as much. With such managed services there are usually far fewer knobs to turn and choices to make, and part of what you pay for is that the Managed Service provider manages the system for you, provides paid support, and lets you scale the system as needed and allocate the right resources, following the choices made there when it comes to the kinds of deployment you might have.