Installing Apache Airflow on Linux or Mac
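First things first: Apache Airflow needs Python 3 to be installed. Each Airflow release supports a specific range of Python versions, so check that yours (for example 3.12) is supported by the Airflow version you plan to install.
We will also use the pip / pip3 command to install Airflow.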
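Before going further, it can help to confirm what is actually installed. A minimal check, assuming python3 is on your PATH:

```shell
# Print the Python version that python3 resolves to.
python3 -c 'import sys; print("Python", ".".join(map(str, sys.version_info[:3])))'

# Check that pip is available for that interpreter; print a hint if it is not.
python3 -m pip --version || echo "pip is not available for this python3"
```

If the second command prints the hint, install pip through your system's package manager before continuing.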
Initial setup
- sudo pip3 install virtualenv (install virtualenv so that we can create different environments and Python installations don't clash)
- mkdir airflow_workspace (make a separate directory for airflow related work)
- cd airflow_workspace
- virtualenv airflow_env (create the virtual environment)
- source airflow_env/bin/activate (activate the virtual environment so that we can use it; your prompt will change to show the environment name)
- pip3 install apache-airflow (install Airflow with the pip or pip3 installer)
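A plain pip3 install pulls the newest release of Airflow and of every dependency, which can occasionally produce an incompatible mix. The Airflow project publishes constraints files for each Airflow and Python version pair, and pinning against one tends to be more reproducible. The sketch below only builds and prints such a command; the version number 2.10.5 is an assumption chosen for illustration, not something this guide requires.

```shell
# Assumption: 2.10.5 is only an example version; substitute the release you want.
AIRFLOW_VERSION="2.10.5"

# Detect the major.minor version of the active interpreter, e.g. 3.12.
PYTHON_VERSION="$(python3 -c 'import sys; print(f"{sys.version_info.major}.{sys.version_info.minor}")')"

# Constraints files are published per Airflow version and Python version.
CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"

# Print the full command so it can be reviewed before running it.
echo "pip3 install \"apache-airflow==${AIRFLOW_VERSION}\" --constraint \"${CONSTRAINT_URL}\""
```

Run it with the virtual environment activated, then copy the printed command if the versions look right. If no constraints file exists for your Python version, pip will report an error when fetching the URL, which usually means that Airflow release does not support that Python version.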
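Initialize and set up the basics (create the first admin user to access the UI, etc.)
- airflow db init (initialize the database; this basic setup uses SQLite, which ships with Python and needs no separate installation. On Airflow 2.7 and later this command is deprecated in favor of airflow db migrate)
- mkdir dags (create a directory for storing DAGs, the directed acyclic graphs that define the workflows)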
- airflow users create --username admin --password your_password --firstname your_first_name --lastname your_last_name --role Admin --email your_email@some.com (add an Admin user to access the workflow UI)
- airflow users list (list users and confirm our admin user got created)
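One thing worth checking here: by default Airflow keeps its configuration and database under ~/airflow and looks for DAGs in the dags folder under AIRFLOW_HOME, so a dags directory created inside airflow_workspace is not picked up automatically. A minimal sketch, assuming you want everything to live in airflow_workspace, is to set AIRFLOW_HOME before running airflow db init:

```shell
# Assumption: this is run from inside airflow_workspace.
export AIRFLOW_HOME="$(pwd)"

# Airflow's default dags_folder is $AIRFLOW_HOME/dags.
mkdir -p "${AIRFLOW_HOME}/dags"

echo "AIRFLOW_HOME is ${AIRFLOW_HOME}"
```

The variable has to be exported in every terminal that runs an airflow command, including the one used for the webserver. Running airflow config get-value core dags_folder should print the folder Airflow actually uses.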
Start the Airflow scheduler
- airflow scheduler (start the scheduler in the same terminal; it keeps running in the foreground)
Start the Airflow webserver (UI) in a new terminal (after activating the virtual env)
- cd airflow_workspace
- source airflow_env/bin/activate
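- airflow webserver (start the UI; by default it listens on port 8080, so open http://localhost:8080 and log in with the admin user created earlier. On Airflow 3 this command was replaced by airflow api-server)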