A short introduction to Airflow and a quick list of useful references.
Tested Configuration:
Linux: Ubuntu 18.04
Quickstart
I recommend the official documentation.
You may also find this tutorial a good complement: towardsdatascience
As a next step, have a look at the Airflow example pipeline or check out their “how to” guides.
Tip:
In production you will want to run the daemonized version of the airflow scheduler. It’s easy: simply run airflow scheduler -D
instead of airflow scheduler.
Change the SQLite default database for airflow
As a first step you will need to install some Airflow plugins. Airflow maintains a list of plugins here.
The official documentation is pretty good, but you may find these two resources useful for this step:
- about sql_alchemy_conn: medium
- don’t forget to change the executor to LocalExecutor
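As a rough illustration of the two settings mentioned above, here is a minimal Python sketch that builds a SQLAlchemy connection string and prints the matching environment-variable overrides. It assumes a PostgreSQL database reached through the psycopg2 driver; the user, password, host and database names are hypothetical placeholders. The `AIRFLOW__CORE__...` names follow Airflow's `AIRFLOW__{SECTION}__{KEY}` convention for overriding airflow.cfg. Newer Airflow releases keep `sql_alchemy_conn` in the `database` section instead of `core`, so check which one your version expects.

```python
from urllib.parse import quote_plus


def build_sql_alchemy_conn(user, password, host, port, database,
                           driver="postgresql+psycopg2"):
    """Return a SQLAlchemy-style URI with the user and password URL-escaped."""
    return (f"{driver}://{quote_plus(user)}:{quote_plus(password)}"
            f"@{host}:{port}/{database}")


def airflow_env_overrides(conn_uri, executor="LocalExecutor"):
    """Map the two settings to environment variables (section name is an assumption)."""
    return {
        "AIRFLOW__CORE__SQL_ALCHEMY_CONN": conn_uri,
        "AIRFLOW__CORE__EXECUTOR": executor,
    }


# Hypothetical credentials, for illustration only.
uri = build_sql_alchemy_conn("airflow_user", "p@ss/word", "localhost", 5432, "airflow_db")
for key, value in airflow_env_overrides(uri).items():
    print(f"export {key}='{value}'")
```

Escaping the password matters because characters such as @ or / would otherwise be read as part of the URI structure. The same two values can be written directly into airflow.cfg instead of being exported.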
Troubleshooting
Bugs with airflow scheduler
- During the launch of airflow scheduler, if you encounter a bug like “Cannot allocate memory”: I suggest answer 2 (swap) as a first measure to test: stackoverflow. Note, however, that in production it will slow things down a lot.
- During the launch of airflow scheduler, if nothing happens (when you open your webserver, it says the scheduler isn’t running): go to your airflow folder and remove airflow-scheduler.pid, as explained here.
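The PID file clean-up can also be scripted. The sketch below is one possible approach rather than an official Airflow tool: it deletes the file only when the PID stored in it no longer belongs to a running process, so a healthy scheduler is left alone. It assumes a POSIX system such as the Ubuntu setup above, and the path in the usage line assumes the default airflow folder, ~/airflow.

```python
import os
from pathlib import Path


def pid_is_running(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def remove_stale_pid_file(pid_file):
    """Delete pid_file if it is stale. Return True if the file was removed."""
    path = Path(pid_file)
    if not path.exists():
        return False
    try:
        pid = int(path.read_text().strip())
    except ValueError:
        pid = None  # unreadable content: treat the file as stale
    if pid is not None and pid > 0 and pid_is_running(pid):
        return False  # a live process owns this PID, keep the file
    path.unlink()
    return True
```

A typical call would be remove_stale_pid_file(os.path.expanduser("~/airflow/airflow-scheduler.pid")). One caveat: if the old PID has been reused by an unrelated process, the file is kept and has to be removed by hand.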
Airflow logs take a lot of space on my disk
If you decide to keep your logs on your server (you can store them remotely instead, on S3 for instance), you may run into a full disk. Airflow produces a lot of logs, and they can quickly fill your disk. One solution is to create DAGs that remove old logs. This issue was described and solved here. Note: you’ll find other interesting scripts for Airflow maintenance [here](https://github.com/teamclairvoyant/airflow-maintenance-dags/).
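To give an idea of what such a clean-up task does, here is a minimal sketch of the core logic in plain Python. It is not the script from the linked resources: the function name, the 30-day default and the dry_run switch are illustrative choices. The idea is to walk the log folder and delete files whose last modification is older than a cutoff.

```python
import os
import time


def remove_old_logs(log_dir, max_age_days=30, dry_run=True):
    """Find files under log_dir not modified within max_age_days.

    Returns the list of matching paths. With dry_run=True nothing is
    deleted, which makes it safe to preview what would be removed.
    """
    cutoff = time.time() - max_age_days * 86400  # seconds per day
    matched = []
    for root, _dirs, files in os.walk(log_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.getmtime(path) < cutoff:
                matched.append(path)
                if not dry_run:
                    os.remove(path)
    return matched
```

A function like this could be wrapped in a PythonOperator and scheduled daily, pointing log_dir at your Airflow log folder (the logs directory inside the airflow folder by default, which is an assumption to check against your configuration). Running it once with dry_run=True before enabling deletion is a cheap safety check.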
Reference
Official tutorial : airflow
“Cannot allocate memory” : stackoverflow -> allow swap
sql_alchemy_conn issue : medium
Another interesting quickstart : rosiehoyem
remove PID of airflow scheduler : stackoverflow
Further tips : airflow maintenance dags