Summary
In this chapter, we covered the basics of data orchestration and the problems companies and engineers face today. We also introduced Apache Airflow, the leading data orchestration and workflow management tool, and outlined what you can expect over the course of this book. Keep in mind that getting the most out of Apache Airflow requires several baseline tools and knowledge areas. Although each is needed for effective use, each is also a learnable topic that can be picked up quickly.
At the core of Airflow is Python code. To use Airflow effectively as a data engineer, you need a solid grasp of core Python concepts and of how that code orchestrates your stack of data tools. Taking the time to review these concepts and the use cases Airflow addresses will lead to scalable systems of code and opportunities for optimization.
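As a taste of the Python-driven orchestration that Airflow formalizes, here is a minimal sketch in plain Python (not Airflow's API; the task functions `extract`, `transform`, and `load` are invented for illustration) showing tasks run in dependency order:

```python
# A hypothetical extract -> transform -> load pipeline expressed as
# ordinary Python functions. Airflow's DAGs and operators build on this
# idea, adding scheduling, retries, and monitoring on top.

def extract():
    # Pretend we pulled rows from a source system.
    return [1, 2, 3]

def transform(rows):
    # A trivial transformation step.
    return [r * 10 for r in rows]

def load(rows):
    # Pretend we wrote the rows to a destination.
    return f"loaded {len(rows)} rows"

def run_pipeline():
    # Dependencies are expressed by the order of the calls:
    # extract must finish before transform, which must finish before load.
    return load(transform(extract()))

print(run_pipeline())
```

In Airflow, each of these functions would typically become a task, and the call order would be declared explicitly as dependencies between tasks, which the scheduler then enforces.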
In the next chapter, we will introduce the basics of DAGs and tasks...