Apache Airflow Best Practices

You're reading from Apache Airflow Best Practices: A practical guide to orchestrating data workflow with Apache Airflow

Product type: Paperback
Published: Oct 2024
Publisher: Packt
ISBN-13: 9781805123750
Length: 188 pages
Edition: 1st Edition
Authors (3):
Dylan Storey
Dylan Intorf
Kendrick van Doorn

Table of Contents (20 chapters)

Preface
1. Part 1: Apache Airflow: History, What, and Why
2. Chapter 1: Getting Started with Airflow 2.0
3. Chapter 2: Core Airflow Concepts
4. Part 2: Airflow Basics
5. Chapter 3: Components of Airflow
6. Chapter 4: Basics of Airflow and DAG Authoring
7. Part 3: Common Use Cases
8. Chapter 5: Connecting to External Sources
9. Chapter 6: Extending Functionality with UI Plugins
10. Chapter 7: Writing and Distributing Custom Providers
11. Chapter 8: Orchestrating a Machine Learning Workflow
12. Chapter 9: Using Airflow as a Driving Service
13. Part 4: Scale with Your Deployed Instance
14. Chapter 10: Airflow Ops: Development and Deployment
15. Chapter 11: Airflow Ops Best Practices: Observation and Monitoring
16. Chapter 12: Multi-Tenancy in Airflow
17. Chapter 13: Migrating Airflow
18. Index
19. Other Books You May Enjoy

Component configuration

Every component in an Airflow deployment is shared by all of that deployment's users (and DAGs). Operational paradigms and conventions can isolate information and provide a semblance of security, but they likely cannot be strictly enforced as policy.

The Celery Executor

With the Celery Executor, each worker runs several tasks concurrently (as separate processes under Celery's default prefork pool), and each of them picks up work from the broker. By design, everything running on a worker shares that worker's CPU, memory, and local disk space, so neighboring workloads can collide with and clobber each other. To direct workloads to specific workers (or to isolate them), use the queueing mechanism. Start each worker with the name of the queue it should listen to (airflow celery worker -q queue_name), then assign a task to that queue by passing the queue name to its operator through the queue keyword argument.
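The routing rule described above can be illustrated with a small, Airflow-free sketch. It is a toy model, not Airflow's or Celery's implementation: the Worker and Task classes, the eligible_workers helper, and all worker, task, and queue names are hypothetical. It only assumes that a worker consumes from the queues it was started with and that a task without an explicit queue goes to a queue named "default" (Airflow's out-of-the-box default_queue setting).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Worker:
    """Hypothetical stand-in for one `airflow celery worker -q ...` process."""
    name: str
    queues: frozenset  # queue names the worker was started with


@dataclass(frozen=True)
class Task:
    """Hypothetical stand-in for an operator instance."""
    task_id: str
    queue: str = "default"  # mirrors the operator's `queue` keyword argument


def eligible_workers(task, workers):
    """Return the sorted names of workers listening on the task's queue."""
    return sorted(w.name for w in workers if task.queue in w.queues)


# Illustrative deployment: two dedicated workers and one listening on both queues.
workers = [
    Worker("worker-general", frozenset({"default"})),
    Worker("worker-etl", frozenset({"etl"})),
    Worker("worker-mixed", frozenset({"default", "etl"})),
]

tasks = [
    Task("send_report"),                  # no queue given, so "default"
    Task("load_warehouse", queue="etl"),  # pinned to the ETL workers
    Task("train_model", queue="gpu"),     # no worker listens on "gpu"
]

routing = {t.task_id: eligible_workers(t, workers) for t in tasks}

for task_id, names in routing.items():
    print(task_id, "->", names if names else "no worker is listening on this queue")
```

In this model the task sent to the "gpu" queue has no eligible worker, which mirrors a common misconfiguration: a task assigned to a queue that no worker was started with is never picked up. Note also that queues only decide which workers may run a task; tasks that land on the same worker still share its CPU, memory, and disk.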

Each worker...
