Data Engineering with Apache Spark, Delta Lake, and Lakehouse: Create scalable pipelines that ingest, curate, and aggregate complex data in a timely and secure way

Product type Paperback

Published in Oct 2021

Publisher Packt

ISBN-13 9781801077743

Length 480 pages

Edition 1st Edition

Languages

Python

Tools

Apache Spark

Concepts

Data Engineering

Author (1):

Manoj Kukreja

Table of Contents (17) Chapters

Preface

1. Section 1: Modern Data Engineering and Tools

2. Chapter 1: The Story of Data Engineering and Analytics

3. Chapter 2: Discovering Storage and Compute Data Lakes

4. Chapter 3: Data Engineering on Microsoft Azure

5. Section 2: Data Pipelines and Stages of Data Engineering

6. Chapter 4: Understanding Data Pipelines

7. Chapter 5: Data Collection Stage – The Bronze Layer

8. Chapter 6: Understanding Delta Lake

9. Chapter 7: Data Curation Stage – The Silver Layer

10. Chapter 8: Data Aggregation Stage – The Gold Layer

11. Section 3: Data Engineering Challenges and Effective Deployment Strategies

12. Chapter 9: Deploying and Monitoring Pipelines in Production

13. Chapter 10: Solving Data Engineering Challenges

14. Chapter 11: Infrastructure Provisioning

15. Chapter 12: Continuous Integration and Deployment (CI/CD) of Data Pipelines

16. Other Books You May Enjoy

Chapter 11: Infrastructure Provisioning

As the demand for data analytics grows, data engineers are becoming an expensive and hard-to-find commodity in the marketplace. Organizations that hire data engineers, in turn, are finding innovative ways to do more with less so that they can justify the high resource costs. As a recent trend, most of these organizations have started using automated infrastructure provisioning to streamline cloud deployments. Until recently, infrastructure provisioning work was typically handled by the DevOps group, but not anymore.

In the previous chapter, we talked about the dynamic nature of the data engineer's job profile. The modern data engineer needs to keep up with this latest trend by learning a few DevOps skills. This chapter and the next are designed to teach the most critical of them.

Important Note

In today's job market, a data engineer who knows DevOps is a lethal...