Introduction
In the previous chapter, you learned how to use Apache Spark to process large amounts of data in a pipeline architecture. In this chapter, we'll return to pipeline design and see how pipelines can serve as a powerful approach to system design.
Between 1998 and 2005, the US Navy spent over $1 billion on four separate attempts to implement an Enterprise Resource Planning (ERP) system based on SAP AG software. These efforts were regarded as failures, with almost no value to show for the money spent. The episode shows why proper designs and plans matter, and what can go wrong when the implementation of a system starts before a proper plan exists. Bad design and planning are so common in the software engineering industry that they have given rise to the much-quoted joke, "A few weeks of coding can save you hours of planning."
While it's often tempting to start building systems from the get-go, this approach makes it...