What this book covers
Chapter 1, From Data as a Byproduct to Data as a Product, shows how modularizing a data architecture with data products solves the recurring problems that make the architecture hard to evolve sustainably over time.
Chapter 2, Data Products, defines what a data product is, outlines its key characteristics, and explains the essential components that make it up, highlighting how each one contributes to its overall function and value.
Chapter 3, Data Product-Centered Architectures, explores the foundational principles of a data product-centered architecture, analyzing the key operational and organizational capabilities required to manage it. It also compares the data-as-a-product paradigm with other modern approaches, such as data meshes and data fabrics, to highlight their similarities and key differences.
Chapter 4, Identifying Data Products and Prioritizing Developments, explains how to identify and prioritize data products using a value-driven approach. It starts by identifying relevant business cases through domain-driven design and event storming, then shows how to define the data products needed to support those business cases.
Chapter 5, Designing and Implementing Data Products, explores the process of designing a data product based on identified requirements, starting with techniques for defining scope, interfaces, and ecosystem relationships. It then examines the core components of a data product, their development process, and how to describe them with machine-readable documents. Finally, it analyzes the data flow, focusing on components responsible for sourcing, processing, and serving data.
Chapter 6, Operating Data Products in Production, covers the entire lifecycle of a data product, from release to decommissioning. It introduces CI/CD methodologies, explores managing a data product in production with a focus on governance, observability, and access control, and discusses techniques for evolving and reusing data products in a distributed environment.
Chapter 7, Automating Data Product Lifecycle Management, explains how to speed up the adoption of a data product-centered paradigm by creating a self-serve platform to mobilize the entire data ecosystem. It covers the platform’s main features, how it improves the experience for developers, operators, and consumers, and the key factors in deciding whether to build it, buy it, or take a hybrid approach.
Chapter 8, Moving through the Adoption Journey, covers the adoption of the data-as-a-product paradigm. It outlines the key phases of the process, exploring objectives, challenges, and activities for each stage. Finally, it discusses how to create a flexible data strategy that evolves with each phase, building on previous learnings.
Chapter 9, Team Topologies and Data Ownership at Scale, explains how to design an organizational structure for managing data as a product. It introduces the team topologies framework, including team types and interaction modes, and explores how to organize teams for efficient data product delivery. Finally, it looks at how to integrate these teams into the organization and how to decide between a centralized and a decentralized data management model.
Chapter 10, Distributed Data Modeling, examines data modeling in a decentralized, data product-centered architecture. It defines data models and emphasizes intentionality in modeling, then turns to physical modeling techniques for distributed environments. Finally, it covers conceptual data modeling and its role in guiding the design and evolution of data products within a cohesive ecosystem.
Chapter 11, Building an AI-Ready Information Architecture, explores how to build an information architecture that maximizes the value of managed data, starting with developed data products. It covers how different planes of the information architecture add context to data and focuses especially on the knowledge plane, where shared conceptual models ensure semantic interoperability between data products. Finally, it explores how federated modeling teams can create and link conceptual models to physical data, forming an enterprise knowledge graph crucial for unlocking the potential of generative AI.
Chapter 12, Bringing It All Together, revisits key concepts from earlier chapters, tying them to the core beliefs about data management that inspired this book. It wraps up with practical advice for becoming a more successful data management practitioner.