Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Data Modeling with Snowflake

You're reading from   Data Modeling with Snowflake A practical guide to accelerating Snowflake development using universal data modeling techniques

Arrow left icon
Product type Paperback
Published in May 2023
Publisher Packt
ISBN-13 9781837634453
Length 324 pages
Edition 1st Edition
Languages
Arrow right icon
Author (1):
Arrow left icon
Serge Gershkovich Serge Gershkovich
Author Profile Icon Serge Gershkovich
Serge Gershkovich
Arrow right icon
View More author details
Toc

Table of Contents (24) Chapters Close

Preface 1. Part 1: Core Concepts in Data Modeling and Snowflake Architecture
2. Chapter 1: Unlocking the Power of Modeling FREE CHAPTER 3. Chapter 2: An Introduction to the Four Modeling Types 4. Chapter 3: Mastering Snowflake’s Architecture 5. Chapter 4: Mastering Snowflake Objects 6. Chapter 5: Speaking Modeling through Snowflake Objects 7. Chapter 6: Seeing Snowflake’s Architecture through Modeling Notation 8. Part 2: Applied Modeling from Idea to Deployment
9. Chapter 7: Putting Conceptual Modeling into Practice 10. Chapter 8: Putting Logical Modeling into Practice 11. Chapter 9: Database Normalization 12. Chapter 10: Database Naming and Structure 13. Chapter 11: Putting Physical Modeling into Practice 14. Part 3: Solving Real-World Problems with Transformational Modeling
15. Chapter 12: Putting Transformational Modeling into Practice 16. Chapter 13: Modeling Slowly Changing Dimensions 17. Chapter 14: Modeling Facts for Rapid Analysis 18. Chapter 15: Modeling Semi-Structured Data 19. Chapter 16: Modeling Hierarchies 20. Chapter 17: Scaling Data Models through Modern Techniques 21. Index 22. Other Books You May Enjoy Appendix

The benefits of database modeling

Tactics without strategy is the noise before defeat. (Sun Tzu)

For many people, database modeling brings to mind stale diagrams, arcane symbols, or extra work at the end of a project. Only a decade ago, fueled by the rise of distributed computing in the early 2000s—which popularized the concept of big data—the notion that modeling is dead gained notoriety. More precisely, it was thought that cheap and near-limitless computing power had made planning and designing a thing of the past. It was said that flexible semi-structured data formats and the ability to parse them on the fly—known as schema-on-read—had made modeling obsolete.

Eventually, operating and maintenance costs caught up with reality and revealed two great shortcomings of the schema-on-read approach. One is that no matter how data is structured, it must be functionally bound to the business that it helps support. In other words, semi-structured formats are neither a panacea nor an excuse to forgo the process of business validation. The second—and most important—is that a model is not simply the shape that data takes once uploaded to a database, but rather, the blueprint for business operations, without which it is impossible to build sustainable architectures.

Sustainable solutions require a long-term strategy to ensure their design matches the underlying business model. Without this, schema-on-read (discussed in Chapter 15, Modeling Semi-Structured Data), star schema (discussed in Chapter 17, Scaling Data Models through Modern Techniques), or any other schema are narrow-sighted tactics that lead nowhere. But done right, modeling makes developing database architectures more agile and helps the project evolve from the idea phase to implementation. At every stage of development, the model serves as a guide for supporting the conversations necessary to propel the design into the next phase and provide additional business context. Once implemented, the model becomes a living document that helps users understand, navigate, and evolve the system it helped create.

While every organization models in the technical sense—creating tables and transforming data—not everyone models strategically, end to end, in the broad sense of the word—thereby foregoing the long-term benefits. Some of these benefits include the following:

  • Consensus and visibility of the broader business model
  • More productive conversations with business teams
  • Better quality of requirements
  • Higher signal, lower noise in technical conversations
  • Cross-platform, cross-domain, and widely understood conventions
  • Big-picture visual overview of the business and its database footprint
  • Preliminary designs become implementation blueprints
  • Accelerating onboarding of new team members
  • Making data more accessible and unlocking self-service within organizations
  • Keeping the database landscape manageable at scale
  • Getting a handle on complex data pipelines

To demonstrate the difficulties of working without formal modeling, we can take a simple schema based on Snowflake’s shared TPC-H dataset (available in the shared database called SNOWFLAKE_SAMPLE_DATA), which, at first glance, looks like this:

Figure 1.2 – A list of tables in the Snowsight UI

Figure 1.2 – A list of tables in the Snowsight UI

While these tables have been modeled in the strict sense of the word and even contain data, we get very little information on what that data represents, how it relates to data in other tables, or where it fits in the broad context of business operations.

Intuition suggests that SALES_ORDER and CUSTOMER share a relationship, but this assertion needs to be tested. Even in this trivial example of only eight tables, it will take considerable time to thoroughly sift through the data to understand its context.

The irony is that many of the details we’re looking for are already baked into the design of the physical tables, having been modeled at some point in the past. We just can’t see them. Without a map, the terrain is lost from view.

Here is the same set of tables visualized through a modeling convention called an Entity-Relationship Diagram (ERD):

Figure 1.3 – A conceptual model using crow’s foot notation

Figure 1.3 – A conceptual model using crow’s foot notation

At a glance, the big picture comes into focus. Diagrams such as this one allow us to understand the business concepts behind the data and ensure they are aligned. Having a visual model also lets us zoom out from individual tables and understand the semantics of our business: What are the individual pieces involved and how do they interact? This global perspective gives everyone in the organization a means of finding and making sense of data assets without requiring a technical background—thus, business analysts or new hires can unlock the value of the information without any help from the data team.

As the organization grows, expanding in personnel and data assets, it will inevitably become too big for any person, or even a team of people, to coordinate. Here, organizations that have embraced data modeling will stand out from those that did not. Modeling can be the thing that helps organizations scale their data landscape, or it can be the technical debt that holds them back.

Yet, for all its benefits, modeling is not a cookie-cutter solution that guarantees success. There are many approaches to modeling and various modeling methodologies that are suited for different workloads. Throughout this book, we will tackle the fundamentals of modeling that will allow you to understand these differences and apply the best solution using a first-principles approach. First, we will begin by breaking down the two main database use cases and observing the role modeling plays in each of them.

You have been reading a chapter from
Data Modeling with Snowflake
Published in: May 2023
Publisher: Packt
ISBN-13: 9781837634453
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image