Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon
Graph Data Modeling in Python
Graph Data Modeling in Python

Graph Data Modeling in Python: A practical guide to curating, analyzing, and modeling data with graphs

Arrow left icon
Profile Icon Gary Hutson Profile Icon Matt Jackson
Arrow right icon
₹800 per month
Full star icon Full star icon Full star icon Full star icon Half star icon 4.8 (6 Ratings)
Paperback Jun 2023 236 pages 1st Edition
eBook
₹799.99 ₹2680.99
Paperback
₹3351.99
Subscription
Free Trial
Renews at ₹800p/m
Arrow left icon
Profile Icon Gary Hutson Profile Icon Matt Jackson
Arrow right icon
₹800 per month
Full star icon Full star icon Full star icon Full star icon Half star icon 4.8 (6 Ratings)
Paperback Jun 2023 236 pages 1st Edition
eBook
₹799.99 ₹2680.99
Paperback
₹3351.99
Subscription
Free Trial
Renews at ₹800p/m
eBook
₹799.99 ₹2680.99
Paperback
₹3351.99
Subscription
Free Trial
Renews at ₹800p/m
:

What do you get with a Packt Subscription?

Free for first 7 days. ₹800 p/m after that. Cancel any time!
Product feature icon Unlimited ad-free access to the largest independent learning library in tech. Access this title and thousands more!
Product feature icon 50+ new titles added per month, including many first-to-market concepts and exclusive early access to books as they are being written.
Product feature icon Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
Product feature icon Thousands of reference materials covering every tech concept you need to stay up to date.
Subscribe now
View plans & pricing
Table of content icon View table of contents Preview book icon Preview Book

Graph Data Modeling in Python

Introducing Graphs in the Real World

Social network analysis, fraud detection, modeling the stability of systems (for example, rail and energy grids), and recommendation systems all rely on graphs as the lynchpin underpinning these types of networks. In each of these examples, the relationships between individual people, bank accounts, or other single units are fundamental to describe and model the data. As opposed to traditional data models, a graph is a perfect way to represent groups of interacting elements.

This chapter will serve as an introduction to why graphs are important and introduce you to the fundamentals of what makes up a graph network. Moreover, we will look at how to transition from traditional data storage strategies, such as relational databases (RDBs), to how you can use this knowledge to work with graph databases (GDBs). Throughout this book, we will be working with a popular graph database, namely Neo4j. This will be followed by an explanation of how graphs are utilized in the real world and then a gentle introduction to working with the main package workhorses, known as igraph and NetworkX, which are the best and most stable graph packages for graph data analysis and modeling.

In this chapter, we’re going to cover the following topics:

  • Why should you use graphs?
  • The fundamentals of nodes and edges and the properties of a graph
  • Comparing RDBs and GDBs
  • The use of graphs across various industries
  • Introduction to NetworkX and igraph

Technical requirements

We will be using the Jupyter Notebook to run our coding exercises. For this, you will require python>=3.8.0, along with the following packages, which will need to be installed in your environment with the pip install command:

  • networkx==2.8.8
  • igraph==0.9.8

All notebooks, along with the coding exercises, are available at the following GitHub link: https://github.com/PacktPublishing/Graph-Data-Modeling-in-Python/tree/main/CH01.

Why should you use graphs?

In modern, data-driven solutions and enterprises, graph data structures are becoming more and more common. This is because, in our modern, data-driven world, relationships between things are becoming as, if not more important, than the things themselves. In modern industries and enterprises, graphs are starting to become more common and powerful in understanding the relationships between entities. I would say that these relationships and how they are connected have become more important than the entities themselves. We will demonstrate examples of real-life graphs in our use cases in the following chapters with detailed instructions on how to build these networks and the core considerations you need to make for the graph design.

Graphs are fundamental to many systems we use every day. Each time you are online and receive a product recommendation, it is likely that a graph solution is powering this recommendation. This is why learning how to work with graph data and leveraging these types of networks is a fast-growing and key skill in data science.

Composite components of a graph

Networks are a tool to represent complex systems and the complex nature of the connections arising in today’s data. We have already referenced how graphs are powering some of the big powerhouse recommendation systems in action today.

Graph methods tend to fall into four different areas:

  • Movement: Movement is concerned with how things travel (move) through a network. These types of graphs are the drivers behind routing and GPS solutions and are utilized by the biggest players in finding the optimal path across a road network.
  • Influence: On social media, this area specifies who the known influencers are and how they propagate this influence across a network.
  • Groups and interactions: This area involves identifying groups and how actors in the network interact with each other. We will look at an example of how to apply community detection methods to find these communities through the node (the actor involved) and its connections (the edges). Don’t worry if you don’t know what these terms are for now; we will focus on these in the Fundamentals of nodes, edges, and the properties of a graph section.
  • Pattern detection: Pattern detection involves using a graph to find similarities in the network that can be explored. We must look at this from the actor’s (node’s) point of view and find similarities between that actor and other actors in the network. Here, actor is taken to mean person, author profile, and so on.

In this section, we have explained the core components of a graph by providing simple working definitions. In the following section, we will delve deeper into these fundamental elements, which make up every graph you will come across in the industry. We will look at nodes, edges, and the various properties of a graph.

The fundamentals of nodes and edges and the properties of a graph

Graphs, or networks, are particularly powerful data structures for modeling and describing relationships between things, whether these things are people, products, or machines. In a graph, those things that we coined earlier are represented by nodes (sometimes known as vertices). Nodes are connected by edges (sometimes referred to as relationships). In a network, an edge represents a relationship between two things, indicating that, somehow, they are linked.

The following sections will look at the structures and types of graphs. First, we will start with undirected graphs before moving on to directed graphs. After that, we will look at node properties, then delve into heterogeneous graphs, and end by looking at schema design considerations.

Undirected graphs

To illustrate, a simple example is that of a real-life social network. In the following example, Jeremy and Mark are each represented by a node. Jeremy is friends with Mark, and the friend relationship is represented by an edge connecting the two nodes. The following diagram shows an undirected graph:

Figure 1.1 – Two friend nodes are linked together with a single edge

Figure 1.1 – Two friend nodes are linked together with a single edge

However, not all social networks are the same, and in some online social media platforms, relationships between users of a social network may not be mutual.

For example, on Twitter, one user can follow another, but this doesn’t mean the inverse must be true. On Twitter, Jeremy may follow Mark, but Mark may not follow Jeremy.

Directed graphs

Here, a directional edge is used to show that Jeremy follows Mark, while the absence of an edge in the reverse direction shows that Mark does not follow Jeremy in return:

Figure 1.2 – Two friend nodes are linked together with a single edge

Figure 1.2 – Two friend nodes are linked together with a single edge

This type of graph is known as a directed graph. For reference, sometimes, undirected edges like those in Figure 1.1 are shown as bidirectional edges, pointing to both nodes. This bidirectional representation is equivalent to an undirected edge in most senses and represents a mutual relationship.

Importantly, when creating a data model with directional edges, naming relationships appropriately becomes important. In our Twitter example, if the edge representing the interaction between Mark and Jeremy is follows, then the edge goes from the Jeremy node to the Mark node.

On the other hand, if the edge represents a concept such as followed by, then this should be in the other direction – that is, from Mark to Jeremy. This has particularly strong implications for some more complex graph modeling and use cases, which we will cover in Chapter 2, Working with Graph Data Models.

Node properties

While nodes and edges (directional or not) are the basic building blocks of a graph, they are often not sufficient to fully describe a dataset. Nodes can have data associated with them that may not be relational, so it would not be expressed as a relationship with another node.

In these cases, to represent data associated with nodes, we can use node properties (sometimes known as node attributes). Similarly, where an edge has additional information associated with it, in addition to representing a relationship, edge properties can be used to hold that data.

The following diagram shows a black line, indicating that Jeremy follows Mark but that Mark does not follow Jeremy – therefore, the black line indicates directionality:

Figure 1.3 – Two friend nodes are linked together with a single edge

Figure 1.3 – Two friend nodes are linked together with a single edge

In the preceding model, node properties are used to describe the number of followers Mark and Jeremy have, as well as the locations listed in their Twitter bios. This kind of additional node information is particularly important for querying graph data when asking questions that involve filtering.

In our Twitter example, properties would need to be present in the graph if, for example, we wanted to know who followed users with above 1,000 followers. We will revisit answering graphical questions using nodes, edges, and properties in later chapters.

Depending on the dataset, there may be cases where different nodes have different sets of properties. In this case, it is common to have several distinct types of nodes in the same graph.

Heterogeneous graphs

Node types can also be referred to as layers, or nodes with different labels, though for this book, they will be known simply as types.

The following diagram shows the nodes representing Jeremy and Mark as people, where each node type has different properties, and there are multiple relationship types. Due to this, we can term these multiple relationships as heterogenous:

Figure 1.4 – Example of a heterogenous Twitter graph

Figure 1.4 – Example of a heterogenous Twitter graph

Now, we have added nodes representing Mark and Jeremy as people, relationships representing their relationship outside of Twitter, and their ownership of their respective accounts. Note that since we have increased the number of node types, we also need new edge types to refer to the different interactions between different types of nodes.

Graphs with multiple node types are known as heterogeneous, multilayer, or multilevel graphs, though going forward we will use the term heterogeneous to refer to graphs with multiple types of nodes. In contrast, graphs with only one node type, as in the previous examples, are referred to as homogeneous graphs.

Schema design

At this point, it is reasonable to ask the question: What features of a dataset should be nodes, edges, and properties?

For any given dataset, there are multiple ways to represent data as a graph, and each is more suited to different purposes. Herein lies the trick to good graph modeling: a question or use case-driven schema design.

If we were particularly interested in the locations of Twitter users in our network, then we could move the location node property on the Twitter user nodes to create the LOCATED_IN relationship type. This is shown in the following diagram:

Figure 1.5 – The same graph but with the location property moved from a node property to a node type

Figure 1.5 – The same graph but with the location property moved from a node property to a node type

If we were particularly interested in the locations of Twitter users in our network, then we could move the location node property on the Twitter user nodes to a separate node type and create the LOCATED_IN relationship type. We could even go one step further to represent the information we know about these locations, adding the country related to each location as a separate, abstracted node.

This graph structure models the same data in a different way, which may be more or less suitable or performant for particular use cases. We will explore the effects of schema design on the types of questions that can be asked, and performance, in later chapters.

In the next section, we will compare how graph data structures differ from traditional RDBs. This will expand on why GDBs can be more performant when modeled as a graph data problem.

Comparing RDBs and GDBs

RDBs have been a standard for data storage and data analysis across most industries for a very long time. Their strength lies in being able to hold multiple tables of different information, where some table fields are found across tables, enabling data linkage.

With this data linkage, complex questions can be asked of data in an RDB. However, there are drawbacks to this relational structure. While RDBs are useful for returning a large number of rows that meet particular criteria, they are not suited to questions involving many chained relationships.

To illustrate this, consider a standard database containing train services and their station stops, alongside a graph that might represent the same information:

Figure 1.6 – Relational data structure of trains and their stops

Figure 1.6 – Relational data structure of trains and their stops

In an RDB structure, it would not be difficult to retrieve all trains that service a particular stop. On the other hand, it may be a slow operation that returns a series of trains that can be taken between two chosen stations.

Consider the steps needed in a traditional RDB to find the route between Truro and Glasgow Central in the preceding table. Starting at Truro, we would know the GW1426 train service stops at Truro, Liskeard, and Plymouth. Knowing that these stations can be reached from Truro, we would then need to find what train services stop at each of these stations to find our route.

Upon finding that Plymouth station is reachable and that a separate service runs to many more stations, we would need to repeat this process over and over until Glasgow Central is reached.

These steps essentially result in a series of computationally costly join operations on the same table, where one resulting row would give us the path between our stations of interest.

GDBs to the rescue

Using a graph structure to represent this train network puts greater emphasis on relationships between our data points, as illustrated in the following diagram:

\

Figure 1.7 – Graph data structure of trains and their stops

Figure 1.7 – Graph data structure of trains and their stops

Using a graph structure to represent this train network puts greater emphasis on relationships between our data points. Starting from Truro station, as in the RDB example, we find the train that services that station. However, when traversing the graph to find a possible route between Truro and Glasgow Central, at each station or train node we are considering fewer data points, and therefore fewer options.

This is in contrast to the RDB example, where repeated table joins are necessary to return a path. In this case, the complexity of the operations required over the graph representation is lower, which equates to a faster, more efficient method. Among many other use cases, those that require some sort of pathfinding often benefit from a graph data model.

In addition to being more suitable for specific types of queries, graphs are typically useful where a flexible, evolving data model is needed. Again, using the example of the train network, imagine that, as the database administrator, you have received a request to add bus transport links to the data model.

With an RDB, a new table would be required, since several bus services would likely serve each train station. In this new table, the names of each station would need to be duplicated from the existing table, to list alongside their related bus services.

Not only does this duplication increase the size of data stored, but it also increases the complexity of the database schema:

Figure 1.8 – Adding a new data type (buses) to the train station graph

Figure 1.8 – Adding a new data type (buses) to the train station graph

Where the train station data is represented with a graph, the new information on buses can be added directly to the existing database as a new node type.

There is no requirement for a new table, and no need to duplicate each station node to represent the required information; the existing train nodes can be directly linked to new Bus nodes. This holds for any new data type that would require the addition of a new table in a traditional RDB.

In a graph, where new data could be represented in an equivalent RDB as a new column in an existing table, this may be a good candidate for a node property, as opposed to a new node type.

Here, an example suitable for being represented as a node property would be a code for each train station, where stations and their codes have a 1-to-1 relationship.

A comparison, in short, is captured in the following:

  • RDBs have a rigid data format and a new table must be added for a new type of data. GDBs are more flexible when it comes to the format of the data and can be extended with new node types.
  • RDBs can be queried via path-based queries – for example, how many steps between two people in a friend network, which involves multiple joins and can be extremely slow as the paths become longer. GDBs query paths directly, with no join operations, so information retrieval is more streamlined and quite frankly faster.

In summary, where the use case for a database concerns querying many relationships between objects – that is, paths – or when a flexible data schema is needed, a graph data model is likely to be a good fit to represent your data.

The use of graphs across various industries

Graph data science is prevalent across a wide array of industries.

The main areas where graphs are being used effectively are as follows:

  • Finance: To look at fraud detection and portfolio risk.
  • Government: To aid with intelligence profiling and supply chain analytics.
  • Life sciences: For looking at patient journeys through a hospital (the transition of a patient through various services), drug response rates, and the transition of infections through a population.
  • Network & IT: Security networks and user access management (nodes on a network represent each user logging into a network).
  • Telecoms: Through network optimization and churn prediction.
  • Marketing: Mainly for customer and market segmentation purposes.
  • Social media analysis: We work for a company that specializes in platform moderation, online harm protection, and brand defense. By creating graphs to defend against attacks on brands, we can find vulnerable people or moderate the most severe type of content.

In terms of graphs in industry, they are pervasive due to the reasons we have already explored in this chapter. The ability to quickly link nodes and edges, and create relationships between them, is the reason why more problems in data science are being modeled as graphs or network science problems. In addition, the underlying data can be queried at a rapid rate. This can be done instead of using traditional database solutions, which, as we have already identified, are slow to query compared to GDBs.

Following this, in the next section, we will introduce the main two driving packages for graph analytics and modeling. We will show you the basic usage of the packages. In the subsequent chapters, we will keep building on why these packages are powerful.

Introduction to NetworkX and igraph

In this chapter, we will introduce two Python packages for creating in-memory graphs: NetworkX and igraph.

NetworkX lets you create graphs, perform graph manipulation, study and visualize their structures, and perform several graph manipulation functions when working with graphs. Their website (https://networkx.org/) contains details of the major changes to the package and the intended usage of the tool.

igraph contains a suite of useful and practical analysis tools, with the aim being to make these efficient and easy to use, in a reproducible way. What is great about igraph is that it is open source and free, plus it supports networks to be built in R, Python, Mathematica, and C/C++. This is our recommended package for creating large networks that can load much more quickly than NetworkX. To read more about igraph, go to https://igraph.org/.

In the following subsections, we will look at the basics of both NetworkX and igraph, with easy-to-follow coding steps. This is the first time you are going to get your hands dirty with graph data modeling.

NetworkX basics

NetworkX is one of the originally available graph libraries for Python and is particularly focused on being user-friendly and Pythonic in its style. It also natively includes methods for calculating some classic network analysis measures:

  1. To import NetworkX into Python, use the following command:
    import networkx as nx
  2. And to create an empty graph, g, use the following command:
    g = nx.Graph()
  3. Now, we need to add nodes to our graph, which can be done using methods of the Graph object belonging to g. There are multiple ways to do this, with the simplest being adding one node at a time:
    g.add_node(Jeremy)
  4. Alternatively, multiple nodes can be added to the graph at once, like so:
    g.add_nodes_from([Mark, Jeremy])
  5. Properties can be added to nodes during creation by passing a node and dictionary tuple to Graph.add_nodes_from:
    g.add_nodes_from([(Mark, {followers: 2100}), (Jeremy, {followers: 130})])
  6. To add an edge to the graph, we can use the Graph.add_edge method, and reference the nodes already present in the graph:
    g.add_edge(Jeremy, Mark)

It is worth noting that, in NetworkX, when adding an edge, any nodes specified as part of that edge not already in the graph will be added implicitly.

  1. To confirm that our graph now contains nodes and edges, we may want to plot it, using matplotlib and networkx.draw(). The with_labels parameter adds the names of the nodes to the plot:
    import matplotlib.pyplot as plt
    nx.draw(g, with_labels=True)
    plt.show()

This section showed you how you can get up and running with NetworkX in a couple of lines of Python code. In the next section, we will turn our focus to the popular igraph package, which allows us to perform calculations over larger datasets much quicker than using the popular NetworkX.

igraph basics

NetworkX, while user-friendly, suffers from slow speeds when using larger graphs. This is due to its implementation behind the scenes and because it is written in Python, with some C, C++, and FORTRAN.

In contrast, igraph is implemented in pure C, giving the library an advantage when working with large graphs and complex network algorithms. While not as immediately accessible as NetworkX for beginners, igraph is a useful tool to have under your belt when code efficiency is paramount.

Initially, working with igraph is very similar to working with NetworkX. Let’s take a look:

  1. To import igraph into Python, use the following command:
    import igraph as ig
  2. And to create an empty graph, g, use the following command:
    g = ig.Graph()

In contrast to NetworkX, in igraph, all nodes have a prescribed internal integer ID. The first node that’s added has an ID of 0, with all subsequent nodes assigned increasing integer IDs.

  1. Similar to NetworkX, changes can be made to a graph by using the methods of a Graph object. Nodes can be added to the graph with the Graph.add_vertices method (note that a vertex is another way to refer to a node). Two nodes can be added to the graph with the following code:
    g.add_vertices(2)
  2. This will add nodes 0 and 1 to the graph. To name them, we have to assign properties to the nodes. We can do this by accessing the vertices of the Graph object. Similar to how you would access elements of a list, each node’s properties can be accessed by using the following notation. Here, we are setting the name and followers attributes of nodes 0 and 1:
    g.vs[0][name] = Jeremy
    g.vs[1][name] = Mark
    g.vs[0][followers] = 130
    g.vs[1][followers] = 2100
  3. Node properties can also be added listwise, where the first list element corresponds to node ID 0, the second to node ID 1, and so on. The following two lines are equivalent to the four lines shown in step 4:
    g.vs["name"] = [Jeremy, Mark]
    g.vs[followers] = [130, 2100]
  4. To add an edge, we can use the Graph.add_edges() method:
    g.add_edges([(0, 1)])

Here, we are only adding one edge, but additional edges can be added to the list parameter required by add_edges. As with NetworkX, if edges are added for nodes that are not currently in the graph, nodes will be created implicitly. However, since igraph requires nodes to have sequential IDs, attempting to add the edge pair (1, 3) to a graph with two vertices will fail.

Summary

In this chapter, we looked at why you should start to think graph, from the benefits of why these methods are becoming the most widely utilized and discussed approaches in various industries. We looked at what a graph is and explained the various types of graphs, such as graphs that are concerned with how things move through a network, to influence graphs (who is influencing who on social media), to graph methods to identify groups and interactions, and how graphs can be utilized to detect patterns in a network.

Moving on from that, we examined the fundamentals of what makes up a graph. Here, we looked at the fundamental elements of nodes, edges, and properties and delved into the difference between an undirected and directed graph. Additionally, we examined the properties of nodes, looked at heterogeneous graphs, and examined the types of schema contained within a graph.

This led to how GDBs compare to legacy RDBs and why, in many cases, graphs are much easier and faster to transverse and query, with examples of how graphs can be utilized to optimize the stops on a train journey and how this can be extended, with ease, to add bus stops as well, as a new data source.

Following this, we looked at how graphs are being deployed across various industries and some use cases for why graphs are important in those industries, such as fraud detection in the finance sector, intelligence profiling in the government sector, patient journeys in hospitals, churn across networks, and customer segmentation in marketing. Graphs truly are becoming ubiquitous across various industries.

We wrapped up this chapter by providing an introduction to the powerhouses of graph analytics and network analysis – igraph and NetworkX. We showed you how, in a few lines of Python code, you can easily start to populate a graph.

In the next chapter, we will look at how to work with and create graph data models. The next chapter will contain many more hands-on examples of how to structure your data using graph data models in Python.

Left arrow icon Right arrow icon
Download code icon Download Code

Key benefits

  • Transform relational data models into graph data model while learning key applications along the way
  • Discover common challenges in graph modeling and analysis, and learn how to overcome them
  • Practice real-world use cases of community detection, knowledge graph, and recommendation network

Description

Graphs have become increasingly integral to powering the products and services we use in our daily lives, driving social media, online shopping recommendations, and even fraud detection. With this book, you’ll see how a good graph data model can help enhance efficiency and unlock hidden insights through complex network analysis. Graph Data Modeling in Python will guide you through designing, implementing, and harnessing a variety of graph data models using the popular open source Python libraries NetworkX and igraph. Following practical use cases and examples, you’ll find out how to design optimal graph models capable of supporting a wide range of queries and features. Moreover, you’ll seamlessly transition from traditional relational databases and tabular data to the dynamic world of graph data structures that allow powerful, path-based analyses. As well as learning how to manage a persistent graph database using Neo4j, you’ll also get to grips with adapting your network model to evolving data requirements. By the end of this book, you’ll be able to transform tabular data into powerful graph data models. In essence, you’ll build your knowledge from beginner to advanced-level practitioner in no time.

Who is this book for?

If you are a data analyst or database developer interested in learning graph databases and how to curate and extract data from them, this is the book for you. It is also beneficial for data scientists and Python developers looking to get started with graph data modeling. Although knowledge of Python is assumed, no prior experience in graph data modeling theory and techniques is required.

What you will learn

  • Design graph data models and master schema design best practices
  • Work with the NetworkX and igraph frameworks in Python Store, query, ingest, and refactor graph data
  • Store your graphs in memory with Neo4j
  • Build and work with projections and put them into practice
  • Refactor schemas and learn tactics for managing an evolved graph data model

Product Details

Country selected
Publication date, Length, Edition, Language, ISBN-13
Publication date : Jun 30, 2023
Length: 236 pages
Edition : 1st
Language : English
ISBN-13 : 9781804618035
Category :
Languages :
Concepts :
Tools :
:

What do you get with a Packt Subscription?

Free for first 7 days. ₹800 p/m after that. Cancel any time!
Product feature icon Unlimited ad-free access to the largest independent learning library in tech. Access this title and thousands more!
Product feature icon 50+ new titles added per month, including many first-to-market concepts and exclusive early access to books as they are being written.
Product feature icon Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
Product feature icon Thousands of reference materials covering every tech concept you need to stay up to date.
Subscribe now
View plans & pricing

Product Details

Publication date : Jun 30, 2023
Length: 236 pages
Edition : 1st
Language : English
ISBN-13 : 9781804618035
Category :
Languages :
Concepts :
Tools :

Packt Subscriptions

See our plans and pricing
Modal Close icon
₹800 billed monthly
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Simple pricing, no contract
₹4500 billed annually
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just ₹400 each
Feature tick icon Exclusive print discounts
₹5000 billed in 18 months
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just ₹400 each
Feature tick icon Exclusive print discounts

Frequently bought together


Stars icon
Total 9,160.97 10,054.97 894.00 saved
Causal Inference and Discovery in Python
₹2084.99 ₹2978.99
Graph Data Modeling in Python
₹3351.99
Hands-On Graph Neural Networks Using Python
₹3723.99
Total 9,160.97 10,054.97 894.00 saved Stars icon
Banner background image

Table of Contents

15 Chapters
Part 1: Getting Started with Graph Data Modeling Chevron down icon Chevron up icon
Chapter 1: Introducing Graphs in the Real World Chevron down icon Chevron up icon
Chapter 2: Working with Graph Data Models Chevron down icon Chevron up icon
Part 2: Making the Graph Transition Chevron down icon Chevron up icon
Chapter 3: Data Model Transformation – Relational to Graph Databases Chevron down icon Chevron up icon
Chapter 4: Building a Knowledge Graph Chevron down icon Chevron up icon
Part 3: Storing and Productionizing Graphs Chevron down icon Chevron up icon
Chapter 5: Working with Graph Databases Chevron down icon Chevron up icon
Chapter 6: Pipeline Development Chevron down icon Chevron up icon
Chapter 7: Refactoring and Evolving Schemas Chevron down icon Chevron up icon
Part 4: Graphing Like a Pro Chevron down icon Chevron up icon
Chapter 8: Perfect Projections Chevron down icon Chevron up icon
Chapter 9: Common Errors and Debugging Chevron down icon Chevron up icon
Index Chevron down icon Chevron up icon
Other Books You May Enjoy Chevron down icon Chevron up icon

Customer reviews

Top Reviews
Rating distribution
Full star icon Full star icon Full star icon Full star icon Half star icon 4.8
(6 Ratings)
5 star 83.3%
4 star 16.7%
3 star 0%
2 star 0%
1 star 0%
Filter icon Filter
Top Reviews

Filter reviews by




Jacob Silva Mar 07, 2024
Full star icon Full star icon Full star icon Full star icon Full star icon 5
Great book! It quickly teaches a beginner about graph data with easy-to-understand content, hands-on exercises, and provided source code. Then, it touches on advanced topics like projections on graph data. I am happy we picked this book for our Tech Book Club
Amazon Verified review Amazon
Amanda Fetch Sep 26, 2023
Full star icon Full star icon Full star icon Full star icon Full star icon 5
As someone who is currently working with a knowledge graph based solution within an organization, it is imperative that I find solid resources to cover both the basics and more advanced concepts of graph data modeling. "Graph Data Modeling in Python" starts with the basics of graph structures and their real-world applications, then the book systematically covers practical use-cases from making recommendations using Python to designing complex pipelines with Neo4j and Cypher. Its in-depth coverage on transforming relational databases to graph databases, building knowledge graphs, and addressing common errors I found very useful. A perfect blend of theory and hands-on examples makes it an indispensable resource for anyone venturing into the world of graph databases. Highly recommended for both beginners and seasoned professionals!
Amazon Verified review Amazon
Jacob Vandergriff Sep 28, 2023
Full star icon Full star icon Full star icon Full star icon Full star icon 5
Graph Data Modeling in Python provides a great introduction for those familiar with python and relational databases.The flow from relational databases, to python graph modeling, to scaling up to Neo4j databases was exceptional and can get any beginner to a graph developer.
Amazon Verified review Amazon
Om S Jul 05, 2023
Full star icon Full star icon Full star icon Full star icon Full star icon 5
The book begins by highlighting the growing importance of graphs in our daily lives, driving various applications such as social media, recommendation systems, and fraud detection. Readers are introduced to the benefits of graph data models, which enhance efficiency and uncover hidden insights through complex network analysis.With a focus on practical use cases, readers learn how to design optimal graph models capable of supporting a wide range of queries and features. The transition from traditional relational databases to dynamic graph data structures is seamlessly addressed, empowering readers to unlock the power of path-based analyses. The book also covers working with Neo4j for persistent graph database management.By the end of the book, readers will possess the skills to transform tabular data into powerful graph data models, ranging from beginner to advanced-level proficiency. They will have a deep understanding of schema design best practices, proficiency in NetworkX and igraph frameworks, and the ability to store, query, ingest, and refactor graph data."Graph Data Modeling in Python: A Practical Guide to Curating, Analyzing, and Modeling Data with Graphs" is a must-read for those interested in graph databases and data modeling. The book caters to individuals with prior knowledge of Python, while also being accessible to beginners in graph data modeling. Its practical approach, real-world examples, and coverage of common errors and debugging make it an essential companion for those looking to extract valuable insights from graph data.
Amazon Verified review Amazon
SEAN W. GRANT Aug 18, 2023
Full star icon Full star icon Full star icon Full star icon Full star icon 5
This was a good book to read that helped in understanding how to build a graph from various sources (rdbms tables, NLP outputs). It showed how you can use some common python libraries (NLTK, Spacy) to process some data and create a graph from it. There were a couple chapters that I especially liked: Chapter 4 and 7.Chapter 4- Building Knowledge Graphs: Author does a great job of explaining what a knowledge graph is, how to design one and then walk through how to build one. Knowledge graphs are becoming a popular topic and structure that businesses are using to provide better understanding and this chapter will help you understand how you can start looking at your data to build a knowledge graph.Chapter 7-Refactoring and Evolving Schemas: I found this to be a great explanation of power of the graph database and how it can adjust fast to your needs without causing a problem with live data. The hands-on examples were great at showing how easy it is to change the schema in the Neo4j database.Overall this was a great book for getting your hands dirty on designing, understand and building graphs that will help you think about how connected data really is.
Amazon Verified review Amazon
Get free access to Packt library with over 7500+ books and video courses for 7 days!
Start Free Trial

FAQs

What is included in a Packt subscription? Chevron down icon Chevron up icon

A subscription provides you with full access to view all Packt and licnesed content online, this includes exclusive access to Early Access titles. Depending on the tier chosen you can also earn credits and discounts to use for owning content

How can I cancel my subscription? Chevron down icon Chevron up icon

To cancel your subscription with us simply go to the account page - found in the top right of the page or at https://subscription.packtpub.com/my-account/subscription - From here you will see the ‘cancel subscription’ button in the grey box with your subscription information in.

What are credits? Chevron down icon Chevron up icon

Credits can be earned from reading 40 section of any title within the payment cycle - a month starting from the day of subscription payment. You also earn a Credit every month if you subscribe to our annual or 18 month plans. Credits can be used to buy books DRM free, the same way that you would pay for a book. Your credits can be found in the subscription homepage - subscription.packtpub.com - clicking on ‘the my’ library dropdown and selecting ‘credits’.

What happens if an Early Access Course is cancelled? Chevron down icon Chevron up icon

Projects are rarely cancelled, but sometimes it's unavoidable. If an Early Access course is cancelled or excessively delayed, you can exchange your purchase for another course. For further details, please contact us here.

Where can I send feedback about an Early Access title? Chevron down icon Chevron up icon

If you have any feedback about the product you're reading, or Early Access in general, then please fill out a contact form here and we'll make sure the feedback gets to the right team. 

Can I download the code files for Early Access titles? Chevron down icon Chevron up icon

We try to ensure that all books in Early Access have code available to use, download, and fork on GitHub. This helps us be more agile in the development of the book, and helps keep the often changing code base of new versions and new technologies as up to date as possible. Unfortunately, however, there will be rare cases when it is not possible for us to have downloadable code samples available until publication.

When we publish the book, the code files will also be available to download from the Packt website.

How accurate is the publication date? Chevron down icon Chevron up icon

The publication date is as accurate as we can be at any point in the project. Unfortunately, delays can happen. Often those delays are out of our control, such as changes to the technology code base or delays in the tech release. We do our best to give you an accurate estimate of the publication date at any given time, and as more chapters are delivered, the more accurate the delivery date will become.

How will I know when new chapters are ready? Chevron down icon Chevron up icon

We'll let you know every time there has been an update to a course that you've bought in Early Access. You'll get an email to let you know there has been a new chapter, or a change to a previous chapter. The new chapters are automatically added to your account, so you can also check back there any time you're ready and download or read them online.

I am a Packt subscriber, do I get Early Access? Chevron down icon Chevron up icon

Yes, all Early Access content is fully available through your subscription. You will need to have a paid for or active trial subscription in order to access all titles.

How is Early Access delivered? Chevron down icon Chevron up icon

Early Access is currently only available as a PDF or through our online reader. As we make changes or add new chapters, the files in your Packt account will be updated so you can download them again or view them online immediately.

How do I buy Early Access content? Chevron down icon Chevron up icon

Early Access is a way of us getting our content to you quicker, but the method of buying the Early Access course is still the same. Just find the course you want to buy, go through the check-out steps, and you’ll get a confirmation email from us with information and a link to the relevant Early Access courses.

What is Early Access? Chevron down icon Chevron up icon

Keeping up to date with the latest technology is difficult; new versions, new frameworks, new techniques. This feature gives you a head-start to our content, as it's being created. With Early Access you'll receive each chapter as it's written, and get regular updates throughout the product's development, as well as the final course as soon as it's ready.We created Early Access as a means of giving you the information you need, as soon as it's available. As we go through the process of developing a course, 99% of it can be ready but we can't publish until that last 1% falls in to place. Early Access helps to unlock the potential of our content early, to help you start your learning when you need it most. You not only get access to every chapter as it's delivered, edited, and updated, but you'll also get the finalized, DRM-free product to download in any format you want when it's published. As a member of Packt, you'll also be eligible for our exclusive offers, including a free course every day, and discounts on new and popular titles.