Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Learning Neo4j 3.x

You're reading from   Learning Neo4j 3.x Effective data modeling, performance tuning and data visualization techniques in Neo4j

Arrow left icon
Product type Paperback
Published in Oct 2017
Publisher Packt
ISBN-13 9781786466143
Length 316 pages
Edition 2nd Edition
Languages
Tools
Arrow right icon
Authors (2):
Arrow left icon
Jerome Baton Jerome Baton
Author Profile Icon Jerome Baton
Jerome Baton
Rik Van Bruggen Rik Van Bruggen
Author Profile Icon Rik Van Bruggen
Rik Van Bruggen
Arrow right icon
View More author details
Toc

Table of Contents (17) Chapters Close

Preface 1. Graph Theory and Databases 2. Getting Started with Neo4j FREE CHAPTER 3. Modeling Data for Neo4j 4. Getting Started with Cypher 5. Awesome Procedures on Cypher - APOC 6. Extending Cypher 7. Query Performance Tuning 8. Importing Data into Neo4j 9. Going Spatial 10. Security 11. Visualizations for Neo4j 12. Data Refactoring with Neo4j 13. Clustering 14. Use Case Example - Recommendations 15. Use Case Example - Impact Analysis and Simulation 16. Tips and Tricks

Introducing Neo4j 3.x and a history of graphs

Many people have used the word graph at some point in their professional or personal lives. However, chances are that they did not use it in the way that we will be using it in this book. Most people--obviously not you, otherwise you probably would not have picked up this book--actually think about something very different when talking about a graph. They think about pie charts and bar charts. They think about graphics, not graphs.

In this book, we will be working with a completely different type of subject--the graphs that you might know from your math classes. I, for one, distinctly remember being taught the basics of discrete mathematics in one of my university classes, and I also remember finding it terribly complex and difficult to work with. Little did I know that my later professional career would use these techniques in a software context, let alone that I would be writing a book on this topic.

So, what are graphs? To explain this, I think it is useful to put a little historic context around the concept. Graphs are actually quite old as a concept. They were invented, or at least first described, in an academic paper by the well-known Swiss mathematician, Leonhard Euler. He was trying to solve an age-old problem that we now know as the Seven Bridges of Königsberg. The problem at hand was pretty simple to understand.

Königsberg was a beautiful medieval city in the Prussian Empire situated on the river Pregel. It is located between Poland and Lithuania in today's Russia. If you try to look it up on any modern-day map, you will most likely not find it as it is currently known as Kaliningrad. The Pregel not only cut Königsberg into left- and right-bank sides of the city, but it also created an island in the middle of the river, which was known as the Kneiphof. The result of this peculiar situation was a city that was cut into four parts (we will refer to them as A, B, C, and D), which were connected by seven bridges (labelled a, b, c, d, e, f, and g in the following diagram). This gives us the following situation:

  • The seven bridges are connected to the four different parts of the city
  • The essence of the problem that people were trying to solve was to take a tour of the city, visiting every one of its parts and crossing every single one of its bridges, without having to walk a single bridge or street twice

In the following diagram, you can see how Euler illustrated this problem in his original 1736 paper:

 Illustration of the problem as mentioned by Euler in his paper in 1736

Essentially, it was a pathfinding problem, like many others (for example, the knight's ride problem, or the traveling salesman problem). It does not seem like a very difficult assignment at all now, does it? However, at the time, people really struggled with it and were trying to figure it out for the longest time. It was not until Euler got involved and took a very different, mathematical approach to the problem that it got solved once and for all.

Euler did the following two things that I find really interesting:

  • First and foremost, he decided not to take the traditional brute force method to solve the problem (in this case, drawing a number of different route options on the map and trying to figure out--essentially by trial and error--if there was such a route through the city), but to do something different. He took a step back and took a different look at the problem by creating what I call an abstract version of the problem at hand, which is essentially a model of the problem domain that he was trying to work with. In his mind, at least, Euler must have realized that the citizens of Königsberg were focusing their attention on the wrong part of the problem--the streets. Euler quickly came to the conclusion that the streets of Königsberg did not really matter to find a solution to the problem. The only things that mattered for his pathfinding operation were the following:
    • The parts of the city
    • The bridges connecting the parts of the city

Now, all of a sudden, we seem to have a very different problem at hand, which can be accurately represented in what is often regarded as the world's first graph:



Simplifying Königsberg
  • Secondly, Euler solved the puzzle at hand by applying a mathematical algorithm on the model that he created. Euler's logic was simple--if I want to take a walk in the town of Königsberg, then I will have to do as follows:
    • I will have to start somewhere in any one of the four parts of the city
    • I will have to leave that part of the city; in other words, I will have to cross one of the bridges to go to another part of the city
    • I will then have to cross another five bridges, leaving and entering different parts of the city
    • Finally, I will end the walk through Königsberg in another part of the city

Therefore, Euler argues, the case must be that the first and last parts of the city have an odd number of bridges that connect them to other parts of the city (because you leave from the first part and you arrive at the last part of the city), but the other two parts of the city must have an even number of bridges connecting them to the first and last parts of the city, because you will arrive and leave from these parts of the city.

This number of bridges connecting the parts of the city has a very special meaning in the model that Euler created, the graph representation of the model. We call this the degree of the nodes in the graph. In order for there to be a path through Königsberg that only crossed every bridge once, Euler proved that all he had to do was to apply a very simple algorithm that would establish the degree (in other words, count the number of bridges) of every part of the city. This is shown in the following diagram:

Simplified town

This is how Euler solved the famous Seven Bridges of Königsberg problem. By proving that there was no part of the city that had an even number of bridges, he also proved that the required walk in the city could not be done. Adding one more bridge would immediately make it possible, but with the state of the city and its bridges at the time, there was no way one could take such Eulerian Walk of the city.

By doing so, Euler created the world's first graph. The concepts and techniques of his research, however, are universally applicable; in order to do such a walk on any graph, the graph must have zero or two vertices with odd degrees and all intermediate vertices must have even degree.

To summarize, a graph is nothing more than an abstract, mathematical representation of two or more entities, which are somehow connected or related to each other. Graphs model pairwise relations between objects. They are, therefore, always made up of the following components:

  • The nodes of the graph, usually representing the objects mentioned previously: In math, we usually refer to these structures as vertices; but for this book and in the context of graph databases such as Neo4j, we will always refer to vertices as nodes.
  • The links between the nodes of the graph: In math, we refer to these structures as edges, but again, for the purpose of this book, we will refer to these links as relationships.
  • The structure of how nodes and relationships are connected to each other makes a graph: Many important qualities, such as the number of edges connected to a node (what we referred to as degrees), can be assessed. Many other such indicators also exist.

Now that we have discussed graphs and understand a bit more about their nature and history, it's time to look at the discipline that was created on top of these concepts, often referred to as the graph theory.

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image