© Armstrong Subero 2020Armstrong SuberoCodeless Data Structures and Algorithms https://doi.org/10.1007/978-1-4842-5725-8_5
Basse Terre, Moruga, Trinidad and Tobago
In the previous chapter, we looked at hashes, hashing, and hash tables. In this chapter, I will introduce an important mathematical concept that we will need to understand to comprehend the operation of later algorithms. That concept is graph theory. Graph theory can become involved and complex; however, for the sake of understanding, we will take the 20,000-foot view and avoid getting bogged down in the details. Graphs are essential to many parts of computer science, and once you understand how they work, you will be able to apply them to solve a wide array of problems. Mathematical gurus need not read this chapter, but for anyone else, it’s worthwhile to understand the basic concepts I present here.
Dimensions, Points, and Lines
Before we delve into graph theory, I think we should review three important mathematical concepts, that of the dimension, the point, and the line. Yes, I know that you already learned about them in high school; however, they form a solid base for us to build upon. While these seem elementary and unrelated to the topic at hand, they are essential to understanding graphs in computer science.
When discussing the point and the line, I think it’s best we do so from the perspective of a mathematical dimension. In mathematics, when we refer to a dimension, what we are referring to is the distance in a direction. This is something you should be vaguely familiar with as when we are talking about distance, this distance may be length, width, or height.
Common geometrical shapes have several dimensions with the most common having two or three dimensions. Two-dimensional shapes have the dimensions of length and height, and three-dimensional shapes have all three dimensions of length, width, and height. With this in mind, let’s move forward.
A point is a unique concept, because a point has no dimensions, only a position. Remember that we said a dimension is a distance in a direction. Well, you can think of a point as the starting position. To know where points are, we use a dot to represent them. Figure 5-1 shows a point.
While a point exists with no dimensions, at the first dimension we have a line. A line is a point that has simply been moved in one direction. Figure 5-2 shows a line.
When we have two lines meeting each other, we get what is called a vertex; the plural form of vertex is vertices. In Figure 5-3, the point where the two lines meet is the vertex.
While this knowledge comes from basic geometry, it has very real applications in DSA as it provides the foundation for us to understand more complex topics, as we will see in the next section.
Computers are all about connections; in fact, computer networking is all about connecting computers. From this basic concept of connecting things comes the graph. What a graph does is show the connections between nodes. Graphs are important because they provide a visual means for us to see the connections between objects.
Graphs are a kind of overlapping gray area where you can really see the connection between computer science and mathematics. This branch of mathematics is called graph theory , and it has a lot of applications in computer science.
Graphs vs. Trees
The best way to think about graphs is to compare them to something we already know about. In this case, since we already examined trees in an earlier chapter, we can think of the graph as a tree with no root; additionally, a graph has no nodes that can be identified as a parent or child. We can also think of it as a chaotic graph where each child node has multiple parents. Figure 5-4 shows a tree data structure.
Tree data structure
While the tree feels very structured, the graph feels more expressive and closer to a real-world representation of connections because unlike trees, which can represent multiple paths between only two nodes, graphs can give a better model of the representation of real world connections. In real life, living and nonliving objects have multiple interactions with many things, and graphs can more accurately represent and model such relationships.
Some people like to think about trees as a kind of minimalist graph. This is because in real-world development, most of the algorithms that will work on graphs will work on trees because trees are basically graphs without any cycles or cyclic interaction taking place.
More About Graphs
The nodes on a graph are sometimes called objects or vertices . The links connecting the vertices are known as edges . Vertices that are connected by an edge are said to be adjacent to another one.
These edges may be directed from one node to another; when this happens, we call this graph a directed graph . It is easy to identify directed graphs with the arrows on their edges.
An undirected graph is one in which there are no directed edges, and because there are no directed edges, you can traverse edges both ways between the nodes. Figure 5-5 shows an undirected graph.
An undirected graph
A directed graph is one in which the edges are directed, which is to say the edges connecting the vertices on the graph each have a direction associated with them. We call these edges directed edges, and sometimes we call the directed edges arrows. These directed graphs are also called digraphs . Figure 5-6 shows us a directed graph.
A directed graph
To get from one of the vertices of the graph to another, we follow what is known as a path along the edges of the graph. If we take a vertex of a graph and draw an edge from that vertex to itself, we get what is called a loop. In a loop, the initial and final vertices coincide.
When a graph exists such that it is a graph that exists within a larger graph, we call that graph a subgraph .
Basically, that’s all a graph is, a bunch of circles being connected by a bunch of lines. What makes graphs useful is that they allow us to see the relationship between objects (nodes). This property of graphs makes them useful for many things, and as we will see later, a lot of algorithms based on searching are dependent on graphs.
We have already established that graph theory is all about connecting a bunch of circles to a bunch of lines. The edges, as we know, are the links connecting the nodes on the graph.
Sometimes these edges have a weight. This means there is a number of values associated with each edge on the graph. When edges have a weight associated with them, we call this type of graph a weighted graph. Figure 5-7 shows us what a directed graph looks like.
A weighted graph
While the graph in Figure 5-7 is an undirected graph, it is important to note that weighted graphs may also be directed. Weighted graphs find a lot of applications in computer science, and many algorithms are dependent on weighted graphs.
Graphs and Social Networking Applications
While the theory presented in this chapter may seem dull and boring, there is a way to see the importance of graphs within real-world applications by looking at how graphs will operate within a program designed as a social networking site.
On a social networking site, someone will have a profile. This profile will contain personal information about someone, such as where they grew up, their age, pictures of them, and where they went to school.
We can think of this profile of a person as a node on a graph. Let’s look at an example person named John; John is a lone node, as shown in Figure 5-8.
If we had only John on our site, then it wouldn’t be a networking site, would it? It would be a web page about John. To keep the site from being all about John, each person on a social networking site will be friends with another person on the site. Let’s say John and Alex both went to the same elementary school and connected online and are now friends with each other on the application. They would them both be connected nodes on the graph, as in Figure 5-9.
John and Alex connection
Alex is friends with Sue, and Sue now joins the graph, as in Figure 5-10, as the circle of friends keeps growing.
Growing circle of friends
Now imagine Alex meets Jill at a business meeting, Jill and John meet at a dinner party, and all of them are now friends with each other on the application. Our graph of connections will now resemble something like in Figure 5-11.
The circle keeps growing.
In real-world social networking applications, variants of a simple graph just like this one can be used as the basis of the application.
The Graph Database
Another direct application of graphs and graph theory is in the design and mechanics of the graph database.
When storing data, we have a choice between using a file system or using a database management system. Storing data in a file system has its drawbacks as it is limited in operation, lacks uniformity in terms of format, leads to data duplication, and lacks security.
To fix the problems of a file system, the relational database management system (RDBMS) was created, which fixes the problems of the file system by allowing data to be stored in tables consisting of rows and columns. There is also a schema in the DBMS that describes the data being stored as well as the attributes of the table within the database.
What makes the relational database system useful is that there is a key system in place that allows tables to have relationships with one another as each table has a primary key, and relationships between tables are set using foreign keys.
A problem with relational databases arises when scaling the database. The computing time of some operations between connected tables can be very computing intensive. Graph databases fix many of the problems associated with the RDBMS by using a graph data structure for organization. There are many graph databases for different functions. Should you need more information on graph databases, I recommend the book Practical Neo4j by Gregory Jordan (Apress, 2014).
In this chapter, we took a brief look at graphs. We reviewed the basics of dimensions, points, and lines. We also discussed graphs and touched on weighted graphs. We briefly talked about two applications of graphs in social networking applications and databases. In the next chapter, we will delve into algorithms as we cover linear search and binary search.