Graphs, part 1

The concept of graphs

A graph in computer science consists of a set of nodes (also called vertices (that is the plural, the singular is "vertex")) and a set of edges (also called arcs) where any edge must join two nodes from the set of nodes. Sometimes graphs are directed, that is we consider the edge a-b to be a different thing from the edge b-a. One way of dealing with this is to represent all graphs as directed, but for an undirected one ensure that whenever there is an edge a-b in the set of edges there is always also the edge b-a.

For example, the graph below:

could be represented by the set of nodes

{a,b,c,d,e,f}

and the set of edges

{a-b,a-c,b-d,b-e,c-e,d-c,d-f,e-f}

However, there are more efficient ways of representing graphs than as sets of edges. The representation chosen may depend on what operations we wish to perform on a graph. There are many applications of graphs in computer science, and many algorithms for manipulating and finsing properties of graphs. A whole course could easily be given on graph theory (actually, one is in the Maths department), but we will only have time to consider a few simple aspects.

A graph representation in Java

One way of representing graphs is to use an adjacency list. An adjacent list consists of a collection of lists each of which is labelled with one of the nodes. The individual items in the list are those nodes to which the node which labels the list connects. So, for example, the above graph could be represented by the following structure:

where the variable adjList refers to the whole graph. As you can see, this structure consists of cells of two different sorts. One holds a node name, a reference to a cell of the same sort, and a reference to the cell of the other sort. The other just holds a node name and a referene to a cell of the same sort.

Here is some code that gives this structure:

import java.io.*;
import java.util.*;

class Graph1
{
 // Undirected graph, constructed from file, uses adjacency list

 private ACell adjList;

 public Graph1(BufferedReader file)
 throws IOException
 {
  String line = file.readLine();
  adjList=null;
  while(!line.equals(""))
     {
      NCell nlist,ptr=null;
      StringTokenizer toks = new StringTokenizer(line);
      String first = toks.nextToken();
      if(toks.hasMoreTokens())
         {
          nlist = new NCell(toks.nextToken());
          ptr=nlist;
         }
      else
         nlist=null;
      while(toks.hasMoreTokens())
         {
          ptr.next = new NCell(toks.nextToken());
          ptr = ptr.next;
         }
      adjList = new ACell(first,nlist,adjList);
      line = file.readLine();
     }
 }

 public Enumeration getLinks(String from)
 {
  return new NextNodes(from);
 }

 private static class NCell
 {
  String first;
  NCell next;
 
  NCell(String n)
  {
   first=n;
   next=null;
  }
 }

 private static class ACell
 {
  String from;
  NCell to;
  ACell next;
 
  ACell(String f,NCell t,ACell n)
  {
   from=f;
   to=t;
   next=n;
  }
 }

 private class NextNodes implements Enumeration
 {
  NCell nlist;

  NextNodes(String from)
  {
   ACell ptr = adjList;
   for(;ptr!=null&&!ptr.from.equals(from); ptr=ptr.next);
   if(ptr==null)
      nlist=null;
   else
      nlist=ptr.to;
  }

  public boolean hasMoreElements()
  {
   return nlist!=null;
  }

  public Object nextElement()
  {
   String n=nlist.first;
   nlist=nlist.next;
   return n;
  }
 }

}

The two types of cell are represented by the static inner classes ACell and NCell respectively. The technique we have used previously for lists and trees is used once again: the reference to the actual structure is given by a private field in the class.

In this case, the only public method provided by the graph class, apart from its constructor, is getLinks. This takes as its argument a string naming a node, and returns an Enumeration which can be used to obtain strings naming all the nodes to which the argument node is connected by an edge in the graph. The type of the Enumeration object returned by the method getLinks is given by an inner class which is not static because it has to make reference to the adjList field of the particular Graph1 object the method is called on.

So if the variable g of type Graph1 refers to a representation of the graph given above, we can create an enumeration which wil give us all the links to node b by the call g.getLinks("b"). Although the enumeration will be of type NextNodes as given inside class Graph1, that type cannot even be referred to outside that class. However, as it is a subtype of Enumeration, a variable of type Enumeration can be set to refer to it, as in:

Enumeration enum = g.getLinks("b");

After this, the first time nextElement is called on enum it will return "d", and the second time it is called it will return "e". However, as the return type of nextElement as given in its definition in Java's library is Object, it must be typecast to String. If nextElement is called a third time on enum it will cause an exception, but it should not be called until after hasMoreElements has been called on enum and returned true. It will return true until nextElement has been called twice on enum, then it will always return false.

We generally use enumerations in loops, so for example if n is a variable of type String, then the following would print the names of all nodes connected by an edge to the node named by n in a graph referred to by g:

Enumeration enum = g.getLinks(n);
while(enum.hasMoreElements())
   System.out.println(enum.nextElement());

For a more elaborate example, the code fragment below will print the names of all nodes reached from the node named by n by following exactly two edges (if a node can be reached by two edges in more than one different way, for example e from a in the graph above, it will be printed the number of times it can be reached):

Enumeration enum1 = g.getLinks(n);
while(enum1.hasMoreElements())
   {
    String m = (String)enum1.nextElement();
    Enumeration enum2 = g.getLinks(m);
    while(enum2.hasMoreElements())
       System.out.println(enum2.nextElement());
   }

The constructor for the type Graph1 in the code above reads from a BufferedReader which could be attached to a file. So if we have a collection of graphs we want to experiment with we can store them in files and reuse them. The information in the file is assumed to be stored in a way which resembles an adjacency list. Each line in the file consists of a node name followed by zero or more nodes names. No two lines start with the same node name. Each line represents the existence of a node given by the first node name in the line, and the edges leading from that node given by the following node names which are the nodes the edges lead to. So the line

x y z

indicates there is a node x which has two outgoing links, one to y and one to z. Since we are dealing with directed graphs, there is no assumption that there are also edges from y and z to x. The graph above would be represented by:

a b c
b d e
c e
d c f
e f
f

Note there is a final blank line which is used to indicate the end of the graph. This is tested for in the constructor where there is a loop which is exited when a blank line is read.

The code for the constructor is slightly more complicated than necessary in order for the enumeration to give the connecting nodes in the order they were in the file. So when constructing a list of NCell objects referred to by nlist, first of all an initial NCell for nlist to refer to is created, then a separate local variable, ptr, is set to refer to it:

     if(toks.hasMoreTokens())
        {
         nlist = new NCell(toks.nextToken());
         ptr=nlist;
        }
     else
        nlist=null;

Then remains pointing to the start of the list, and new cells are added to its end by manipulating ptr:

     while(toks.hasMoreTokens())
        {
         ptr.next = new NCell(toks.nextToken());
         ptr = ptr.next;
        }

So from the line

w x y z

the following is created:

The code could have been the simpler:

     list = null;
     while(toks.hasMoreTokens())
        nlist = new NCell(toks.nextToken(),nlist);

but this would have resulted in nlist for the line w x y z becoming:

Note too, in the form we have given the constructor for NCell always takes just one argument and returns an NCell object whose first field is set to this argument and whose next field is set to null:

  NCell(String n)
  {
   first=n;
   next=null;
  }

In the form that would lead to a "backwards" nlist, the constructor would have to take two arguments, one to set the first field and the other to set the next field:

  NCell(String n,NCell l)
  {
   first=n;
   next=l;
  }

The ordering of the ACell objects in the adjacency list, however, is not important, so this list is constructed in the simpler way and will end up "backwards" from the order in the file. When the method getLinks is called with a string as its argument, the list of ACell objects is gone through in order to find one whose from field is equal to the argument of the method. If one is found, the list of NCell objects referred to by the ACell objects to field is used as the basis of the Enumeration object returned.

Undirected graphs

As mentioned previously, an undirected graph can be created by ensuring for every edge x-y there is also an edge y-x. The following code constructor will read from a file as above, but treat it as an undirected graph:

 public Graph2(BufferedReader file)
 throws IOException
 {
  NCell ptr=null;
  String line = file.readLine();
  adjList=null;
  while(!line.equals(""))
     {
      NCell nlist;
      StringTokenizer toks = new StringTokenizer(line);
      String first = toks.nextToken();
      if(toks.hasMoreTokens())
         {
          nlist = new NCell(toks.nextToken(),null);
          ptr=nlist;
         }
      else
         nlist=null;
      while(toks.hasMoreTokens())
         {
          ptr.next = new NCell(toks.nextToken(),null);
          ptr = ptr.next;
         }
      adjList = new ACell(first,nlist,adjList);
      line = file.readLine();
     }

  ACell aptr1,aptr2;

  aptr1 = new ACell(adjList.from,adjList.to,null);
  aptr2=aptr1;
  for(ACell aptr3=adjList.next; aptr3!=null; aptr3=aptr3.next)
     {
      aptr2.next = new ACell(aptr3.from,aptr3.to,null);
      aptr2=aptr2.next;
     }

  for(; aptr1!=null; aptr1=aptr1.next)
     for(ptr=aptr1.to; ptr!=null; ptr=ptr.next)
        {
         for(aptr2=adjList;
            aptr2!=null&&!aptr2.from.equals(ptr.first);
            aptr2=aptr2.next);
         if(aptr2==null)
            {
             NCell nlist = new NCell(aptr1.from,null);
             aptr2 = new ACell(ptr.first,nlist,adjList);
            }
         else
            aptr2.to = new NCell(aptr1.from,aptr2.to);
        }
 }

The rest of the code for class Graph2 is as for Graph1. What happens here is that an adjacency list is created aas for Graph1 but then extra NCell objects are added to represent the edges going also in the reverse direction. So with the same file as before

a b c
b d e
c e
d c f
e f
f

first the same adjacency list as above is created. Then it is modified to give:

The code involves going through each of the NCell objects in the original adjacency list and adding a new NCell object representing the egde in the reverse direction. This is done in the final part of the constructor above. But before it is done, a separate chain of ACell objects is created referring to the same list of NCell objets as the original adjaceny list. The code:

  aptr1 = new ACell(adjList.from,adjList.to,null);
  aptr2=aptr1;
  for(ACell aptr3=adjList.next; aptr3!=null; aptr3=aptr3.next)
     {
      aptr2.next = new ACell(aptr3.from,aptr3.to,null);
      aptr2=aptr2.next;
     }

does this. This is because while you are changing the adjacency list, you need to go through something which carries on referring to it as it originally was, which is what aptr1 is set to do here. After the execution of the above loop, but before actually creating new NCell objects to represent the reverse direction edges, the situation is:

Halfway through the execution of the loop going through apr1 at the point where the reverse edges for edges leading from f, e and d have been added by adding new NCell objects, the situation is:

It can be seen here that the fact that the new NCell representing the reverse of the edge from d to d is not accessible from the ACell object in the aptr1 list means an additional d-c edge is not added.

Note this additional complexity of having to create a separate copy adjacency list and go through it would not be necessary if we insisted the original file always had the nodes in alphabetical order, and the edges represented as leading from the lowest named node alphabetically to the highest named.

It is not particularly important that you remember the exact details of what is happening here. The main thing is just to illustrate that even fairly simple operations involving cells and pointers can get quite complex, and you have to be careful of unexpected interactions, such as the one here involving the d-c edge.

Matthew Huntbach

Last modified: 8th November 2001