Constructive Recursion with Linked Structures: more constructive set implementations

Constructive sets with linked lists

With implementations of sets using arrays, we contrasted destructive implementations where methods actually changed the data within the set object, with constructive implementations where methods did not cause any change to the objects they are called on but instead returned new objects representing the change. As we said then, if an object has no destructive methods that can be called on it, it is termed immutable.

Linked structures can be used to implement both destructive and constructive versions of sets. In the previous set of notes we considered the use of recursion in destructive manipulation of linked lists and linked tree structures, now we consider the use of recursion in constructive operations on lists and trees representing sets.

Here is a complete class which implements sets using unordered linked lists providing constructive methods add and delete:

class IntSet17
{
// Constructive Set class, 
// implemented using unordered linked lists

 private Cell list;

 public IntSet17()
 {
  list = null;
 }

 private IntSet17(Cell l)
 {
  list = l;
 }

 public IntSet17 delete(int n)
 {
  Cell list1 = delete(n,list);
  return new IntSet17(list1);
 }

 private static Cell delete(int n,Cell l)
 {
  if(l==null)
     return null;
  else if(l.first==n)
     return l.next;
  else
     return new Cell(l.first,delete(n,l.next));
 }

 public IntSet17 add(int n)
 {
  Cell ptr;
  for(ptr=list; ptr!=null&&ptr.first!=n; ptr=ptr.next) {}
  if(ptr==null)
     {
      Cell list1 = new Cell(n,list);
      return new IntSet17(list1);
     }
  else
     return this;
 }

 public boolean member(int n)
 {
  Cell ptr;
  for(ptr=list; ptr!=null&&ptr.first!=n; ptr=ptr.next) {}
  return (ptr!=null);
 }

 private static class Cell
 {
  int first;
  Cell next;

  Cell(int f,Cell n)
  {
   first=f;
   next=n;
  }
 }

 public String toString()
 {
  String str="{";
  if(list!=null)
     {
      str+=list.first;
      for(Cell ptr=list.next; ptr!=null; ptr=ptr.next)
          str+=","+ptr.first;
     }
  return str+"}";
 }

}

In this case, the only method that uses recursion is delete. There is nothing wrong with mixing recursive and iterative methods in a class, so long as they have the desired effect. Here the member and toString methods are exactly as we saw them with our first unordered linked list implementation of sets. We also have a nested class called Cell just as before.

To provide constructive methods, we use a private constructor which takes as its argument the internal representation of a set, and returns a new set object whose data structure inside is this representation. This is the same thing we did when we wanted to create a copy method for sets implemented with linked lists previously. It is also what we did with constructive sets implemented with arrays previously. There we had a private constructor with an array as its argument, while the public constructor had no arguments. Here we have a private constructor with a linked list as its argument, while the public constructor has no arguments. It is necessary to write this public no-argument constructor because the default no-argument constructor is only provided if there is no other constructor, which isn't the case here due to the private constructor.

As with the destructive implementation using linked lists, the method add here first tests whether the number being added is already in the list by running a pointer down the list using the loop:

for(ptr=list; ptr!=null&&ptr.first!=n; ptr=ptr.next) {}

There are two ways this loop can be exited, firstly the pointer becomes null or secondly the pointer points to a cell which contains the number being added. In the first case the number is not already in the list. The method then returns a new object with a list created by adding the new number to the front of the list of the old object. The two objects shared the data of the list of the original object, but this does not cause a problem because all the list methods are constructive. For example, if s1 represents the set {3,8,5}, the following diagram represents one way of storing it:

Then after executing:

s2 = s1.add(6)

we get the following situation:

The figure illustrates the fact that s1 and s2 each have their own field called list which is declared as private so cannot be accessed except within the class for s1 and s2. As you can see, although a new object is created for s2 it shares some of the cells in its list with the cells for the list of the object s1 refers to.

Suppose we now execute

s3 = s2.delete(8);

This results in a situation which can be represented as follows

The circle is used here to mean "the result of calling the static method delete", with 8 and the link out of the circle as its arguments. The deletion of 8 from the list [6,3,8,5] first checks that the list is not null then checks that the first field of its first cell is not 8. As it is not, it returns a new cell whose first field is 6 and whose next field is the result of deleting 8 from [3,8,5] giving:

And this recursive deletion gives a new cell whose first field is 3 and whose next field is the result of deleting 8 from the list [8,5], resulting in:

In this case, we have a call to delete where l.first is equal to n, so we have reached a base case, and the final situation is:

As you can see, the list for s3 is [6,3,5], the list for s2 is [6,3,8,5] and the list for s1 is [3,8,5]. So deleting 8 from the list to give the list for s3 did not affect the list for s1 or s2. The list for s2 was copied down to the point where the change was made. In fact there is no code here that can change an object's list once that object has been created, hence the objects are immutable. It is, however, possible to assign a variable which referred to one object to refer to another, for example, if we executed s1 = s3, then s1 would have the list [6,3,5]. That is why although for short we may say "the object s1" what we really mean is "the object referred to by s1", since making a variable refer to another object is not the same as changing an object, and in general a variable is a reference to an object and not the object itself. In the language C++, a distinction is made between variables which hold data and variables which refer to space to hold data, but in Java all object variables are of the second sort. Or, to put it another way, in Java all non-primitive variables are pointer variables.

Constructive sets with linked ordered trees

We can use a similar technique to that used above to get a constructive set implementation using ordered trees. Here is the complete code for it:

class IntSet18
{
// Constructive Set class, sets stored as ordered trees

 private Cell tree;

 public IntSet18()
 {
  tree=null;
 }

 private IntSet18(Cell t)
 {
  tree=t;
 }

 public IntSet18 delete(int n)
 {
  return new IntSet18(delete(n,tree));
 }

 private static Cell delete(int n,Cell t)
 {
  if(t==null)
     return null;
  else if(n<t.item)
     return new Cell(t.item,delete(n,t.left),t.right);
  else if(n>t.item)
     return new Cell(t.item,t.left,delete(n,t.right));
  else if(t.left==null)
     return t.right;
  else if(t.right==null)
     return t.left;
  else
     {
      int m=rightmost(t.left);
      Cell newleft=removeRightmost(t.left);
      return new Cell(m,newleft,t.right);
     }
 }

 private static int rightmost(Cell t)
 {
  if(t.right==null)
     return t.item;
  else
     return rightmost(t.right);
 }

 private static Cell removeRightmost(Cell t)
 {
  if(t.right==null)
     return t.left;
  else
     return new Cell(t.item,t.left,removeRightmost(t.right));
 }

 public IntSet18 add(int n)
 {
  return new IntSet18(add(n,tree));
 }

 private static Cell add(int n,Cell t)
 {
  if(t==null)
     return new Cell(n,null,null);
  else if(n==t.item)
     return t;
  else if(n<t.item)
     return new Cell(t.item,add(n,t.left),t.right);
  else
     return new Cell(t.item,t.left,add(n,t.right));
 }

 public boolean member(int n)
 {
  return member(n,tree);
 }

 private static boolean member(int n,Cell t)
 {
  if(t==null)
     return false;
  else if(t.item==n)
     return true;
  else if(n<t.item)
     return member(n,t.left);
  else
     return member(n,t.right);
 }

 public String toString()
 {
  return "{"+toString(tree)+"}";
 }

 private static String toString(Cell tree)
 {
  if(tree==null)
     return "";
  else if(tree.left==null&&tree.right==null)
     return ""+tree.item;
  else if(tree.left==null)
     return tree.item+","+toString(tree.right);
  else if(tree.right==null)
     return toString(tree.left)+","+tree.item;
  else
     return toString(tree.left)+","+tree.item+","+toString(tree.right);
 }

 private static class Cell
 {
  int item;
  Cell left,right;
  
  Cell(int n,Cell l,Cell r)
  {
   item=n;
   left=l;
   right=r;
  }
 }
}

If we consider the method add, we can see it uses a private method to return the tree representing the tree of the object the original add call was made on with the addition of the integer that was the argument of this call. This tree is then used to create a new set object. If the initial tree is null, the tree returned is the tree whose node value is the value being added and whose left and right subtrees are null. If the node value of the initial tree is equal to the value being added, then a reference to this tree is returned, since no change is made, this will result in the tree being shared. Otherwise a new tree is constructed whose node value is the node value of the original tree. If the item being added is less than this node value, then the left branch of the new tree is the tree resulting from adding the new value to the left branch of the original tree, but the right branch is a reference to the right branch of the original tree, since this can be shared as no change is made to it. A similar thing happens if the item being inserted is greater than the node value.

Let us consider an example of this. Suppose we execute

s2 = s1.add(9);

where s1 represents the set {1,3,4,5,7,8,10,14,16,20} with the following tree:

Then the new object created for s2 to refer to will have as its data structure the tree resulting from the call of the private static method add with arguments 9 and the tree from s1, which can be represented by:

As 9 is less than 10, the result is a new tree whose node value is 10, whose right branch is the right branch from the original tree, and whose left branch is the result of adding 9 to the left branch of the original tree, which can be shown by:

Now as 9 is greater than 5, the result of this recursive call to the static add method is a tree whose node value is 5, whose left branch is the original left branch of its argument tree, and whose right branch is the result of adding 9 to the right branch of its argument tree, which can be shown by:

And as 9 is greater than 8, the result of this next recursive call is a tree whose node value is 8, whose left branch is the left branch of the tree that was the call's argument and whose right branch is the result of adding 9 to the right branch of the tree taht was the call's argument:

But adding 9 to null just gives us the tree whose node value is 9 and whose left and right branches are null, in other words the leaf containing 9. So the final situation is:

As you can see, the only cells that were actually copied are those on the route to the point at which the new cell was added. All other cells are shared. But the tree in the object referred to by s1 is unchanged, and continues to represent the set {1,3,4,5,7,8,10,14,16,20}, while the tree in the object s2 represents the set {1,3,4,5,7,8,9,10,14,16,20}

The delete method works in a similar way, just copying those cells on the route to the point at which the tree is changed, making shared references to the others. In this case, the operations of finding the rightmost element of a tree and removing the rightmost element of a tree are split into two methods. The removeRightmost method has to be constructive rather than the previous destructive method. So it has to return a new tree which is the tree that would be given by removing the rightmost element of the arguemnt tree. If a tree has no right branch, then the tree which is the result of removing its rightmost element is its left branch. Otherwise the tree which is the result of removing the rightmost element of a tree passed as an argument is the tree whose node value is equal to the node value of the argument tree, whose left branch is the left branch of the argument tree, and whose right branch is the result of removing the rightmost element from the right branch of the argument tree, which can obviously be obtained by a recursive call of removeRightmost.

Matthew Huntbach

Last modified: 18 February 2003