The Abstract Data Type List

Linked lists versus abstract lists

In previous notes, we introduced the idea of a linked list as a data structure that could implement an abstract data type representing sets.

An abstract data type is defined in terms of the operations that it provides. We saw large numbers of ways in which the abstract data type Set could be implemented. So long as the class provided us with the methods add, delete and member which worked according to the axioms, it was a correct implementation of the abstract idea of a set, and it did not matter what happened underneath.

The linked list data structure, however, was defined directly in terms of the class that provided it. We could directly access and manipulate the fields within the objects that made up a linked list. There was no distinction between what you could do with it and how it was represented in the computer.

Now we will look at an abstract data type list. It is important that you distinguish between a list as an abstract data type and a linked list, even though they apear to be very similar. In fact the abstract data type list is best implemented using the linked list data structure. As with sets represented by linked lists, lists represented by linked lists will be objects with a single field of type Cell where Cell objects are linked together to make linked lists. Also, as before, the class Cell will be an inner class within the class providing the abstract data type. But the abstract data type List will provide the public methods head, tail, cons, and isEmpty rather than the set methods add, delete and member.

Here is the code for the type List:

class List
{
 Cell llist;

 public static List empty()
 {
  return new List(null);
 }

 private List(Cell l)
 {
  llist=l;
 }

 public int head()
 {
  return llist.first;
 }

 public List tail()
 {
  return new List(llist.next);
 }

 public List cons(int n)
 {
  return new List(new Cell(n, llist));
 }

 public boolean isEmpty()
 {
  return llist==null;
 }

 private static class Cell
 {
  int first;
  Cell next;

  Cell(int f,Cell n)
  {
   first=f;
   next=n;
  }
 }
}
We give the field name as llist rather than list to make clear the distinction between it and the type List though this was not necessary as Java is a case sensitive language, meaning it treats a lower case letter as a different letter from the same letter in upper case, so list and List are two completely separate variable names. Note it is a Java convention that class names begin with an upper-case letter, and variable, method and field names with a lower-case letter.

The method head returns the first element of a list, the method tail returns the list of everything but the first element. These seem very similar to the first and next fields of Cell objects that form linked lists. Indeed, as you can see in the code, head just returns the value of the first field of the linked list representing the abstract list, while tail returns an abstract list whose internal representation is the value of the next field.

However, note that these methods only return values. You cannot have an assignment whose left-hand side is a method call. If we have a list l and an integer n, it makes sense to execute n=l.head(), but l.head()=n is not valid Java. If we had a linked list referred to by ll, however, we can have both n=ll.first and ll.first=n. So there is no code here which changes the first or next field of a Cell object representing a List object. This means that List objects are immutable.

There is no public constructor for List objects, but there are three methods that act as constructors, because they cause new List objects to be created which they return, and in fact themselves call the private constructor. The first of these is the method empty. When it is called, the call is not attached to any existing List object, so the method is declared as static, and is called attached to the class name, i.e. List.empty().

The second method which effectively acts as a constructor is cons. It is called attached to a List object with an integer (assuming we are dealing with lists of integers) as its argument, and returns an object representing the original list with the new integer added to the front. It is important to note that this does not change the List object that cons was called on, rather it creates a new List object while leaving the old one unchanged. It works therefore like the constructive operations on lists we saw before. Suppose we have two variables of type List called l1 and l2, and l1 has been set to represent the list [4,7,2]. Then calling the assignment l2=l1.cons(5) will cause l2 to represent the list [5,4,7,2], but l1 will remain representing [4,7,2]. The actual structures that will be set up here are:

The method tail also acts as a constructor because it constructs a new object whose list is the list of everything but the first element from the object it was called on. Again, it makes no actual changes to the list. For example, if we had a variable l3 of type List and executed the assignment l3=l1.tail() in the context of the above situation, we would get the situation:

Note that although we have slipped into labelling objects directly with the names of the variables referring to them, strictly we should remember the variables are a separate things from the objects. If after the above situation we were to execute l1=l3, we could say that "l1 and l3 are two names for the same object", which could be illustrated by:

but strictly what we have are two variables l1 and l3 containing references to the same object, which is more precisely illustrated by:

Changing what l1 refers to does not make any changes to what l2 refers to. Whereas with linked lists we could change l2 as a side-effect of directly changing the object l1 or l3 refers to, there is no way of doing such a thing in this representation. This makes it much safer to use, because we do not have to worry about unexpected side-effects caused by data we didn't realise was shared.

List axioms

The behaviour of lists can be well summarised through axioms. The following axioms complete describe the correct behaviour of lists:

1. List.empty().isEmpty == true
2. l.cons(i).isEmpty == false
3. l.cons(i).head() == i
4. l.cons(i).tail() == l

for any list l and list element i.

Although the natural implementation of lists is using linked lists, any other implementation of lists is a correct implementation so long as it is such that the above axioms always hold.

Using lists

To make practical use of lists, it is useful to add to the class List a method for giving the string equivalent of a list, which is called toString so that when an attempt is made to print a list or join a list to a string using +, the conversion will be done automatically, and also a method we shall call fromString which takes a string and gives the List object it represents. Here is code for these two methods:
public String toString()
 {
  if(llist==null)
     return "[]";
  else
     {
      String str="["+llist.first;
      for(Cell ptr=llist.next; ptr!=null; ptr=ptr.next)
         str+=","+ptr.first;
      return str+"]";
     }
 }

 public static List fromString(String str)
 {
  str=str.trim();
  if(str.charAt(0)!='['||str.charAt(str.length()-1)!=']')
     return empty();
  str = str.substring(1,str.length()-1);
  StringTokenizer toks = new StringTokenizer(str,",");
  try {
       return new List(make(toks));
      }
  catch(NumberFormatException e)
      {
       return empty();
      }
 }

 private static Cell make(StringTokenizer toks)
 {
  if(toks.hasMoreTokens())
     {
      String num = toks.nextToken().trim();
      return new Cell(Integer.parseInt(num),make(toks));
     }
  else
     return null;
 }
The method fromString makes use of a private supporting method called make as it breaks the string up using a string tokenizer. Since these methods make direct reference to the private Cell data structure of List objects, they have to be incorporated in the class List, they could not be put in some other class that uses List objects. On the whole, however, if you have a well-defined abstract data type you should never add methods to it to perform extra functions by manipulating its underlying data structure. Rather, you should write methods in other classes which make use only of the public methods of the abstract data type.

Let us consider an example of a class which gives us a couple of methods for joining lists together, together with a main method which allows these methods to be tested:

import java.io.*;

class UseLists1
{
 public static void main(String[] args)
 throws IOException
 {
  BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
  String str;
  List l1, l2, l3;
  System.out.println("Enter list 1: ");
  str = in.readLine();
  l1 = List.fromString(str);
  System.out.println("Enter list 2: ");
  str = in.readLine();
  l2 = List.fromString(str);
  l3 = append(l1,l2);
  System.out.println("Appending the first to the second gives: ");
  System.out.println(l3);
  l3 = revAppend(l1,l2);
  System.out.println("Reversed appending the first to the second gives: ");
  System.out.println(l3);
  System.out.println("The lists are still: ");
  System.out.println(l1);
  System.out.println(l2);
 }

 public static List append(List l1,List l2)
 {
  if(l1.isEmpty())
     return l2;
  else
     return append(l1.tail(),l2).cons(l1.head());
 }

 public static List revAppend(List l1,List l2)
 {
  while(!l1.isEmpty())
     {
      l2=l2.cons(l1.head());
      l1=l1.tail();
     }
  return l2;
 }
}
The method append takes two lists and returns the list obtained by joining them together, so if l1 represents [2,4,7] and l2 represents [5,2,9] then append(l1,l2) will represent [2,4,7,5,2,9].

As lists are a recursive data type, it often makes sense to write methods dealing with them using recursion. If we try to think of what we mean by appending two lists, we can think first of what it means to append one list to another when the first is the empty list. Clearly the result will be a list equal to the second list, so that accounts for the lines:

  if(l1.isEmpty())
     return l2;
in the code for append. Otherwise, the append of one list to another is the same as the append of the tail of the first list to the second list, with the head of the first list joined onto the result. That is what the lines:
  else
     return append(l1.tail(),l2).cons(l1.head());
give us. For example, the append of [2,4,7] to [5,2,9] is equal to 2 joined to the front of the append of [4,7] to [5,2,9]. If it is not quite clear that this is what is being done in the code above, we could break that code into separate statements instead of having one statement involving several method calls. The following is exactly equivalent to the above:
  else
     {
      List t = l1.tail();
      List r = append(t,l2);
      int h = l1.head();
      List c = r.cons(h);
      return c;
     }
in fact it's likely the Java compiler would compile both to the same Java byte code. You can see that if you took out the separate variable t above and put the call l1.tail() in its place, and then did the same for all the other variables introduced just to hold the result of one method call, you would end up with the complicated return statement above.

The method revAppend illustrates a phenomenon which is often seen with novice programmers attempting to program with lists but who aren't happy using recursion so use iteration instead. It goes through the first list by iteration joining elements of it to the second. So with our example, on the first time round the loop it will take the head off l1l2, making l1 represent [4,7] and l2 represent [2,5,2,9]. The next time round the loop it will set l1 to represent [7] and l2 to represent [4,2,5,2,9]. The time after that it sets l1 to represent [ ] and l2 to represent [7,4,2,5,2,9], and now l1 represents the empty list, the loop is exited and what l2 refers to is returned as the result of the method. As you can see, it has returned the result of joining the reverse of l1 to l2. If you use iteration with lists, you have to be careful not to end up reversing the result you want.

Although revAppend changes what variables l1 and l2 refer to, it only changes what its local variables l1 and l2 refer to. In the main method above, the variables l1 and l2 are initialised to refer to the same lists as the same-named variables in the main method. But when l1 and l2 are changed to refer to something else inside the revAppend method, this does not affect what the main method's l1 and l2 refer to. This is demonstrated by the main method printing its l1 and l2 objects after execution of the call to the revAppend method.


Matthew Huntbach

Last modified: 1 October 2001