ArrayList
which is provided as part of the Java library. As we saw, it works similarly
to an array, being a collection of components indexed by an integer
expression, where the component given by a particular index can be
accessed or replaced by another. Unlike arrays, objects of type
ArrayList
can have their number of components increased or
decreased destructively. The class ArrayList
gives a number
of operations which we might want to use on objects of this type.
What would we do if Java did not provide us with the class
ArrayList
? Suppose we wanted an indexable collection
which could increase or decrease in size. We could program our own
class which provides objects which behave in this way.
As we have seen previously, when building any code in Java it is always a good idea to have a specification first. That is, before we write the code we must know exactly what we want it to do in terms of how it interacts with any code that uses it, and then we must write the code to ensure it behaves in that manner.
To start off with, let us consider how we would write code for a class
of objects which behave just like arrays, ignoring for now the
additional requirement of flexibility in size which we have with
arrayLists. Let us call the class we are developing MyArrayList1
.
It will need to be a generic class, we have seen the use of these,
but not yet how to define one.
To declare a generic class, you follow the class header
with a list of type variables enclosed with angled brackets
<
and >
. So our class would take the form
class MyArrayList1 <T> { ... }where
T
is a type variable. Inside the class we can
use T
as a type name. When we actually create
MyArrayList1
objects, we will need to supply a base type
for each object we create. So MyArrayList1<String>
is the type "MyArrayList1
of strings",
MyArrayList1<Integer>
is the type
"MyArrayList1
of integers", and so on. The base type
replaces T
when the methods in the class are executed on
objects.
To make the objects behave just like arrays ee will need a constructor
which takes an integer expression as
an argument and returns a MyArrayList1
object
representing a collection of that length. So
MyArrayList1<Integer> a = new MyArrayList1<Integer>(n);is the equivalent of
Integer[] a = new Integer[n];Note this is different from Java's
ArrayList
type since,
as we discussed previously,
with this type there is no constructor which creates an arrayList
object already of a particular length with its elements initialised
to null
.
We will need a method which sets a component of a MyArrayList1
object indexed by a particular integer to a particular value. So
a.set(i,t);is the equivalent to
a[i] = t;We will need a method which gives a reference to the component of a
MyArrayList1
object indexed by a particular integer.
So a.get(i)
used in an expression is the
equivalent of a[i]
used in an expression (that is, anywhere
except in the assignment a[i]=...
).
We will need a method
which gives the number of items in the collection, so
a.size()
is the equivalent of a.length
.
The way these methods must work together is that once a
MyArrayList1
object has been created with
a = new MyArrayList1<T>(n)
,
any call b.size()
where b
refers to the
object which a
originally referred to (or it is a
and a
has not been reassigned to refer to another object)
must return the value n
.
Any call b.get(i)
, where b
refers to the object a
referred to the last time
a.set(i,t)
was executed, must return an alias to
the object referred to then by t
.
MyArrayList1
should work,
we need to write code that will work in this way. Inside each
MyArrayList1
object should be some structure which
is manipulated by the code for the methods described above so they
give the behaviour specified. The structure should be given by private
variables to ensure it cannot be manipulated in any way except
through the given public methods.
The private structure used to implement an array-like object could
be just an array. Since our class of objects MyArrayList1
does nothing more than arrays do, there is no advantage in using
it rather than just using arrays directly, but using an array to implement
an array-like object will be a useful starting point when we wish
to consider implementation of more flexible array-like objects.
Below is the code which you can find in the file
MyArrayList1.java
in the directory ~mmh/DCS128/code/myArrayLists
.
class MyArrayList1 <T> { private T[] array; public MyArrayList1(int n) { array = (T[]) new Object[n]; } public T get(int i) { return array[i]; } public void set(int i,T item) { array[i]=item; } public int size() { return array.length; } public String toString() { String str = "["; if(array.length>0) str=str+array[0]; for(int i=1; i<array.length; i++) str=str+","+array[i]; return str+"]"; } }In the file UseMyArrayLists1.java you will find code which makes use of
MyArrayList1
objects. You can see it is very similar to the code we saw
previously
to manipulate ArrayList
objects.
The statement
private T[] array;says that every
MyArrayList1
object has within it a
reference to an array object of base type T
. Note that
although we have used the variable name array
for this,
we could have used any name. The word array
has no
special meaning in Java code. Use of the word array
in
the methods in this class will refer, when the method is called, to
the array
variable of the MyArrayList1
object
the method is called on.
An additional method given in class
You can see that the methods MyArrayList1
is
toString
which is used to give a textual representation
of a MyArrayList1
object. It is the conventional
representation in which the components are listed in order of indexing,
separated by commas and enclosed in square brackets. Remember that in
Java, when something is "added" to a string using +
, its
own toString
method is used to get a string representation,
then the +
works to join the strings together. So
str=str+array[0]
will use whatever is the toString
method of the item in array[0]
to get a string and then
join it to str
which holds "["
. After that,
str=str+","+array[i]
will join the string representations
of the other objects in the array, with commas as separators, in each
case using the toString
method of the class of the object.
Because Java makes this automatic use of toString
, it is
a good idea whenever we write a class to add an appropriate
toString
method to it.
set
, get
and
size
are implemented by their array equivalents. The
code in the constructor needs some explaining. It would be good
if we could have array = new T[n]
to
create a new array of base type T
and length n
for the object variable array
to refer to. Unfortunately,
in the current version of Java one thing you can't use type variables
for where you can use actual types is in creating arrays. The slightly
more complex code given
array = (T[]) new Object[n]
achieves the same thing, you do not need to be concerned about
how it actually works. When you compile it, you will get a warning
about the use of "unchecked or unsafe operations". You can ignore this
warning message, so long as there are no actual errors the code will
still compile and produce the correct .class
file.
MyArrayList1
suffers from the same problem as the
direct use of arrays, that is that it does not have the flexibility
to change size. This is not so. The reason is that a variable
holding a reference to a MyArrayList1
object holds
a reference to a variable holding a reference to an array. The
difference between a reference to an array and a reference to
a reference is crucial.
We saw previously a method
which added an item to the end of the array. The previous code
was specific to arrays of integers, but we could generalise it
to make it work for arrays of any base type:
public static <T> T[] add(T[] a,T n)
{
T[] b = (T[]) new Object[a.length+1];
for(int i=0; i<a.length; i++)
b[i]=a[i];
b[a.length]=n;
return b;
}
Now if we have two variables of type Integer[]
,
array1
and array2
and the two variables are aliases,
that can be illustrated diagrammatically by:
where the numbers given in the array are just to illustrate one possibility.
Suppose we execute the statement array1=add(array1,7)
. That
will result in the situation:
No object has been changed, but a new array object has been created and
the variable array1
changed to refer to it. The variable
array2
continues to refer to what it referred to before.
Now suppose we have two variables of type
MyArrayList1<Integer>
, obj1
and obj2
and the two variables are aliases. That can be
illustrated diagrammatically by:
Suppose we changed what the variable called array
inside the MyArrayList1
object referred to by
executing array.add(array,7)
. The result would be:
So both obj1
and obj2
share in the
change because it is to an array they refer to indirectly, not
an array they refer to directly.
Using this, we can adapt MyArrayList1
to produce
a class we shall call MyArrayList2
, which can be
found in the file
MyArrayList2.java.
We have added a method called add
which works almost
identically to the method add
above, except
that instead of the array which is being changed given as an
argument, it is the array
variable of the method
the add
call is attached to. The method called add
given above is declared at static
, all the data it works on
is passed as arguments to it, and when it is called, the call is not attached to
any object. But the method called add
we give in our class
MyArrayList2
is not declared as static
because
when it is called the call has to be attached to an object and part of the
data it works with is the data in that object. We have also changed the
constructor for MyArrayList2
from the one given for
MyArrayList1
to make it like the constructor provided
for Java's built-in ArrayList
. The constructor for
MyArrayList2
creates an object representing a 0-length
flexible array, its size will be changed when the method add
is called on it to add new items. A 0-length flexible array is represented
internally by an actual array of length 0 (such a thing is permitted in
Java, and it is not the same thing as null
).
The method call
a.add(n)
where a
is a MyArrayList2
object works just like the method call a.add(n)
where
a
is an ArrayList
object, it causes
n
to be added to the end of the collection. Only
this time, because we know how MyArrayList2
is implemented,
we know what happens underneath. This is irrelevant to how the method
call is viewed outside, however. So far as the user of MyArrayList2
is concerned, it is just like ArrayList
when it comes to
the get
, set
, size
and
add
operations. Just to complete this, a second
add
method is given in MyArrayList2
,
which represents the operation we saw with ArrayList
objects of adding an item at a particular position. So whether
a
refers to an
ArrayList
or a MyArrayList2
object, the
call a.add(p,n)
causes item n
to be
added to the collection as position p
with higher indexed
items moved up one place.
Here is the complete code:
class MyArrayList2 <T>
{
private T[] array;
public MyArrayList2()
{
array = (T[]) new Object[0];
}
public T get(int i)
{
return array[i];
}
public void set(int i,T item)
{
array[i]=item;
}
public void add(T item)
{
T[] array1 = (T[]) new Object[array.length+1];
for(int i=0; i<array.length; i++)
array1[i]=array[i];
array1[array.length]=item;
array=array1;
}
public void add(int pos,T item)
{
T[] array1 = (T[]) new Object[array.length+1];
for(int i=0; i<pos; i++)
array1[i]=array[i];
array1[pos]=item;
for(int i=pos; i<array.length; i++)
array1[i+1]=array[i];
array=array1;
}
public int size()
{
return array.length;
}
public String toString()
{
String str = "[";
if(array.length>0)
str+=array[0];
for(int i=1; i<array.length; i++)
str+=","+array[i];
return str+"]";
}
}
You can see that the code for the second add
method,
the one that takes two arguments, a position and an item to be added at
that position, works in a similar way to the first add
method:
it creates a new array representing the change, and then replaces the array
in the MyArrayList2
object it is called on by the new
array. The new array is of length one greater than the original array,
items in it are copied from the original array up to the position where the
new item is to be inserted. Then the new item is put in this position, then
the remaining items from the old array are copied into the new array at a
position one greater than their position int the old array.
There is code in the file
UseMyArrayLists2.java
which demonstrates the class MyArrayList2
being used.
In the class MyArrayList3
which can be found in the file
MyArrayList3.java
(with the file
UseMyArrayLists3.java
containing code which demonstrates it)
there are additional methods added to those in
MyArrayList2
to give the other common ArrayList
operations. Removing an item from a particular position involves
creating a new array with the item at that position removed, so items
following it are copied into a position one less, then the new array
replaces the old array in the object. A reference to the item which
was removed needs to be kept (in the variable of type T
, which
is whatever the base type is, called removed
) in order to give
the same behaviour as
ArrayList
where the remove
operation with
a position as its argument has a return value, which is the item that
was removed. This gives the following code for the method which implements
this operation:
public T remove(int pos) { T removed = array[pos]; T[] array1 = (T[]) new Object[array.length-1]; for(int i=0;i<pos;i++) array1[i]=array[i]; for(int i=pos+1;i<array.length;i++) array1[i-1]=array[i]; array=array1; return removed; }
The other remove
operation on objects of type
ArrayList
is the one where the actual item to
be removed is given as the argument. In this case, if the
item does not occur in the array, there is no need to construct
a new one and change the array in the ArrayList
object.
If the item does occur, its position is found, and a new array
constructed to replace the old one as previously. In this case,
the return type of the remove
operation is
boolean
, the value true
is returned
if the item is found and removed, and false
is
returned if the item is not found. We can compare whether the
item in the array indexed by i
is equal to the item
passed as a parameter with
the variable name item
by array[i].equals(item)
.
This is because the method equals
can be called on
any object, so that we can call it on items in the array
even though in the code these items have the type variable T
as their type so we do not know what their actual type will be.
This gives the following code:
public boolean remove(T item)
{
int i;
for(i=0; i<array.length; i++)
if(array[i].equals(item))
break;
if(i==array.length)
return false;
else
{
T[] array1 = (T[]) new Object[array.length-1];
for(int j=0; j<i; j++)
array1[j]=array[j];
for(int j=i+1; j<array.length; j++)
array1[j-1]=array[j];
array=array1;
return true;
}
}
There are also methods to find the position of an item given as an
argument. The code for these follows the same pattern we saw
previously for finding
the position of an item in an array. As we had then, there are
two methods, starting search at either end of the array. Here the
one starting search at the high indexed end of the array is called
lastIndexOf
and the other indexOf
, so these
would return different results if the item occurred more than once
in the array. As we had before, the value -1
is returned
if the item does not occur in the array. Again, note that whereas before
we had static
methods, so the array being searched is
passed as a parameter, now we have object methods where the array being
searched is inside the object the method call is attached to and so not
expressed as a parameter. Here is the code:
public int indexOf(T item) { int i; for(i=0; i<array.length; i++) if(array[i].equals(item)) return i; return -1; } public int lastIndexOf(T item) { int i; for(i=array.length-1; i>=0&&!array[i].equals(item); i--) {} return i; }
For example, what might appear to the code that uses it as a flexible
array that can be written [30,25,47]
might actually be
represented by a longer array, where if array
refers to
the array, array[0]
holds 30
,
array[1]
holds 25
and
array[2]
holds 47
, and there is also a
separate integer holding 3
. It does not matter what is in
array[3]
, array[4]
and so on. This can be
represented diagrammatically as:
where
which represents the flexible array with textual representation
count
is the name of the separate integer variable,
and we have flex
as a variable referring to the
flexible array object.
With this representation, the simple add
operation which
adds an item to the end of the flexible array as it appears in the code
which uses it actually puts that item in the array component indexed by
the count
variable and increases that variable by one. So
the result of executing the statement flex.add(54)
in
terms of the above diagram is to cause it to change to:
[30,25,47,54]
.
public void add(T item) { array[count]=item; count++; }The class which implements flexible arrays in this way, which we shall call
MyArrayList4
, must indicate that objects of this
type have a variable called array
, which stores items of
the base type T
, and a variable called count
which stores an integer. The code starts off:
class MyArrayList4 <T> { private T[] array; private int count; public MyArrayList4() { array = (T[]) new Object[10]; } ...showing the constructor which initialises the object to represent a collection of 0 items. The array has capacity to store ten items, but because the variable
count
is set to 0
none of them are initially taken as data. Note that there was no
need to add the statement count=0
to the constructor,
since numerical variables in an object are automatically set to
0
if there is no code in the constructor which sets them
to anything else.
The size of the arrayList represented by the array and count is not the length of the array, but the current value of the count variable, so we have:
public int size() { return count; }The methods which test for the position of a particular item must test only in that portion of the array currently used to store data, so we have:
public int indexOf(T item) { int i; for(i=0; i<count; i++) if(array[i].equals(item)) return i; return -1; }and similarly for
lastIndexOf
.
If we want the second sort of add
method which adds an
item at a particular position, we need to "move" the items in the
internal array which occur after this position. So, starting from where
we were in the diagram above,
if we then execute flex.add(2,99)
, we will get the
situation represented by:
The following code achieves this:
public void add(int pos,T item) { for(int i=count; i>pos; i--) array[i]=array[i-1]; array[pos]=item; count++; }The loop here
for(int i=count; i>pos; i--) array[i]=array[i-1];has the effect of "moving up" the items in the array from the one initially indexed by
pos
to the last one. Note that it starts from the
highest indexed item that is counted as data in the array. It copies a
reference to that item into the next array component, so after the first time
round the loop both array[count]
and array[count-1]
will hold references to the last item in the array. As there are now two
references, the lower indexed one can have the reference from the next
component down in the array put into it, and in general each
execution of array[i]=array[i-1]
will cause array[i]
and array[i-1]
to hold references to the same item, ready for
next time round the loop the lower indexed one to be replaced by a reference
to the one before. When the loop finishes, a[pos]
and
a[pos+1]
hold references to the item whose reference was initially
just in a[pos]
, so a reference to the new item can be
put into a[pos]
and no data has been lost. Increasing the
variable count
by one indicates that the portion of the actual
array treated as data is increased by one.
Code to remove an item from a given position works in a similar way. Items in the array must be "moved down" to "fill in the gap" caused by the removal of the item. What actually happens is that the component where the item was removed is set to refer to the same item as the next indexed component. Then that component can be set to refer to the item of the following component, and so on to the last item, then the count variable is reduced by 1. This is given by the code:
public T remove(int pos) { T removed = array[pos]; for(int i=pos+1; i<count; i++) array[i-1]=array[i]; count--; return removed; }As previously, a reference to the item that was removed is kept so that it can be returned as the result of the method.
The code for the remove
method which takes the item to
be removed as an argument looks similar to the code for
this method we used before.
Here it is:
public boolean remove(T item) { int i; for(i=0; i<count; i++) if(array[i].equals(item)) break; if(i==count) return false; else { i++; for(; i<count; i++) array[i-1]=array[i]; count--; return true; } }Note, however, that it does not go through the whole array looking for the item, only up to the component before the one indexed by
count
as that is the only portion of the array counted as the data of the flexible
array it represents. If it finds the item to be removed, it "moves down"
those elements following it in the array which form part of the data in
a similar way to the other remove
method.
In this representation, we also need to alter the get
and set
methods from the previous representation, so that
you cannot get or set values beyond the current size of data. The
previous get
was:
public T get(int i) { return array[i]; }because every item in the array was counted as part of the data in the flexible array. However, in our new representation, only that part of the array up to but not including the part indexed by the count variable is counted as part of the data. We need to signal it as an error if an attempt is made to get or set components with an index which is valid for the array but not for the current state of the object viewed as a flexible array. You could signal an error by printing some sort of message, but in Java the conventional way of signalling an attempt to perform an operation that can't be done is to throw an exception. If you attempt to access
a[i]
and i
is equal to or greater than the length of the array
referenced by variable a
, an IndexOutOfBoundsException
is thrown. But we can also force a method to throw such an exception
by making it execute the statement
throw new IndexOutOfBoundsException()
. So here
is the revised version of get
:
public T get(int i) { if(i>=count) throw new IndexOutOfBoundsException(); return array[i]; }This means that even if
i
is less than the length of the
array inside an object of type MyArrayList4
, if it is
greater than or equal to the current value of count
,
an IndexOutOfBoundsException
is thrown.
You may wonder what happens if the value of the count variable goes over the size of the array. Rather than cause an error, we can just increase the size of the array in the same way we did when we just used an array to represent a flexible array. We make a new array of the increased size, copy the references in the old array into it, and replace the reference to the array in the object by a reference to the new array. If we increase the array to double the size of the previous array, we won't have to keep on increasing it every time a new item is added to the flexible array. Here is a method which doubles the size of the array inside the object:
private void expand() { T[] array1 = (T[]) new Object[array.length*2]; for(int i=0; i<count; i++) array1[i]=array[i]; array=array1; }Note it is declared as
private
. This is because it is intended
only for use inside the class MyArrayList4
by other methods,
it is not intended that code which uses objects of type
MyArrayList4
should be able to make direct use of the
expand
method. Using this method, we can change the
single-argument add
method to:
public void add(T item) { if(count==array.length) expand(); array[count]=item; count++; }So the idea is that usually when an item is added the flexible array, no new actual array in the object has to be created. Occasionally, however, a new actual array has to be created and replace the old actual array when the old atual array proves to be too small for what it required.
Taking all of this into account, the full code for
myArrayList4
, which can also be found in the file
MyArrayList4.java
is:
class MyArrayList4 <T> { private T[] array; private int count; public MyArrayList4() { array = (T[]) new Object[10]; } public T get(int i) { if(i>=count) throw new IndexOutOfBoundsException(); return array[i]; } public void set(int i,T item) { if(i>=count) throw new IndexOutOfBoundsException(); array[i]=item; } private void expand() { T[] array1 = (T[]) new Object[array.length*2]; for(int i=0; i<count; i++) array1[i]=array[i]; array=array1; } public void add(T item) { if(count==array.length) expand(); array[count]=item; count++; } public void add(int pos,T item) { if(pos>count) throw new IndexOutOfBoundsException(); if(count==array.length) expand(); for(int i=count; i>pos; i--) array[i]=array[i-1]; array[pos]=item; count++; } public T remove(int pos) { if(pos>=count) throw new IndexOutOfBoundsException(); T removed = array[pos]; for(int i=pos+1; i<count; i++) array[i-1]=array[i]; count--; return removed; } public boolean remove(T item) { int i; for(i=0; i<count; i++) if(array[i].equals(item)) break; if(i==count) return false; else { i++; for(; i<count; i++) array[i-1]=array[i]; count--; return true; } } public int size() { return count; } public int indexOf(T item) { int i; for(i=0; i<count; i++) if(array[i].equals(item)) return i; return -1; } public int lastIndexOf(T item) { int i; for(i=count-1; i>=0&&!array[i].equals(item); i--) {} return i; } public String toString() { String str = "["; if(count>0) str+=array[0]; for(int i=1; i<count; i++) str+=","+array[i]; return str+"]"; } }Note that the other methods which take an argument representing a position as well as
get
mentioned
above
have to be altered to throw an IndexOutOfBoundsException
if the position is one which is outside the section of the array
currently used to store data. These other methods are set
,
remove
with an int
argument, and
add
with two arguments. However, note that in the case
of add
with two arguments, the int
argument
representing the position at which the other argument is to be added
may be one greater than the last currently used position, since in
that case it simply adds the new item at the end in the way
add
with one argument does.
ArrayList
objects which are provided
in Java's code library. The idea is to give the appearance of an array
which is flexible, which can grow or shrink in length as items are
added and deleted. At first inside the flexible array objects was an
actual array of the same length as the flexible array being
represented. This was inefficient as it meant we had to make a complete
new array to go inside the flexible array object every time its size
was changed. So instead we considered an "array and count" representation,
where inside the flexible array object is an array and a variable saying
how much of the array is currently to be considered as the data of the
flexible array. This meant that we could add and remove items without
having to make a completely new array inside each time.
The important thing is that the code which makes use of the
flexible array objects does not need to know what is inside those
objects. The file
UseMyArrayLists4.java
makes use of the "array and count" representation, but we only know that
because it creates and uses objects of type MyArrayList4
,
and we know the code for that. Otherwise, it is identical to the file
UseMyArrayLists3.java
which makes use of the array representation.
So really we did not have to provide a new class when we changed the internal representation to one we considered more efficient. We could have just changed the code in the old class but kept its name. The code that used it would still work because nothing in it referred to the internal representation of flexible array objects.
Understanding the importance of maintaining a clear distinction between definition and implementation of some type in Java is a crucial step in becoming a good programmer. The definition described how objects of the type work in terms of how their public methods interact. The implementation is what is inside the class for the object which makes the public methods work in this way. If the implementation is changed but the definition stays the same, no code which uses objects of that type needs to be changed.
The name abstract data type is given to types which are defined
in terms of what they do as the code that uses them sees it, rather than
in terms of what the code inside them does. You are already used to
abstract data types because in this course you were shown how to use
types like ArrayList
and LispList
before
you were shown how to write code which implements them. You should not
stop thinking in those terms just because you now also know how to
implement these types.
Keeping variables and methods in one part of a program in a way so that
they can't be accessed in another is known as information hiding.
The keyword private
is used in Java to declare a variable,
or a method (or a nested class) which can only be referred to inside
the class it is declared in. Note, variables declared inside methods
are automatically private since they can only be referred to inside
the method itself. But a variable or method (or nested class) which
is declared inside a class but outside a method and not declared
as private
can be accessed by code outside that class
so long as it is attached by .
to a reference to an
object of that class (or to the class name if it is also declared as
static
). Strictly, declaration as public
ensures possible access in all other code which accesses the class, there is
also an intermediate declaration as protected
, and
the use of none of these means a variable, method or nested class
which can be accessed in any class in the same package (but we are
not considering Java packages in this course). An abstract data type
should be implemented with only the methods which define the tuype being
declared as public
, everything else being declared as
private
.
The reason for information hiding is that the less one part of the code can access information declared and used in another, the less chance there is of something going wrong due to unexpected interaction between the two parts. Also, code frequently gets changed as a system is updated. If we can be sure that only a few public methods of a class can be accessed elsewhere, we know we can change everything else in the class without concern over how it will affect the rest of the program. A good example is if we change the data structure inside a class which provides an abstract data type, perhaps because we have discovered a more effciient representation. We need not be concerned that changing it will cause problems, because we know there is no other code which makes direct use of the data structure.
Last modified: 20 February 2006