As before, for simplicity we will consider collections of integers rather than some more complex object type, but the principles discussed here could extend to collections of things other than integers. What we are discussing here is a collection which is like the mathematical idea of a set.
Here is a program which maintains a set of integers, with commands that you can use to add an integer to the set, remove an integer from the set, check whether an integer is in the set, or print the set:
import java.io.*; class UseIntSets1 { // Simplistic "single method" set maintenance program public static void main(String[] args) throws IOException { final int MAX = 100; int[] array = new int[MAX]; int i,n=0,count=0; char ch; String line; BufferedReader in = new BufferedReader(new InputStreamReader(System.in)); do { System.out.print(": "); line = in.readLine(); ch = line.charAt(0); if(line.length()>1) n = Integer.parseInt(line.substring(1).trim()); switch(ch) { case 'q' : break; case 'd' : for(i=0; i<count&&array[i]!=n; i++) {} if(i<count) { count--; array[i]=array[count]; } break; case 'a' : for(i=0; i<count&&array[i]!=n; i++) {} if(i==count) { array[count] = n; count++; } break; case 'm' : for(i=0; i<count&&array[i]!=n; i++) {} if(i==count) System.out.println("Not a member"); else System.out.println("Is a member"); break; case 'p': System.out.print("{"); if(count>0) { System.out.print(array[0]); for(i=1;i<count;i++) System.out.print(","+array[i]); } System.out.println("}"); break; default: System.out.print("d - delete, a - add, m - member,"); System.out.println(" p - print, q - quit"); break; } } while(ch!='q'); } }There are simple one-letter commands:
a
to add an
integer to the set, d
to delete one, m
to check if an integer is currently a member of the set (the program
causes a message to be printed saying whether it is or isn't),
p
to print the set, and q
to quit. The
first three commands must all be followed by an integer.
This program should not be taken as an example of a good program. Firstly, it is not very robust. If commands are not given in exactly the right format, the program will terminate printing incomprehensible error messages on the screen. In program intended for real use, that should never happen. If the user enters anything not in the correct format the program should always be able to handle it, printing an appropriate error message and carrying on.
But secondly, the program is poorly structured. It is just one long
single method. This means it is difficult to see how the code works,
and you cannot easily re-use it in other programs. Programs become
clearer and more easy to re-use and modify when the code is broken up
into separate classes and methods. If this fairly trivial program is
already too complicated to understand in one piece, what about
programs of a much greater size?
Sets represented by an array and count
The program above uses a form of representation common in languages
like Java where the size of an array is fixed when it is created,
and it is required to store a set of data which may vary in size during
the course of execution of the program.
An array is created of the maximum size thought necessary, but only
part of that array is used. Here the array is called array
while the variable count
holds the number of cells in the
array currently in use. At any time during execution, the cells
a[0]
up to and including a[count-1]
store the
data in the set, while the contents of the rest of the array are
irrelevant.
The size of the array is set by a final
integer variable called MAX
. The final
notation means the value of this integer can never be changed during
program execution, so it is really a constant. But it is better to declare
the maximum just once like this and after that always use the name given
to the constant than use the actual figure in the program. In the program
above, MAX
is set to 100, but if we were careful always to
refer to MAX
and never otherwise directly to 100, we could
change that to some other value without having to change the rest of the
program if circumstances should change, maybe we discover we may need to
store sets of more than 100 elements. It is a convention in Java that
constants are given names which are all capital letters. By "convention"
we mean a standard which Java programmers usually keep to, but which is
not actually part of the Java language.
Note that we did not put any code in which checks whether the maximum size of the array is being exceeded. If the program was run and an attempt was made to put more than 100 integers in the set, it would halt with an error message that would be meaningless to someone who was not familiar with the code of the program. Again, this is done only in order to make the program short for a simple illustration. No program that was intended for real use should just crash like this, although a common source of bugs in real-life programs is where bounds that the programmer assumed would never be exceeded, are exceeded in practice, but as the programmer thought this would never happen the error is not handled appropriately.
In order to give the program a better structure, we could separate out the sections of code that manipulate the array into separate methods:
import java.io.*; class UseIntSets2 { // Set maintenance program with methods in same class public static void main(String[] args) throws IOException { final int MAX = 100; int[] array = new int[MAX]; int n=0,count=0; char ch; String line; BufferedReader in = new BufferedReader(new InputStreamReader(System.in)); do { System.out.print(": "); line = in.readLine(); ch = line.charAt(0); if(line.length()>1) n = Integer.parseInt(line.substring(1).trim()); switch(ch) { case 'q' : break; case 'd' : count = delete(n,array,count); break; case 'a' : count = add(n,array,count); break; case 'm' : if(member(n,array,count)) System.out.println("Is a member"); else System.out.println("Not a member"); break; case 'p': print(array,count); break; default: System.out.print("d - delete, a - add, m - member,"); System.out.println(" p - print, q - quit"); break; } } while(ch!='q'); } public static int delete(int n, int[] array, int count) { int i; for(i=0; i<count&&array[i]!=n; i++) {} if(i<count) { array[i]=array[count-1]; return count-1; } return count; } public static int add(int n, int[] array, int count) { int i; for(i=0; i<count&&array[i]!=n; i++) {} if(i==count) { array[count] = n; return count+1; } return count; } public static boolean member(int n, int[] array, int count) { int i; for(i=0; i<count&&array[i]!=n; i++) {} return i<count; } public static void print(int[] array, int count) { int i; System.out.print("{"); if(count>0) { System.out.print(array[0]); for(i=1;i<count;i++) System.out.print(","+array[i]); } System.out.println("}"); } }Now it can be seen more easily how the data structure is handled to give the desired effect. The method
delete
changes
the array to represent deleting an integer from the set.
The place where the integer to be deleted is found is set to the
integer stored at the end of the used section of the array. The
count is then reduced by one, so that the initial copy of this integer
is now not counted as part of the data. The effect is to change the
order in which the data is stored, but this does not matter as there
is no concept of ordering in sets, and none of the methods to implement
them here rely on the data being ordered in any way.
Note that if the element to be deleted is not found, the array
is left unchanged. Special cases like this must always be considered,
in this case it was a design decision that deleting an integer from a set
it does not occur in is a permissible operation (so there is no error
handling) and leaves the set unchanged. The method delete
needs to return an integer giving the new size of the set, as this
could be one less than the size before the deletion, or the same size
if the item being deleted was not found.
Similarly, when an integer is added to the set, if the integers is
already present no change is made. So it is necessary to loop
through the array to see if it is present. If we go through it all
up to the point where the valid data finishes, and don't find the integer
already there, we put the integer in the next available free cell in the
array, and return a new value for count
to indicate the
portion of the array used for storing data has increased by 1 in size.
There is a method called member
which tells us whether an
integer is in the set. It returns a boolean (i.e. true
or
false
). This method cannot change the representation of the
set. There is a method called print
which actually prints
a representation of the set onto the standard output (which would be the
window in which the call to the program is made normally), but does not
return anything.
All these methods use a body-less for-loop written:
for(i=0; i<count&&array[i]!=n; i++) {}to find the position of the integer
n
in the array
array
. It first does the initialisation (i=0
) and the
test, in this case i<count&&array[i]!=n
, and if this is
true repeatedly does the test and the update (the update is just
i++
) until the test becomes false. The body of a for-loop
is the statement following the header, where {
and
}
are used to combine several statements into one. But
{}
indicates a statement made up out of no statements,
and in this case the {}
is the body of the for-loop.
Another way of writing the same thing is:
for(i=0; i<count&&array[i]!=n; i++);It means exactly the same, but it is perhaps easier to miss the semi-colon at the end and think, wrongly, that the following statement is meant to be in the for loop.
Note that the for-loop is not
for(int i=0; i<count&&array[i]!=n; i++) {}with the variable
i
declared inside the for-loop. If it
were written like this, the variable i
would not be
accessible outside the for-loop, but we need to know its value
once the for loop has finished. If it is equal to the variable
count
it means it has been increased to cover each of the values from
0 going up by 1 to count
, and at no point was the
second element of the test array[i]!=n
false. That
means that the integer n
does not occur in the portion
of the array currently being used to store data.
Otherwise, if the loop terminates because array[i]!=n
is false before i<count
is false, it means
i
is at the value where a[i]
is equal
to n
, so i
gives the position of the
variable in n
in the array. Note that array
is not a key-word in Java (as it is, for example, in Pascal). We
just happen to have chosen to name a variable by this name here.
delete
or add
methods. There is nothing stopping
the programmer from changing what is stored in the array, so that the
assumptions made in the methods (that any integer in the set is stored
exactly once in the portion of the array up to and including the cell
index by count-1
) may not hold.
It was for this reason that the concept of an abstract data type was introduced. In an abstract data type, the way a structure is represented, in this case a set represented by an array and a count, is kept separate from the code which uses it. The code which uses it accesses it only by calling the methods which provide the desired operations. Here is the new code for the main program:
import java.io.*; class UseIntSets3 { // Set maintenance program with separate set class public static void main(String[] args) throws IOException { int n=0; char ch; String line; BufferedReader in = new BufferedReader(new InputStreamReader(System.in)); IntSet1 mySet = new IntSet1(); do { System.out.print(": "); line = in.readLine(); ch = line.charAt(0); if(line.length()>1) n = Integer.parseInt(line.substring(1).trim()); switch(ch) { case 'q' : break; case 'd' : mySet.delete(n); break; case 'a' : mySet.add(n); break; case 'm' : if(mySet.member(n)) System.out.println("Is a member"); else System.out.println("Not a member"); break; case 'p': mySet.print(); break; default: System.out.print("d - delete, a - add, m - member,"); System.out.println(" p - print, q - quit"); break; } } while(ch!='q'); } }It creates an object referred to by variable
mySet
of a
type we have called IntSet1
. But the details of how a
IntSet1
object works are kept in another file. The
person who wrote the code above which uses it need have no knowledge
of how a IntSet1
object works internally, and cannot alter
the way an IntSet1
object is stored in any way except
calling the methods that are provided as public methods on an
IntSet1
object. The code executed when a method is
called on an intSet1
object does all the work in
making the necessary changes to that object to represent adding or
deleting an integer.
Here is the code for IntSet1
:
class IntSet1
{
// Set class, sets stored as unordered partly-filled arrays
private int[] array;
private int count;
final static int MAX=100;
public IntSet1()
{
array = new int[MAX];
}
public void delete(int n)
{
int i;
for(i=0; i<count&&array[i]!=n; i++) {}
if(i<count)
{
count--;
array[i]=array[count];
}
}
public void add(int n)
{
int i;
for(i=0; i<count&&array[i]!=n; i++) {}
if(i==count)
{
array[count] = n;
count++;
}
}
public boolean member(int n)
{
int i;
for(i=0; i<count&&array[i]!=n; i++) {}
return i<count;
}
public void print()
{
int i;
System.out.print("{");
if(count>0)
{
System.out.print(array[0]);
for(i=1;i<count;i++)
System.out.print(","+array[i]);
}
System.out.println("}");
}
}
It will generally be kept in a separate file called IntSet1.java
.
The fields array
and count
are declared as
private
. This means that only the code inside the class
itself can access them. Each IntSet1
object will have
its own array
and count
field. The field
MAX
, however, is declared as static
, meaning
there is just one variable called MAX
which is shared by
all IntSet1
objects.
If s
is a variable
of type IntSet1
, it can be set to refer to a new
IntSet1
object by s = new IntSet1()
. The
method with signature public IntSet1()
, known as
the constructor for IntSet1
says what happens
when a new IntSet1
object is created. Its field
array
is set to refer to a new array of integers of
length MAX
. There is no need to have a statement
count = 0
, because if fields are not mentioned in
the constructor, they are set automatically to zero if they are
numerical fields, or to null
if they are object
fields.
Following this, s.add(20)
will add the number 20 to the
set represented by the object referred to by s
, and so on
for any integer or expression which evaluates to an integer. Similarly,
s.delete(10)
will remove 10 from the set represented by the
object referred to by s
. These methods have void
as their return type, so no value is returned from them, but they change
the value of the set represented by the object referred to by s
.
The method print
also has return type void
, but
s.print()
does not cause any change in the object referred to
by s
, rather it causes a representation of that object to be
printed.
To the person writing a method which makes use of these methods on
sets, IntSet1
might as well be a standard part of the
Java language. There is a clear separation between the code that
implements IntSet1
objects and the code which uses these
objects. This is a key point for structuring programs both so they are
easy to understand and easy to modify. Making the array and count which
represent a set inaccessible to any code except that for the methods
which give the set operations is an example of the principle known as
information hiding. In any situation, it is easier to understand
the situation if irrelevant information is not mixed up with relevant
information. Lack of access to the array and count mean they can't be
changed in ways that don't fit in with how they represent sets, and in
general if we don't have direct access to things we don't need to change
directly, we will avoid errors caused by making changes to them in
unexpected ways.
An important thing here is that we think of sets, when we write programs
that make use of them, not in terms of the way they are represented on the
computer, but in terms of the operations
possible on them, add
, delete
, member
and print
. Implementing sets and providing methods which
give these operations is a separate issue. It is as if we have built a wall
between the code that uses sets and the code that implements them, with
the public methods of the sets being the gate in the wall.
As a minor point, it is generally not a good idea to have a method
which directly prints a representation of an object. That is too
inflexible as it can only be used in a limited way. In general, there
should be a method which returns the representation of an object in the form
of a string. Here is such a method appropriate for the class
IntSet1
:
public String toString() { int i; String str="{"; if(count>0) { str+=array[0]; for(i=1;i<count;i++) str+=","+array[i]; } return str+"}"; }In fact Java makes a special use of the method called
toString
with no arguments in any class. If an attempt is made to print an object,
the method toString
is called automatically on it even if
it is not mentioned. So with the above method, we can alter the lines
implementing the 'p'
command in the switch statement of the
program that uses sets to:
case 'p': System.out.println(mySet); break;Calling
System.out.println(mySet)
automatically becomes
System.out.println(mySet.toString())
.
Last modified: 27 January 2004