ARRAYS ARE THE MOST BASIC AND THE MOST IMPORTANT type of data structure, and techniques for processing arrays are among the most important programming techniques you can learn. Two fundamental array processing techniques -- searching and sorting -- will be covered in Section 4. This section introduces some of the basic ideas of array processing in general.
In many cases, processing an array means applying the same operation to each item in the array. This is commonly done with a for loop. A loop for processing all the items in an array A has the form:
// do any necessary initialization for (int i = 0; i < A.Length; i++) { . . . // process A[i] }
Suppose, for example, that A is an array of type double[]. Suppose that the goal is to add up all the numbers in the array. An informal algorithm for doing this would be:
Start with 0;
Add A[0]; (process the first item in A)
Add A[1]; (process the second item in A)
.
.
.
Add A[ A.Length - 1 ]; (process the last item in A)
Putting the obvious repetition into a loop and giving a name to the sum, this becomes:
double sum; // The sum of the numbers in A. sum = 0; // Start with 0. for (int i = 0; i < A.Length; i++) sum += A[i]; // add A[i] to the sum, for // i = 0, 1, ..., A.Length - 1
Note that the continuation condition, "i < A.Length", implies that the last value of i that is actually processed is A.Length-1, which is the index of the final item in the array. It's important to use "<" here, not "<=", since "<=" would give an array index out of bounds error.
Eventually, you should just about be able to write loops similar to this one in your sleep. I will give a few more simple examples. Here is a loop that will count the number of items in the array A which are less than zero:
int count; // For counting the items. count = 0; // Start with 0 items counted. for (int i = 0; i < A.Length; i++) { if (A[i] < 0.0) // if this item is less than zero... count++; // ...then count it } // At this point, the value of count is the number // of items that have passed the test of being < 0
Replace the test "A[i] < 0.0", if you want to count the number of items in an array that satisfy some other property. Here is a variation on the same theme. Suppose you want to count the number of times that an item in the array A is equal to the item that follows it. The item that follows A[i] in the array is A[i+1], so the test in this case is "if (A[i] == A[i+1])". But there is a catch: This test cannot be applied when A[i] is the last item in the array, since then there is no such item as A[i+1]. The result of trying to apply the test in this case would be an array index out of bounds error. This just means that we have to stop one item short of the final item:
int count = 0; for (int i = 0; i < A.Length - 1; i++) { if (A[i] == A[i+1]) count++; }
Another typical problem is to find the largest number in A. The strategy is to go through the array, keeping track of the largest number found so far. We'll store the largest number found so far in a variable called max. As we look through the array, whenever we find a number larger than the current value of max, we change the value of max to that larger value. After the whole array has been processed, max is the largest item in the array overall. The only question is, what should the original value of max be? One possibility is to start with max equal to A[0], and then to look through the rest of the array, starting from A[1], for larger items:
double max = A[0]; for (int i = 1; i < A.Length; i++) { if (A[i] > max) max = A[i]; } // at this point, max is the largest item in A
(There is one subtle problem here. It's possible in C# for an array to have length zero. In that case, A[0] doesn't exist, and the reference to A[0] in the first line gives an array index out of bounds error. However, zero-length arrays are normally something that you want to avoid in real problems. Anyway, what would it mean to ask for the largest item in an array that contains no items at all?)
As a final example of basic array operations, consider the problem of copying an array. To make a copy of our sample array A, it is not sufficient to say
double[] B = A;
since this does not create a new array object. All it does is declare a new array variable and make it refer to the same object to which A refers. (So that, for example, a change to A[i] will automatically change B[i] as well.) To make a new array that is a copy of A, it is necessary to make a new array object and to copy each of the individual items from A into the new array:
double[] B = new double[A.Length]; // Make a new array object, // the same size as A. for (int i = 0; i < A.Length; i++) B[i] = A[i]; // Copy each item from A to B.
An array type, such as double[], is a full-fledged C# type, so it can be used in all the ways that any other C# type can be used. In particular, it can be used as the type of a formal parameter in a method. It can even be the return type of a function. For example, it might be useful to have a function that makes a copy of an array of doubles:
double[] copy( double[] source ) { // Create and return a copy of the array, source. // If source is null, return null. if ( source == null ) { return null; } double[] cpy; // A copy of the source array. cpy = new double[source.Length]; for (int i = 0; i < src.Length; i++) { cpy[i] = src[i]; } return cpy; }
Several methods in the C# API receive and return array parameters. For example, the String class contains a Split() method that splits up a string into individual tokens, based on a specified delimiter character, and returns an array of the individual tokens. It's very useful; here's an example showing how to use it:
Example 11.1. StringSplit.cs
using System; class StringSplit { static void Main() { string nums = "56,35,26"; string[] numlist = nums.Split(','); // split string using , as the delimiter for (int i = 0; i < numlist.Length; i++) { string num = numlist[i]; Console.WriteLine("Number #" + i + " is: " + num); } } }
When using a command-line interface, the user types a command to tell the system to execute a program. The user can include extra input in this command, beyond the name of the program. This extra input becomes the command-line parameters. For example, the user can type
c:\> MyProg
to execute the program. In this case, there are no command-line parameters. But if the user types the command
c:\> MyProg one two three
then the command-line parameters are the strings "one", "two", and "three".
The Main() method of a program can have a parameter of type string[]. When the system calls the Main() routine, it places command-line parameters in this array, if you have defined it.
Here, for example, is a short program that simply prints out any command line parameters entered by the user:
class CLDemo { static void Main(string[] args) { Console.WriteLine("You entered " + args.Length + " command-line parameters."); if (args.Length > 0) { Console.WriteLine("They were:"); for (int i = 0; i < args.Length; i++) { Console.WriteLine(" " + args[i]); } } } // end main() } // end class CLDemo
Note that the parameter, args, is never null when Main() is called by the system, but it might be an array of length zero. In practice, command-line parameters are often the names of files to be processed by the program.
So far, all my examples of array processing have used sequential access. That is, the elements of the array were processed one after the other in the sequence in which they occur in the array. But one of the big advantages of arrays is that they allow random access. That is, every element of the array is equally accessible at any given time. As an example, let's look at a well-known problem called the birthday problem: Suppose that there are N people in a room. What's the chance that there are two people in the room who have the same birthday? (That is, they were born on the same day in the same month, but not necessarily in the same year.) Most people severely underestimate the probability. We actually look at a different version of the problem: Suppose you choose people at random and check their birthdays. How many people will you check before you find one who has the same birthday as someone you've already checked? Of course, the answer in a particular case depends on random factors, but we can simulate the experiment with a computer program and run the program several times to get an idea of how many people need to be checked on average. To simulate the experiment, we need to keep track of each birthday that we find. There are 365 different possible birthdays. (We'll ignore leap years.) For each possible birthday, we need to know, has this birthday already been used? The answer is a boolean value, true or false. To hold this data, we can use an array of 365 boolean values:
bool[] used; used = new bool[365];
The days of the year are numbered from 0 to 364. The value of used[i] is true if someone has been selected whose birthday is day number i. Initially, all the values in the array, used, are false. When we select someone whose birthday is day number i, we first check whether used[i] is true. If so, then this is the second person with that birthday. We are done. If used[i] is false, we set used[i] to be true to record the fact that we've encountered someone with that birthday, and we go on to the next person. Here is a program that carries out the simulated experiment (Of course, in the program, there are no simulated people, only simulated birthdays):
class Birthday { static void Main() { // Simulate choosing people at random and checking the // day of the year they were born on. If the birthday // is the same as one that was seen previously, stop, // and output the number of people who were checked. bool[] used; // For recording the possible birthdays // that have been seen so far. A value // of true in used[i] means that a person // whose birthday is the i-th day of the // year has been found. int count; // The number of people who have been checked. used = new bool[365]; // Initially, all entries are false. count = 0; Random rand = new Random(); bool done = false; while (!done) { // Select a birthday at random, from 0 to 364. // If the birthday has already been used, quit. // Otherwise, record the birthday as used. int birthday; // The selected birthday. birthday = rand.Next(365); count++; if ( used[birthday] ) done = true; used[birthday] = true; } Console.WriteLine("A duplicate birthday was found after " + count + " tries."); } // end main() }
This program makes essential use of the fact that every element in a newly created array of booleans is set to be false. If we wanted to reuse the same array in a second simulation, we would have to reset all the elements in it to be false with a for loop
for (int i = 0; i < 365; i++) used[i] = false;