2. Information Hiding

One of the reasons that objects like Strings and StringTokenizers are so useful is that programmers can use them without having to know how they work. These two classes are designed so that the programmer using them never has to know what is in their instance variables, or even what the instance variables are. The programmer interacts with the object only by calling methods.

In addition to the idea of encapsulation -- packaging data and methods together -- another central concept of object-oriented programming is information hiding. Information hiding is the idea that objects can protect their state from direct access by code outside the object.

The SimpleStringTokenizer class introduced in the last section uses information hiding to protect its state. A programmer using the SimpleStringTokenizer class would not be allowed to access any of its instance variables directly. For example, if the programmer writing the while loop in the TokenTest program tried to write the loop this way:

while (tok.curIndex < tok.data.Length) {
  String word = tok.NextToken();
  Console.WriteLine(word);
}

then the compiler would reject the code, because the curIndex and data instance variables are not marked public.

So, why not mark the instance variables public, so programmers using the class could write a while loop like this one? The while loop is actually legitimate, but verifying that it is correct requires detailed knowledge of how SimpleStringTokenizer works. In other words, in order to understand this code, the programmer reading it would have to look into the SimpleStringTokenizer class and examine all the methods carefully to understand the relationship between curIndex and data. The advantage of abstraction is gone.

This is just one example of the drawbacks that can occur when programmers can access an object's state directly.
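For contrast, here is a minimal sketch of a loop written only against public methods. WordScanner is a hypothetical stand-in, not the SimpleStringTokenizer from the previous section; its HasMoreTokens() method and its rule of splitting on single spaces are assumptions made for illustration. The point is that the loop can be read and checked without knowing how curIndex and data relate to each other.

```csharp
using System;

// Hypothetical stand-in for a tokenizer; NOT the SimpleStringTokenizer
// from the previous section. It splits on single spaces only, so two
// spaces in a row produce an empty token.
class WordScanner {
  private string data;   // text being scanned (hidden from callers)
  private int curIndex;  // position of the next unread character (hidden)

  public WordScanner(string text) {
    data = text;
    curIndex = 0;
  }

  // true while there is at least one more character to read
  public bool HasMoreTokens() {
    return curIndex < data.Length;
  }

  // returns the characters from curIndex up to (not including) the next space
  public string NextToken() {
    int end = data.IndexOf(' ', curIndex);
    if (end < 0) {
      end = data.Length;   // no more spaces: the token runs to the end
    }
    string token = data.Substring(curIndex, end - curIndex);
    curIndex = Math.Min(end + 1, data.Length);   // skip past the space
    return token;
  }
}

class WordScannerDemo {
  static void Main() {
    WordScanner tok = new WordScanner("hide your state");
    // The loop condition asks the object a question instead of
    // inspecting its instance variables.
    while (tok.HasMoreTokens()) {
      Console.WriteLine(tok.NextToken());
    }
  }
}
```

If the class later changed how it tracks its position, this loop would not need to change, because it never mentions curIndex or data.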

2.1. Using Visibility Modifiers

You can use visibility modifiers to protect an object's state. Here is a class for Time objects; note the private and public keywords on its members:

using System;

class Time {
  private int hour, minute;
  private char ampm;
    
  public Time(int initHour, int initMinute, char initAmPm) {
    hour = initHour;
    minute = initMinute;
    ampm = initAmPm;
  }
  
  public void Show() {
    Console.WriteLine(hour + ":" + minute + " " + ampm);
  }
  
}

The keywords private and public are called visibility modifiers because they modify the default visibility of class members. You can use them when you define either a member variable or a method. They either permit or restrict access to the member from code outside the class.

Since the Time instance variables are marked private, code outside the Time class cannot access them like this:

class TimeDemo {
  static void Main() {
    Time now = new Time(12, 48, 'p'); // 12:48 pm
    now.Show();                       // OK: Show() is public
    now.hour = 5;                     // COMPILE ERROR: hour is private
    Console.WriteLine(now.minute);    // COMPILE ERROR: minute is private

  }
}

The compiler enforces information hiding by issuing a compile error when code attempts to access private members of a class.

Here are some points to keep in mind about visibility modifiers:

  • Visibility modifiers have no effect on code inside the Time class. Use them to prevent code outside the Time class from accessing certain members inside the Time class.

  • As a general rule, all instance variables should be marked private, and methods and constructors should be marked public. This forces code outside the class to use the object only by calling its methods, enforcing the principle of information hiding.

  • Unless marked otherwise, class members have a default visibility that is private. Thus, you don't technically have to mark instance variables private to protect them from outside access, but it's good practice to do so.
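The first and third points can be seen in a short sketch. The SameAs() method and the VisibilityDemo class are hypothetical additions for illustration and are not part of the Time class shown above. The sketch leaves the modifier off hour and minute to show the default, although marking them private explicitly remains the better style.

```csharp
using System;

class Time {
  int hour, minute;      // no modifier: private by default in C#
  private char ampm;     // explicitly private (the preferred style)

  public Time(int initHour, int initMinute, char initAmPm) {
    hour = initHour;
    minute = initMinute;
    ampm = initAmPm;
  }

  // Hypothetical helper: code inside Time may read the private members
  // of ANY Time object, including the one passed in as other.
  public bool SameAs(Time other) {
    return hour == other.hour && minute == other.minute && ampm == other.ampm;
  }
}

class VisibilityDemo {
  static void Main() {
    Time a = new Time(12, 48, 'p');
    Time b = new Time(12, 48, 'p');
    Console.WriteLine(a.SameAs(b));   // allowed: SameAs() is public
    // Console.WriteLine(a.hour);     // would not compile: hour is private
  }
}
```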

2.2. Accessor and Mutator Methods

Our Time class with its visibility modifiers is rather limited in its usefulness. Because its instance variables are private and the class has no methods that change them, once a Time object has been created, it can't be changed, and code outside the class can't determine the individual values of hour and minute. This isn't good.

At the very least, code outside the Time class needs to be able to determine the values of its instance variables. There may also be times when, after a Time is created, we need to be able to change its hour or minute (or both). Let's add some public methods to provide the necessary access:

using System;

class Time {
  private int hour, minute;
  private char ampm;
    
  public Time(int initHour, int initMinute, char initAmPm) {
    hour = initHour;
    minute = initMinute;
    ampm = initAmPm;
  }
  
  public void Show() {
    Console.WriteLine(hour + ":" + minute + " " + ampm);
  }

  // accessor methods

  public int GetHour() { return hour; }
  public int GetMinute() { return minute; }
  public char GetAmpm() { return ampm; }
  
  // mutator methods
  
  public void SetHour(int newHour) { hour = newHour; }
  public void SetMinute(int newMinute) { minute = newMinute; }
  public void SetAmpm(char newAmpm) { ampm = newAmpm; }
    
}

I have added some accessor and mutator methods to the class.

  • An accessor method is a method that allows code outside the class to access the value of an instance variable. Accessor methods are named GetAttribute(), where Attribute is the name of the instance variable whose value is returned by the method. An accessor method typically has this format:

    public type GetAttribute() { return attribute; }

    where attribute is a private instance variable in the class, and type is the attribute's data type.

    GetHour() and GetAmpm() are both examples of accessor methods.

  • A mutator method is a method that allows code outside the class to change the value of an instance variable. Mutator methods are named SetAttribute(), where Attribute is the name of the instance variable the method allows callers to change. A mutator method usually has this format:

    public void SetAttribute(type newValue) { attribute = newValue; }

    where attribute is a private instance variable in the class, and type is the attribute's data type.

Once we've created accessor and mutator methods, we can use them to read or change the object's state. Let's say we've created a Time object:

Time now = new Time(10, 48, 'p');

Now suppose we want to display now's hour value. We can't do this:

Console.WriteLine(now.hour); // won't work because hour is private

But we can use the GetHour() accessor method to obtain the data, like this:

Console.WriteLine(now.GetHour()); // use the accessor method 

What about changing now's am/pm indicator? This won't work:

now.ampm = 'a'; 

But we can ask the SetAmpm() mutator method to change it for us:

now.SetAmpm('a');

Basically, anytime you would want to use an instance variable in an expression, you call its accessor method; anytime you want to change an instance variable's value with an assignment statement, you call its mutator method.
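As a rough illustration of that rule, the sketch below uses accessor calls inside an arithmetic expression and a mutator call in place of an assignment. The Time class is a compact restatement of the one above; AccessorDemo and the variable minutesIntoHalfDay are hypothetical names chosen for this example.

```csharp
using System;

class Time {
  private int hour, minute;
  private char ampm;

  public Time(int initHour, int initMinute, char initAmPm) {
    hour = initHour;
    minute = initMinute;
    ampm = initAmPm;
  }

  public void Show() {
    Console.WriteLine(hour + ":" + minute + " " + ampm);
  }

  // accessor methods
  public int GetHour() { return hour; }
  public int GetMinute() { return minute; }
  public char GetAmpm() { return ampm; }

  // mutator methods
  public void SetHour(int newHour) { hour = newHour; }
  public void SetMinute(int newMinute) { minute = newMinute; }
  public void SetAmpm(char newAmpm) { ampm = newAmpm; }
}

class AccessorDemo {
  static void Main() {
    Time now = new Time(10, 48, 'p');

    // Accessor calls stand in for now.hour and now.minute in an expression.
    int minutesIntoHalfDay = (now.GetHour() % 12) * 60 + now.GetMinute();
    Console.WriteLine(minutesIntoHalfDay);   // prints 648

    // A mutator call stands in for: now.minute = now.minute + 5;
    now.SetMinute(now.GetMinute() + 5);
    now.Show();                              // prints 10:53 p
  }
}
```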

On the surface, it may appear that, by making instance variables private to enforce information hiding, we've complicated life considerably and created a lot of extra work for ourselves. In the case of this Time class, that's certainly true. It would be much simpler for code that uses the Time class to be able to have direct access to Time's instance variables.

But think back to the SimpleStringTokenizer example in the previous section. Its instance variables are private, yet there is no need to write any accessor or mutator methods, because the main program never needs to access them. Information hiding makes a lot more sense for the SimpleStringTokenizer, and doesn't complicate life at all.

2.3. Objects that "Just Say No"

Many objects need to be able to enforce constraints on their instance variables. Consider the Time class. A valid Time object should have an hour value between 1 and 12 (inclusive); a minute value between 0 and 59 (inclusive); and an ampm value of 'a' or 'p'. For example, the following Time object would not make sense:

Time when = new Time(27, -35, 'x'); // not a valid Time

It would be nice if we could give objects the ability to "Just say No!" to bad data. And with information hiding, we can! By marking instance variables private, we deny code outside the class the ability to change their values directly. Outside code must ask methods inside the class to change the values. By including code in the constructor and in mutator methods that rejects invalid values, the object can protect its state from attempts to violate the constraints. Here's how we might do it in the Time class:

using System;

class Time {
  private int hour, minute;
  private char ampm;
    
  public Time(int initHour, int initMinute, char initAmPm) {
    if (initHour >= 1 && initHour <= 12 &&
        initMinute >= 0 && initMinute <= 59 &&
        (initAmPm == 'a' || initAmPm == 'p')) {
      hour = initHour;
      minute = initMinute;
      ampm = initAmPm;
    } else {
      hour = 12;
      minute = 0;
      ampm = 'a';
    }
  }
  
  public void Show() {
    Console.WriteLine(hour + ":" + minute + " " + ampm);
  }

  // accessor methods

  public int GetHour() { return hour; }
  public int GetMinute() { return minute; }
  public char GetAmpm() { return ampm; }
  
  // mutator methods
  
  public void SetHour(int newHour) { 
    if (newHour >= 1 && newHour <= 12) {   // test the proposed value, not the current one
      hour = newHour; 
    }
  }
  
  public void SetMinute(int newMinute) { 
    if (newMinute >= 0 && newMinute <= 59) {
      minute = newMinute; 
    }
  }
  
  public void SetAmpm(char newAmpm) { 
    if (newAmpm == 'a' || newAmpm == 'p') {
      ampm = newAmpm; 
    }
  }
    
}

The constructor and the mutator methods now enforce the constraints. Look first at the mutator methods: each checks to make sure the proposed new value is valid before changing the instance variable. Invalid values are silently rejected, and the object's state is protected.

Now, look at the constructor. If an object is to ensure that its state is valid, even from the time it is instantiated, the constructor must enforce valid state. The complicated boolean expression is true only for combinations of values that produce a valid Time. For example, a time of 5:09 p (hour = 5, minute = 9, ampm = 'p') makes the expression true; but a time of 9:93 a (hour = 9, minute = 93, ampm = 'a') makes it false. I created the expression by combining all of the individual tests in the mutator methods using the && operator. If code outside the class attempts to create an invalid Time object, the constructor silently 'rejects' the attempt by initializing the Time object to midnight.
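To see the silent rejection from the caller's side, here is a short sketch. The Time class is a compact restatement of the validating design described above, with Show() left out for brevity; RejectDemo and the variable t are hypothetical names for this example.

```csharp
using System;

class Time {
  private int hour, minute;
  private char ampm;

  public Time(int initHour, int initMinute, char initAmPm) {
    if (initHour >= 1 && initHour <= 12 &&
        initMinute >= 0 && initMinute <= 59 &&
        (initAmPm == 'a' || initAmPm == 'p')) {
      hour = initHour;
      minute = initMinute;
      ampm = initAmPm;
    } else {
      hour = 12;      // invalid request: fall back to midnight
      minute = 0;
      ampm = 'a';
    }
  }

  public int GetHour() { return hour; }
  public int GetMinute() { return minute; }
  public char GetAmpm() { return ampm; }

  // each mutator tests the proposed value before storing it
  public void SetHour(int newHour) {
    if (newHour >= 1 && newHour <= 12) { hour = newHour; }
  }
  public void SetMinute(int newMinute) {
    if (newMinute >= 0 && newMinute <= 59) { minute = newMinute; }
  }
  public void SetAmpm(char newAmpm) {
    if (newAmpm == 'a' || newAmpm == 'p') { ampm = newAmpm; }
  }
}

class RejectDemo {
  static void Main() {
    Time t = new Time(27, -35, 'x');   // invalid, so the object becomes midnight
    Console.WriteLine(t.GetHour() + " " + t.GetMinute() + " " + t.GetAmpm());   // prints 12 0 a

    t.SetHour(5);                      // valid: accepted
    t.SetHour(99);                     // invalid: silently ignored
    Console.WriteLine(t.GetHour());    // prints 5
  }
}
```

Note that the caller gets no signal that a request was refused; it would have to call an accessor afterwards to find out.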

There is some duplication of code in this design that is undesirable. The constraint checks in the constructor are duplicated (on a smaller scale) in each mutator method. It would be nice to centralize the checks in one method, so they only have to be defined in one place. Here's one way to do it:

using System;

class Time {
  private int hour, minute;
  private char ampm;
    
  public Time(int initHour, int initMinute, char initAmPm) {
    if (IsValid(initHour, initMinute, initAmPm)) {
      hour = initHour;
      minute = initMinute;
      ampm = initAmPm;
    } else {
      hour = 12;
      minute = 0;
      ampm = 'a';
    }
  }
  
  public void Show() {
    Console.WriteLine(hour + ":" + minute + " " + ampm);
  }

  // accessor methods

  public int GetHour() { return hour; }
  public int GetMinute() { return minute; }
  public char GetAmpm() { return ampm; }
  
  // mutator methods
  
  public void SetHour(int newHour) { 
    if (IsValid(newHour, minute, ampm)) {
      hour = newHour; 
    }
  }
  
  public void SetMinute(int newMinute) { 
    if (IsValid(hour, newMinute, ampm)) {
      minute = newMinute; 
    }
  }
  
  public void SetAmpm(char newAmpm) { 
    if (IsValid(hour, minute, newAmpm)) {
      ampm = newAmpm; 
    }
  }
  
  // returns true if <theHour>, <theMinute>, and <theAmpm> form a valid Time; false, otherwise
  public static bool IsValid(int theHour, int theMinute, char theAmpm) {
    return (theHour >= 1 && theHour <= 12 && theMinute >= 0 && theMinute <= 59 && (theAmpm == 'a' || theAmpm == 'p'));
  }
    
}

Notice how we've centralized the constraint checking code in one method: IsValid(). The constructor and all the mutator methods call IsValid() before they change the instance variables to make sure the proposed new values represent a valid time. For that reason, IsValid() takes the proposed values as parameters rather than testing the instance variables: we want to be able to test values before storing them. Since IsValid() doesn't access any instance variables, I've marked it static. A static method that is also public can be called by code outside the class, even before a Time instance exists, to see if the proposed values represent a valid Time, like this:

Console.Write("Enter the hour: ");
int hours = Convert.ToInt32(Console.ReadLine());
Console.Write("Enter the minute: ");
int minutes = Convert.ToInt32(Console.ReadLine());
Console.Write("Enter a for 'am' or p for 'pm': ");
char ampm = Convert.ToChar(Console.ReadLine());   // expects exactly one character
if (Time.IsValid(hours, minutes, ampm)) {
  Time now = new Time(hours, minutes, ampm);
  ...
} else {
  Console.WriteLine("You didn't enter a valid time!");
}

Expressions like the one in the IsValid() method are called class invariants. A class invariant is a boolean expression that is true if and only if the values in the expression represent a valid state for the object. In other words, it expresses the object's constraints on the instance variables. By centralizing the class invariant in one method, we can streamline the code needed to protect an object's state by collecting it in one place, instead of having to duplicate specific tests throughout the code.
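One way to get a feel for a class invariant is to probe it at the edges of each constraint. The sketch below does that with a public static IsValid() that takes the am/pm indicator as a char. Only the invariant is reproduced here, and InvariantDemo is a hypothetical name for this example.

```csharp
using System;

class Time {
  // The class invariant: true only when the three values form a valid Time.
  public static bool IsValid(int theHour, int theMinute, char theAmpm) {
    return theHour >= 1 && theHour <= 12 &&
           theMinute >= 0 && theMinute <= 59 &&
           (theAmpm == 'a' || theAmpm == 'p');
  }
}

class InvariantDemo {
  static void Main() {
    Console.WriteLine(Time.IsValid(1, 0, 'a'));     // True: lowest valid hour and minute
    Console.WriteLine(Time.IsValid(12, 59, 'p'));   // True: highest valid hour and minute
    Console.WriteLine(Time.IsValid(0, 30, 'a'));    // False: hour below range
    Console.WriteLine(Time.IsValid(13, 30, 'a'));   // False: hour above range
    Console.WriteLine(Time.IsValid(9, 93, 'a'));    // False: minute above range
    Console.WriteLine(Time.IsValid(5, 9, 'P'));     // False: only lowercase a or p is accepted
  }
}
```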

2.4. Protecting Outside Code from Implementation Changes

When we use information hiding to prevent code outside a class from directly accessing instance variables, we get another important benefit. We gain the freedom to change the implementation of a class's methods without worrying about breaking code that uses the class.

Consider the following implementation of the Time class:

using System;

class Time {
  private int minSinceMidnight; // the number of minutes since midnight
    
  public Time(int initHour, int initMinute, char initAmPm) {
    if (IsValid(initHour, initMinute, initAmPm)) {
      SetTime(initHour, initMinute, initAmPm);
    } else {
      SetTime(12, 0, 'a');
    }
  }
  
  public void Show() {
    Console.WriteLine(GetHour() + ":" + GetMinute() + " " + GetAmpm());
  }

  // accessor methods

  public int GetHour() { 
    int hour = minSinceMidnight / 60;
    if (hour > 12) {
      hour -= 12;      
    } else if (hour == 0) {
      hour = 12;
    }
    
    return hour; 
  }
  
  public int GetMinute() { 
    return minSinceMidnight % 60; 
  }
  
  public char GetAmpm() { 
    if (minSinceMidnight >= 12*60) {
      return 'p';
    } else {
      return 'a';
    }
  }
  
  // mutator methods
  
  public void SetHour(int newHour) { 
    if (IsValid(newHour, GetMinute(), GetAmpm())) {
      SetTime(newHour, GetMinute(), GetAmpm()); 
    }
  }
  
  public void SetMinute(int newMinute) { 
    if (IsValid(GetHour(), newMinute, GetAmpm())) {
      SetTime(GetHour(), newMinute, GetAmpm()); 
    }
  }
  
  public void SetAmpm(char newAmpm) { 
    if (IsValid(GetHour(), GetMinute(), newAmpm)) {
      SetTime(GetHour(), GetMinute(), newAmpm); 
    }
  }
  
  // returns true if <theHour>, <theMinute>, and <theAmpm> form a valid Time; false, otherwise
  public static bool IsValid(int theHour, int theMinute, char theAmpm) {
    return (theHour >= 1 && theHour <= 12 && theMinute >= 0 && theMinute <= 59 && (theAmpm == 'a' || theAmpm == 'p'));
  }
  
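  // for internal use only: converts a valid 12-hour time to minutes since midnight
  private void SetTime(int theHour, int theMinute, char theAmpm) {
    int hour24 = theHour;
    
    if (hour24 == 12) {
      hour24 = 0;       // 12 am is hour 0; 12 pm becomes 0 + 12 below
    }
    
    if (theAmpm == 'p') {
      hour24 += 12;     // add to the adjusted hour, so 12 pm is 12 and not 24
    } 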
    
    minSinceMidnight = hour24 * 60 + theMinute;    
      
  }
    
}

This Time class, to the outside world, works exactly like the original Time class. By that, I mean the main program uses it in exactly the same way. But inside the class, the way the time is stored has changed drastically. Now, a single instance variable stores the number of minutes since midnight.
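One way to gain some confidence in that claim is to check that every valid time survives a round trip through the new representation. The sketch below restates the minutes-since-midnight design compactly (the mutators and Show() are omitted, and SetTime uses theHour % 12 as a shorthand for the 12-to-0 adjustment). RoundTripDemo and the mismatches counter are hypothetical names for this example.

```csharp
using System;

class Time {
  private int minSinceMidnight;   // the number of minutes since midnight

  public Time(int initHour, int initMinute, char initAmPm) {
    if (IsValid(initHour, initMinute, initAmPm)) {
      SetTime(initHour, initMinute, initAmPm);
    } else {
      SetTime(12, 0, 'a');
    }
  }

  public int GetHour() {
    int hour = minSinceMidnight / 60;
    if (hour > 12) {
      hour -= 12;
    } else if (hour == 0) {
      hour = 12;
    }
    return hour;
  }

  public int GetMinute() { return minSinceMidnight % 60; }

  public char GetAmpm() { return minSinceMidnight >= 12 * 60 ? 'p' : 'a'; }

  public static bool IsValid(int theHour, int theMinute, char theAmpm) {
    return theHour >= 1 && theHour <= 12 &&
           theMinute >= 0 && theMinute <= 59 &&
           (theAmpm == 'a' || theAmpm == 'p');
  }

  private void SetTime(int theHour, int theMinute, char theAmpm) {
    int hour24 = theHour % 12;           // 12 maps to 0, 1 through 11 stay as they are
    if (theAmpm == 'p') {
      hour24 += 12;
    }
    minSinceMidnight = hour24 * 60 + theMinute;
  }
}

class RoundTripDemo {
  static void Main() {
    int mismatches = 0;
    foreach (char ampm in new char[] { 'a', 'p' }) {
      for (int h = 1; h <= 12; h++) {
        for (int m = 0; m <= 59; m++) {
          Time t = new Time(h, m, ampm);
          // The accessors should hand back exactly what was stored.
          if (t.GetHour() != h || t.GetMinute() != m || t.GetAmpm() != ampm) {
            mismatches++;
          }
        }
      }
    }
    Console.WriteLine("mismatches: " + mismatches);   // prints mismatches: 0
  }
}
```

The loop uses only the constructor and the accessors, so the same check could be run unchanged against the three-variable version of Time.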

2.5. Information Hiding and the Point Class

After this discussion of information hiding, you may be wondering if information hiding is always appropriate, or if there are situations where it's not worth the effort.

Remember the Point class? Information hiding probably doesn't make sense for the Point class for the following reasons:

  • Programmers using the Point class are very much aware that a Point represents an X and a Y coordinate. There's no reason to hide the variables.

  • There are no constraints on what constitutes legal state for Point objects. It's ok for the x and y coordinates to be any integer value.

  • It's unlikely that we would want to change the implementation of Point, by changing the names of its instance variables or their data types.

Information hiding would make sense for the Point class if one of the following situations applied:

  • Let's say we want Point objects to be immutable. In other words, once created, a Point object's coordinates should never be changed. To enforce this, we could mark its instance variables private, and provide accessor methods to allow code outside the class to read its state. We wouldn't provide any mutator methods. Without mutator methods, there is no way for code outside the class to change the Point's location.

  • Perhaps we want Point objects to be mutable, but we need to enforce some constraint on the instance variables. For example, perhaps a Point's coordinates should never be allowed to be negative. Once again, we could enforce this by making the instance variables private, and providing accessor methods. We would also provide mutator methods, but they would enforce the constraints by refusing to process requests to change the variables to an illegal value.
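The two situations can be sketched as follows. The class names ImmutablePoint and NonNegativePoint, the GetX()/GetY() and SetX()/SetY() methods, and the choice to fall back to the origin when the constructor is given a negative coordinate are all assumptions made for illustration. They are not taken from the Point class introduced earlier.

```csharp
using System;

// Sketch 1: an immutable point. Accessors only, no mutators.
class ImmutablePoint {
  private int x, y;

  public ImmutablePoint(int initX, int initY) {
    x = initX;
    y = initY;
  }

  public int GetX() { return x; }
  public int GetY() { return y; }
}

// Sketch 2: a mutable point whose coordinates may never be negative.
class NonNegativePoint {
  private int x, y;   // int fields start at 0

  public NonNegativePoint(int initX, int initY) {
    // Assumption: an invalid request leaves the point at the origin (0, 0).
    if (initX >= 0 && initY >= 0) {
      x = initX;
      y = initY;
    }
  }

  public int GetX() { return x; }
  public int GetY() { return y; }

  // mutators refuse requests that would break the constraint
  public void SetX(int newX) { if (newX >= 0) { x = newX; } }
  public void SetY(int newY) { if (newY >= 0) { y = newY; } }
}

class PointDemo {
  static void Main() {
    ImmutablePoint fixedPoint = new ImmutablePoint(3, 4);
    Console.WriteLine(fixedPoint.GetX() + "," + fixedPoint.GetY());   // prints 3,4
    // fixedPoint has no SetX() or SetY(), so outside code cannot move it.

    NonNegativePoint p = new NonNegativePoint(3, 4);
    p.SetX(-1);                                     // rejected: x stays 3
    p.SetY(10);                                     // accepted
    Console.WriteLine(p.GetX() + "," + p.GetY());   // prints 3,10
  }
}
```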