3. Files

THE DATA AND PROGRAMS IN A COMPUTER'S MAIN MEMORY survive only as long as the power is on. For more permanent storage, computers use files, which are collections of data stored on a hard disk, on a floppy disk, on a CD-ROM, or on some other type of storage device. Files are organized into directories (sometimes called "folders"). A directory can hold other directories, as well as files. Both directories and files have names that are used to identify them.

Fundamentally, a file is an ordered collection of bytes that has been given a name. Files fall into two broad categories: text files and binary files. Both kinds contain an ordered collection of bytes, but while binary files can contain any byte values, printable or not, text files are usually limited to printable characters, plus the characters that mark line breaks.

Programs can read data from existing files. They can create new files and can write data to files. In C#, such input and output is done using streams. StreamReader objects read data from text files, and StreamWriter objects create and write data to text files. For files that store data in machine format, other classes are used. In this section, I will only discuss text-oriented file I/O using the StreamReader and StreamWriter classes.

3.1. Text Files

Text files contain printable characters divided into lines. When you view a text file in a text editor such as Notepad, you can easily see where one line of text ends and the next begins, because Notepad visually shows the line breaks. When the text file is saved to disk, the editor must put a special marker at the end of each line so that the line breaks are not lost. On Windows, lines are separated on disk by a two-character sequence: a carriage return character followed by a newline character ("\r\n"). (Unix-like systems use a single newline character, "\n".) For example, consider a file of text as viewed in Notepad:

Figure 7.1. A Text File

AB
CDE

FG

On disk, the file would be represented this way:

AB\r\nCDE\r\n\r\nFG

When Notepad loads a text file from disk, it knows where to break the lines, because the \r\n sequence indicates the end of each line.
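To see the line-break markers for yourself, you can look at the raw bytes of a file. The following is a minimal sketch, not part of the original examples: the file name "lines.txt" and the class name ShowBytes are made up for illustration, and it uses the File.WriteAllText and File.ReadAllBytes helper methods from System.IO, which this section does not otherwise cover.

```csharp
using System;
using System.IO;

class ShowBytes {

  // Returns the bytes of the file as space-separated decimal numbers.
  public static string Describe(string path) {
    byte[] bytes = File.ReadAllBytes(path);
    return string.Join(" ", bytes);
  }

  static void Main() {
    // "lines.txt" is a hypothetical file name used only for this sketch.
    // The text is the same four-line example shown above.
    File.WriteAllText("lines.txt", "AB\r\nCDE\r\n\r\nFG");

    // Carriage return is byte 13 and newline is byte 10.
    Console.WriteLine(Describe("lines.txt"));
  }
}
```

In the output, every pair 13 10 is one "\r\n" line break; the remaining numbers are the character codes of the letters A through G.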

3.2. Reading Files

To write a C# program that reads a text file, you use the StreamReader class. It has a constructor which takes the name of a file as a parameter and creates an input stream that can be used for reading from that file. For example, to create an input stream to read from a file named "data.txt", you could write:

StreamReader rd = new StreamReader("data.txt");

Once you have created the input stream object (rd in this example), you can do two things with it:

  • Read data from the file (using the Read() and ReadLine() methods)

  • Close the file (using the Close() method) when you're finished reading data from it

The Read() method returns the next character from the file, but it returns it as an int, not a char:

int nextChar = rd.Read();

Each time you call Read(), the next character is returned. When you've read all the characters in the file, -1 is returned (that explains why Read() returns an int; it has to be able to return something that indicates that the end of file has been reached).

Here's a program that reads all the data from a text file and displays it on the screen:

Example 7.1. ShowFile.cs

using System;
using System.IO;

class ShowFile {

  static void Main() {
    StreamReader file = new StreamReader("data.txt");
    int ch = file.Read();
    while (ch != -1) {
      Console.Write((char)ch);
      ch = file.Read();
    }
    file.Close();
  }
}

Notice the statement

using System.IO;

at the top of the program. The StreamReader class, unlike all of the other classes we have used to this point, is not in the System namespace. It is in the System.IO namespace. The using System.IO directive lets us refer to the class simply as StreamReader instead of by its full name, System.IO.StreamReader.

StreamReaders also allow you to read data using the ReadLine() method. The ReadLine() method reads the next line of data from the text file and returns it. Eventually, ReadLine() will return the last line in the file, and subsequent calls will return a null value. I will have more to say about null values in the next chapter, but for now, you should know how to test for null. Do it using == or !=, like this:

if (line == null) {
  // we hit the end of file
}

I want to emphasize that null is not a string value. Instead, it is a special value indicating that the variable does not hold a string. It is different from an empty string. You have to be careful not to call any methods or access properties on a string variable when it holds a null value, because that will result in a NullReferenceException. That's why you shouldn't test for null like this:

// THIS CAN CAUSE A CRASH
if (line.Equals(null)) { ... }

If line is null, then calling the Equals method will throw a NullReferenceException and crash the program. When you test for null, always use the == or != operator.
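The difference between null and an empty string can be made concrete with a small sketch. The class name NullCheck and the method Describe are hypothetical, chosen only for this illustration. The point is the order of the tests: check for null first, and only then touch properties such as Length.

```csharp
using System;

class NullCheck {

  // Classifies a value of the kind ReadLine() can return.
  // The null test comes first, so Length is never used on a null value.
  public static string Describe(string line) {
    if (line == null) {
      return "end of file";
    }
    if (line.Length == 0) {
      return "empty line";
    }
    return "line with " + line.Length + " characters";
  }

  static void Main() {
    Console.WriteLine(Describe(null));   // prints "end of file"
    Console.WriteLine(Describe(""));     // prints "empty line"
    Console.WriteLine(Describe("CDE"));  // prints "line with 3 characters"
  }
}
```

An empty line in the middle of a file comes back from ReadLine() as the empty string, not as null, so a loop that stops on null will not stop early at a blank line.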

Here's a program that displays the contents of a text file using the ReadLine() method.

Example 7.2. ShowFile2.cs

using System;
using System.IO;

class ShowFile2 {

  static void Main() {

    try {
      StreamReader file = new StreamReader("data.txt");

      string line = file.ReadLine();
      while (line != null) {
        Console.WriteLine(line);
        line = file.ReadLine();
      }
      file.Close();

    } catch (IOException e) {
      Console.WriteLine("Uh oh! Problem reading the file: " + e.Message);
    }

  }
}

Notice particularly the while loop condition:

line != null

This condition is true until the end of file is reached.

Also, note the use of a try-catch block to gracefully handle exceptions that may occur, such as file not found, or problems reading data from the disk.
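If you want to report a missing file differently from other I/O problems, one possible approach is to catch the more specific FileNotFoundException before the general IOException. The sketch below is only one way to arrange this; the class name CountLines, the method Count, and the choice of -1 as a failure value are assumptions made for the example.

```csharp
using System;
using System.IO;

class CountLines {

  // Returns the number of lines in the file, or -1 if it cannot be read.
  public static int Count(string path) {
    try {
      StreamReader file = new StreamReader(path);
      int count = 0;
      while (file.ReadLine() != null) {
        count++;
      }
      file.Close();
      return count;
    } catch (FileNotFoundException) {
      // FileNotFoundException is a kind of IOException,
      // so this more specific catch block must come first.
      Console.WriteLine("No such file: " + path);
      return -1;
    } catch (IOException e) {
      Console.WriteLine("Problem reading the file: " + e.Message);
      return -1;
    }
  }

  static void Main() {
    Console.WriteLine(Count("data.txt"));
  }
}
```

Because the catch blocks are tried in order, putting the IOException block first would make the FileNotFoundException block unreachable, and the compiler rejects that arrangement.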

3.3. Writing Files

Writing data to files is much like reading data. You simply create an object belonging to the class StreamWriter. For example, suppose you want to write data to a file named "result.txt". You might use code like the following:

       
StreamWriter result = new StreamWriter("result.txt");
result.WriteLine("Here is the first line");
result.WriteLine("Here is the second line");
result.Close();

If no file named result.txt exists, a new file will be created. If the file already exists, then the current contents of the file will be erased and replaced with the data that your program writes to the file. An IOException might occur if, for example, you are trying to create a file on a disk that is "write-protected," meaning that it cannot be modified.
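If you would rather keep the existing contents and add to the end of the file, StreamWriter has a second constructor that takes a bool telling it whether to append. The sketch below contrasts the two behaviors; the class name AppendDemo and its helper methods are hypothetical names used only here.

```csharp
using System;
using System.IO;

class AppendDemo {

  // Writes one line to the file. If append is false, any existing
  // contents are erased first; if it is true, they are kept.
  public static void WriteLineTo(string path, string text, bool append) {
    StreamWriter writer = new StreamWriter(path, append);
    writer.WriteLine(text);
    writer.Close();
  }

  // Counts the lines in the file by reading until ReadLine() returns null.
  public static int CountLines(string path) {
    StreamReader file = new StreamReader(path);
    int count = 0;
    while (file.ReadLine() != null) {
      count++;
    }
    file.Close();
    return count;
  }

  static void Main() {
    WriteLineTo("result.txt", "first run", false);  // replaces old contents
    WriteLineTo("result.txt", "second run", true);  // added after "first run"
    Console.WriteLine(CountLines("result.txt"));    // prints 2
  }
}
```

Passing false as the second argument behaves like the one-argument constructor used above: the file is created if necessary and its old contents are discarded.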

After you are finished using a file, close it to tell the operating system that you are done with it. You can close a file by calling the Close() method of the associated stream. (If you forget, the operating system will ordinarily release the file when the program terminates, but do not rely on this: a StreamWriter holds recently written data in an internal buffer, and data that is still in the buffer when the program ends without Close() being called may never reach the file.) Once a file has been closed, it is no longer possible to read data from it or write data to it, unless you open it again as a new stream.
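C# also offers the using statement (not to be confused with the using directive at the top of a program), which closes a stream automatically at the end of a block, even if an exception is thrown inside it. The following sketch shows one way to use it with the classes from this section; the class name UsingDemo and the method FirstLine are hypothetical.

```csharp
using System;
using System.IO;

class UsingDemo {

  // Returns the first line of the file, or null if the file is empty.
  public static string FirstLine(string path) {
    using (StreamReader file = new StreamReader(path)) {
      return file.ReadLine();
    }  // file is closed here, whether or not an exception occurred
  }

  static void Main() {
    using (StreamWriter result = new StreamWriter("result.txt")) {
      result.WriteLine("Here is the first line");
    }  // result is closed here, so the line is safely on disk

    Console.WriteLine(FirstLine("result.txt"));  // prints "Here is the first line"
  }
}
```

With this form there is no explicit call to Close(), so there is no way to forget it.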