2: Making & Using Objects

Thinking in C++, 2nd ed. Volume 1

©2000 by Bruce Eckel

This chapter will introduce enough C++ syntax and program construction concepts to allow you to write and run some simple object-oriented programs. In the subsequent chapter we will cover the basic syntax of C and C++ in detail.

By reading this chapter first, you’ll get the basic flavor of what it is like to program with objects in C++, and you’ll also discover some of the reasons for the enthusiasm surrounding this language. This should be enough to carry you through Chapter 3, which can be a bit exhausting since it contains most of the details of the C language.

The user-defined data type, or class, is what distinguishes C++ from traditional procedural languages. A class is a new data type that you or someone else creates to solve a particular kind of problem. Once a class is created, anyone can use it without knowing the specifics of how it works, or even how classes are built. This chapter treats classes as if they are just another built-in data type available for use in programs.

Classes that someone else has created are typically packaged into a library. This chapter uses several of the class libraries that come with all C++ implementations. An especially important standard library is iostreams, which (among other things) allows you to read from files and the keyboard, and to write to files and the display. You’ll also see the very handy string class, and the vector container from the Standard C++ Library. By the end of the chapter, you’ll see how easy it is to use a pre-defined library of classes.

In order to create your first program you must understand the tools used to build applications.

The process of language translation

All computer languages are translated from something that tends to be easy for a human to understand (source code) into something that is executed on a computer (machine instructions). Traditionally, translators fall into two classes: interpreters and compilers.

Interpreters

An interpreter translates source code into activities (which may comprise groups of machine instructions) and immediately executes those activities. BASIC, for example, has been a popular interpreted language. Traditional BASIC interpreters translate and execute one line at a time, and then forget that the line has been translated. This makes them slow, since they must re-translate any repeated code. BASIC has also been compiled, for speed. More modern interpreters, such as those for the Python language, translate the entire program into an intermediate language that is then executed by a much faster interpreter[25].

Interpreters have many advantages. The transition from writing code to executing code is almost immediate, and the source code is always available so the interpreter can be much more specific when an error occurs. The benefits often cited for interpreters are ease of interaction and rapid development (but not necessarily execution) of programs.

Interpreted languages often have severe limitations when building large projects (Python seems to be an exception to this). The interpreter (or a reduced version) must always be in memory to execute the code, and even the fastest interpreter may introduce unacceptable speed restrictions. Most interpreters require that the complete source code be brought into the interpreter all at once. Not only does this introduce a space limitation, it can also cause more difficult bugs if the language doesn’t provide facilities to localize the effect of different pieces of code.

Compilers

A compiler translates source code directly into assembly language or machine instructions. The eventual end product is a file or files containing machine code. This is an involved process, and usually takes several steps. The transition from writing code to executing code is significantly longer with a compiler.

Depending on the acumen of the compiler writer, programs generated by a compiler tend to require much less space to run, and they run much more quickly. Although size and speed are probably the most often cited reasons for using a compiler, in many situations they aren’t the most important reasons. Some languages (such as C) are designed to allow pieces of a program to be compiled independently. These pieces are eventually combined into a final executable program by a tool called the linker. This process is called separate compilation.

Separate compilation has many benefits. A program that, taken all at once, would exceed the limits of the compiler or the compiling environment can be compiled in pieces. Programs can be built and tested one piece at a time. Once a piece is working, it can be saved and treated as a building block. Collections of tested and working pieces can be combined into libraries for use by other programmers. As each piece is created, the complexity of the other pieces is hidden. All these features support the creation of large programs[26].

Compiler debugging features have improved significantly over time. Early compilers only generated machine code, and the programmer inserted print statements to see what was going on. This is not always effective. Modern compilers can insert information about the source code into the executable program. This information is used by powerful source-level debuggers to show exactly what is happening in a program by tracing its progress through the source code.

Some compilers tackle the compilation-speed problem by performing in-memory compilation. Most compilers work with files, reading and writing them in each step of the compilation process. In-memory compilers keep the compiler program in RAM. For small programs, this can seem as responsive as an interpreter.

The compilation process

To program in C and C++, you need to understand the steps and tools in the compilation process. Some languages (C and C++, in particular) start compilation by running a preprocessor on the source code. The preprocessor is a simple program that replaces patterns in the source code with other patterns the programmer has defined (using preprocessor directives). Preprocessor directives are used to save typing and to increase the readability of the code. (Later in the book, you’ll learn how the design of C++ is meant to discourage much of the use of the preprocessor, since it can cause subtle bugs.) The pre-processed code is often written to an intermediate file.

Compilers usually do their work in two passes. The first pass parses the pre-processed code. The compiler breaks the source code into small units and organizes it into a structure called a tree. In the expression “A + B” the elements ‘A’, ‘+,’ and ‘B’ are leaves on the parse tree.

A global optimizer is sometimes used between the first and second passes to produce smaller, faster code.

In the second pass, the code generator walks through the parse tree and generates either assembly language code or machine code for the nodes of the tree. If the code generator creates assembly code, the assembler must then be run. The end result in both cases is an object module (a file that typically has an extension of .o or .obj). A peephole optimizer is sometimes used in the second pass to look for pieces of code containing redundant assembly-language statements.

The use of the word “object” to describe chunks of machine code is an unfortunate artifact. The word came into use before object-oriented programming was in general use. “Object” is used in the same sense as “goal” when discussing compilation, while in object-oriented programming it means “a thing with boundaries.”

The linker combines a list of object modules into an executable program that can be loaded and run by the operating system. When a function in one object module makes a reference to a function or variable in another object module, the linker resolves these references; it makes sure that all the external functions and data you claimed existed during compilation do exist. The linker also adds a special object module to perform start-up activities.

The linker can search through special files called libraries in order to resolve all its references. A library contains a collection of object modules in a single file. A library is created and maintained by a program called a librarian.

Static type checking

The compiler performs type checking during the first pass. Type checking tests for the proper use of arguments in functions and prevents many kinds of programming errors. Since type checking occurs during compilation instead of when the program is running, it is called static type checking.

Some object-oriented languages (notably Java) perform some type checking at runtime (dynamic type checking). If combined with static type checking, dynamic type checking is more powerful than static type checking alone. However, it also adds overhead to program execution.

C++ uses static type checking because the language cannot assume any particular runtime support for bad operations. Static type checking notifies the programmer about misuses of types during compilation, and thus maximizes execution speed. As you learn C++, you will see that most of the language design decisions favor the same kind of high-speed, production-oriented programming the C language is famous for.

You can disable static type checking in C++. You can also do your own dynamic type checking – you just need to write the code.

Tools for separate compilation

Separate compilation is particularly important when building large projects. In C and C++, a program can be created in small, manageable, independently tested pieces. The most fundamental tool for breaking a program up into pieces is the ability to create named subroutines or subprograms. In C and C++, a subprogram is called a function, and functions are the pieces of code that can be placed in different files, enabling separate compilation. Put another way, the function is the atomic unit of code, since you cannot have part of a function in one file and another part in a different file; the entire function must be placed in a single file (although files can and do contain more than one function).

When you call a function, you typically pass it some arguments, which are values you’d like the function to work with during its execution. When the function is finished, you typically get back a return value, a value that the function hands back to you as a result. It’s also possible to write functions that take no arguments and return no values.

To create a program with multiple files, functions in one file must access functions and data in other files. When compiling a file, the C or C++ compiler must know about the functions and data in the other files, in particular their names and proper usage. The compiler ensures that functions and data are used correctly. This process of “telling the compiler” the names of external functions and data and what they should look like is called declaration. Once you declare a function or variable, the compiler knows how to check to make sure it is used properly.

Declarations vs. definitions

It’s important to understand the difference between declarations and definitions because these terms will be used precisely throughout the book. Essentially all C and C++ programs require declarations. Before you can write your first program, you need to understand the proper way to write a declaration.

A declaration introduces a name – an identifier – to the compiler. It tells the compiler “This function or this variable exists somewhere, and here is what it should look like.” A definition, on the other hand, says: “Make this variable here” or “Make this function here.” It allocates storage for the name. This meaning works whether you’re talking about a variable or a function; in either case, at the point of definition the compiler allocates storage. For a variable, the compiler determines how big that variable is and causes space to be generated in memory to hold the data for that variable. For a function, the compiler generates code, which ends up occupying storage in memory.

You can declare a variable or a function in many different places, but there must be only one definition in C and C++ (this is sometimes called the ODR: one-definition rule). When the linker is uniting all the object modules, it will usually complain if it finds more than one definition for the same function or variable.

A definition can also be a declaration. If the compiler hasn’t seen the name x before and you define int x;, the compiler sees the name as a declaration and allocates storage for it all at once.

Function declaration syntax

A function declaration in C and C++ gives the function name, the argument types passed to the function, and the return value of the function. For example, here is a declaration for a function called func1( ) that takes two integer arguments (integers are denoted in C/C++ with the keyword int) and returns an integer:

int func1(int,int);

The first keyword you see is the return value all by itself: int. The arguments are enclosed in parentheses after the function name in the order they are used. The semicolon indicates the end of a statement; in this case, it tells the compiler “that’s all – there is no function definition here!”

C and C++ declarations attempt to mimic the form of the item’s use. For example, if a is another integer the above function might be used this way:

a = func1(2,3);

Since func1( ) returns an integer, the C or C++ compiler will check the use of func1( ) to make sure that a can accept the return value and that the arguments are appropriate.

Arguments in function declarations may have names. The compiler ignores the names but they can be helpful as mnemonic devices for the user. For example, we can declare func1( ) in a different fashion that has the same meaning:

int func1(int length, int width);

A gotcha

There is a significant difference between C and C++ for functions with empty argument lists. In C, the declaration:

int func2();

means “a function with any number and type of argument.” This prevents type-checking, so in C++ it means “a function with no arguments.”

Function definitions

Function definitions look like function declarations except that they have bodies. A body is a collection of statements enclosed in braces. Braces denote the beginning and ending of a block of code. To give func1( ) a definition that is an empty body (a body containing no code), write:

int func1(int length, int width) { }

Notice that in the function definition, the braces replace the semicolon. Since braces surround a statement or group of statements, you don’t need a semicolon. Notice also that the arguments in the function definition must have names if you want to use the arguments in the function body (since they are never used here, they are optional).

Variable declaration syntax

The meaning attributed to the phrase “variable declaration” has historically been confusing and contradictory, and it’s important that you understand the correct definition so you can read code properly. A variable declaration tells the compiler what a variable looks like. It says, “I know you haven’t seen this name before, but I promise it exists someplace, and it’s a variable of X type.”

In a function declaration, you give a type (the return value), the function name, the argument list, and a semicolon. That’s enough for the compiler to figure out that it’s a declaration and what the function should look like. By inference, a variable declaration might be a type followed by a name. For example:

int a;

could declare the variable a as an integer, using the logic above. Here’s the conflict: there is enough information in the code above for the compiler to create space for an integer called a, and that’s what happens. To resolve this dilemma, a keyword was necessary for C and C++ to say “This is only a declaration; it’s defined elsewhere.” The keyword is extern. It can mean the definition is external to the file, or that the definition occurs later in the file.

Declaring a variable without defining it means using the extern keyword before a description of the variable, like this:

extern int a;

extern can also apply to function declarations. For func1( ), it looks like this:

extern int func1(int length, int width);

This statement is equivalent to the previous func1( ) declarations. Since there is no function body, the compiler must treat it as a function declaration rather than a function definition. The extern keyword is thus superfluous and optional for function declarations. It is probably unfortunate that the designers of C did not require the use of extern for function declarations; it would have been more consistent and less confusing (but would have required more typing, which probably explains the decision).

Here are some more examples of declarations:

//: C02:Declare.cpp
// Declaration & definition examples
extern int i; // Declaration without definition
extern float f(float); // Function declaration

float b;  // Declaration & definition
float f(float a) {  // Definition
  return a + 1.0;
}

int i; // Definition
int h(int x) { // Declaration & definition
  return x + 1;
}

int main() {
  b = 1.0;
  i = 2;
  f(b);
  h(i);
} ///:~

In the function declarations, the argument identifiers are optional. In the definitions, they are required (the identifiers are required only in C, not C++).

Including headers

Most libraries contain significant numbers of functions and variables. To save work and ensure consistency when making the external declarations for these items, C and C++ use a device called the header file. A header file is a file containing the external declarations for a library; it conventionally has a file name extension of ‘h’, such as headerfile.h. (You may also see some older code using different extensions, such as .hxx or .hpp, but this is becoming rare.)

The programmer who creates the library provides the header file. To declare the functions and external variables in the library, the user simply includes the header file. To include a header file, use the #include preprocessor directive. This tells the preprocessor to open the named header file and insert its contents where the #include statement appears. A #include may name a file in two ways: in angle brackets (< >) or in double quotes.

File names in angle brackets, such as:

#include <header>

cause the preprocessor to search for the file in a way that is particular to your implementation, but typically there’s some kind of “include search path” that you specify in your environment or on the compiler command line. The mechanism for setting the search path varies between machines, operating systems, and C++ implementations, and may require some investigation on your part.

File names in double quotes, such as:

#include "local.h"

tell the preprocessor to search for the file in (according to the specification) an “implementation-defined way.” What this typically means is to search for the file relative to the current directory. If the file is not found, then the include directive is reprocessed as if it had angle brackets instead of quotes.

To include the iostream header file, you write:

#include <iostream>

The preprocessor will find the iostream header file (often in a subdirectory called “include”) and insert it.

Standard C++ include format

As C++ evolved, different compiler vendors chose different extensions for file names. In addition, various operating systems have different restrictions on file names, in particular on name length. These issues caused source code portability problems. To smooth over these rough edges, the standard uses a format that allows file names longer than the notorious eight characters and eliminates the extension. For example, instead of the old style of including iostream.h, which looks like this:

#include <iostream.h>

you can now write:

#include <iostream>

The translator can implement the include statements in a way that suits the needs of that particular compiler and operating system, if necessary truncating the name and adding an extension. Of course, you can also copy the headers given you by your compiler vendor to ones without extensions if you want to use this style before a vendor has provided support for it.

The libraries that have been inherited from C are still available with the traditional ‘.h’ extension. However, you can also use them with the more modern C++ include style by prepending a “c” before the name. Thus:

#include <stdio.h>
#include <stdlib.h>

become:

#include <cstdio>
#include <cstdlib>

And so on, for all the Standard C headers. This provides a nice distinction to the reader indicating when you’re using C versus C++ libraries.

The effect of the new include format is not identical to the old: using the .h gives you the older, non-template version, and omitting the .h gives you the new templatized version. You’ll usually have problems if you try to intermix the two forms in a single program.

Linking

The linker collects object modules (which often use file name extensions like .o or .obj), generated by the compiler, into an executable program the operating system can load and run. It is the last phase of the compilation process.

Linker characteristics vary from system to system. In general, you just tell the linker the names of the object modules and libraries you want linked together, and the name of the executable, and it goes to work. Some systems require you to invoke the linker yourself. With most C++ packages you invoke the linker through the C++ compiler. In many situations, the linker is invoked for you invisibly.

Some older linkers won’t search object files and libraries more than once, and they search through the list you give them from left to right. This means that the order of object files and libraries can be important. If you have a mysterious problem that doesn’t show up until link time, one possibility is the order in which the files are given to the linker.

Using libraries

Now that you know the basic terminology, you can understand how to use a library. To use a library:

1. Include the library’s header file.
2. Use the functions and variables in the library.
3. Link the library into the executable program.

These steps also apply when the object modules aren’t combined into a library. Including a header file and linking the object modules are the basic steps for separate compilation in both C and C++.

How the linker searches a library

When you make an external reference to a function or variable in C or C++, the linker, upon encountering this reference, can do one of two things. If it has not already encountered the definition for the function or variable, it adds the identifier to its list of “unresolved references.” If the linker has already encountered the definition, the reference is resolved.

If the linker cannot find the definition in the list of object modules, it searches the libraries. Libraries have some sort of indexing so the linker doesn’t need to look through all the object modules in the library – it just looks in the index. When the linker finds a definition in a library, the entire object module, not just the function definition, is linked into the executable program. Note that the whole library isn’t linked, just the object module in the library that contains the definition you want (otherwise programs would be unnecessarily large). If you want to minimize executable program size, you might consider putting a single function in each source code file when you build your own libraries. This requires more editing[27], but it can be helpful to the user.

Because the linker searches files in the order you give them, you can pre-empt the use of a library function by inserting a file with your own function, using the same function name, into the list before the library name appears. Since the linker will resolve any references to this function by using your function before it searches the library, your function is used instead of the library function. Note that this can also be a bug, and the kind of thing C++ namespaces prevent.

Secret additions

When a C or C++ executable program is created, certain items are secretly linked in. One of these is the startup module, which contains initialization routines that must be run any time a C or C++ program begins to execute. These routines set up the stack and initialize certain variables in the program.

The linker always searches the standard library for the compiled versions of any “standard” functions called in the program. Because the standard library is always searched, you can use anything in that library by simply including the appropriate header file in your program; you don’t have to tell it to search the standard library. The iostream functions, for example, are in the Standard C++ library. To use them, you just include the <iostream> header file.

If you are using an add-on library, you must explicitly add the library name to the list of files handed to the linker.

Using plain C libraries

Just because you are writing code in C++, you are not prevented from using C library functions. In fact, the entire C library is included by default into Standard C++. There has been a tremendous amount of work done for you in these functions, so they can save you a lot of time.

This book will use Standard C++ (and thus also Standard C) library functions when convenient, but only standard library functions will be used, to ensure the portability of programs. In the few cases in which library functions must be used that are not in the C++ standard, all attempts will be made to use POSIX-compliant functions. POSIX is a standard based on a Unix standardization effort that includes functions that go beyond the scope of the C++ library. You can generally expect to find POSIX functions on Unix (in particular, Linux) platforms, and often under DOS/Windows. For example, if you’re using multithreading you are better off using the POSIX thread library because your code will then be easier to understand, port and maintain (and the POSIX thread library will usually just use the underlying thread facilities of the operating system, if these are provided).

Your first C++ program

You now know almost enough of the basics to create and compile a program. The program will use the Standard C++ iostream classes. These read from and write to files and “standard” input and output (which normally comes from and goes to the console, but may be redirected to files or devices). In this simple program, a stream object will be used to print a message on the screen.

Using the iostreams class

To declare the functions and external data in the iostreams class, include the header file with the statement

#include <iostream>

The first program uses the concept of standard output, which means “a general-purpose place to send output.” You will see other examples using standard output in different ways, but here it will just go to the console. The iostream package automatically defines a variable (an object) called cout that accepts all data bound for standard output.

To send data to standard output, you use the operator <<. C programmers know this operator as the “bitwise left shift,” which will be described in the next chapter. Suffice it to say that a bitwise left shift has nothing to do with output. However, C++ allows operators to be overloaded. When you overload an operator, you give it a new meaning when that operator is used with an object of a particular type. With iostream objects, the operator << means “send to.” For example:

cout << "howdy!";

sends the string “howdy!” to the object called cout (the name is short for “character output”).

That’s enough operator overloading to get you started. Chapter 12 covers operator overloading in detail.

Namespaces

As mentioned in Chapter 1, one of the problems encountered in the C language is that you “run out of names” for functions and identifiers when your programs reach a certain size. Of course, you don’t really run out of names; it does, however, become harder to think of new ones after a while. More importantly, when a program reaches a certain size it’s typically broken up into pieces, each of which is built and maintained by a different person or group. Since C effectively has a single arena where all the identifier and function names live, this means that all the developers must be careful not to accidentally use the same names in situations where they can conflict. This rapidly becomes tedious, time-wasting, and, ultimately, expensive.

Standard C++ has a mechanism to prevent this collision: the namespace keyword. Each set of C++ definitions in a library or program is “wrapped” in a namespace, and if some other definition has an identical name, but is in a different namespace, then there is no collision.

Namespaces are a convenient and helpful tool, but their presence means that you must be aware of them before you can write any programs. If you simply include a header file and use some functions or objects from that header, you’ll probably get strange-sounding errors when you try to compile the program, to the effect that the compiler cannot find any of the declarations for the items that you just included in the header file! After you see this message a few times you’ll become familiar with its meaning (which is “You included the header file but all the declarations are within a namespace and you didn’t tell the compiler that you wanted to use the declarations in that namespace”).

There’s a keyword that allows you to say “I want to use the declarations and/or definitions in this namespace.” This keyword, appropriately enough, is using. All of the Standard C++ libraries are wrapped in a single namespace, which is std (for “standard”). As this book uses the standard libraries almost exclusively, you’ll see the following using directive in almost every program:

using namespace std;

This means that you want to expose all the elements from the namespace called std. After this statement, you don’t have to worry that your particular library component is inside a namespace, since the using directive makes that namespace available throughout the file where the using directive was written.

Exposing all the elements from a namespace after someone has gone to the trouble to hide them may seem a bit counterproductive, and in fact you should be careful about thoughtlessly doing this (as you’ll learn later in the book). However, the using directive exposes only those names for the current file, so it is not quite as drastic as it first sounds. (But think twice about doing it in a header file – that is reckless.)

There’s a relationship between namespaces and the way header files are included. Before the modern header file inclusion was standardized (without the trailing ‘.h’, as in <iostream>), the typical way to include a header file was with the ‘.h’, such as <iostream.h>. At that time, namespaces were not part of the language either. So to provide backward compatibility with existing code, if you say

#include <iostream.h>

it means

#include <iostream>
using namespace std;

However, in this book the standard include format will be used (without the ‘.h’) and so the using directive must be explicit.

For now, that’s all you need to know about namespaces, but in Chapter 10 the subject is covered much more thoroughly.

Fundamentals of program structure

A C or C++ program is a collection of variables, function definitions, and function calls. When the program starts, it executes initialization code and calls a special function, “main( ).” You put the primary code for the program here.

As mentioned earlier, a function definition consists of a return type (which must be specified in C++), a function name, an argument list in parentheses, and the function code contained in braces. Here is a sample function definition:

int function() {
  // Function code here (this is a comment)
}

The function above has an empty argument list and a body that contains only a comment.

There can be many sets of braces within a function definition, but there must always be at least one set surrounding the function body. Since main( ) is a function, it must follow these rules. In C++, main( ) always has return type of int.

C and C++ are free form languages. With few exceptions, the compiler ignores newlines and white space, so it must have some way to determine the end of a statement. Statements are delimited by semicolons.

C comments start with /* and end with */. They can include newlines. C++ uses C-style comments and has an additional type of comment: //. The // starts a comment that terminates with a newline. It is more convenient than /* */ for one-line comments, and is used extensively in this book.

"Hello, world!"

And now, finally, the first program:

//: C02:Hello.cpp
// Saying Hello with C++
#include <iostream> // Stream declarations
using namespace std;

int main() {
  cout << "Hello, World! I am "
       << 8 << " Today!" << endl;
} ///:~

The cout object is handed a series of arguments via the ‘<<’ operators. It prints out these arguments in left-to-right order. The special iostream function endl outputs a newline and then flushes the stream, so the completed line appears right away. With iostreams, you can string together a series of arguments like this, which makes the class easy to use.

In C, text inside double quotes is traditionally called a “string.” However, the Standard C++ library has a powerful class called string for manipulating text, and so I shall use the more precise term character array for text inside double quotes.

The compiler creates storage for character arrays and stores the ASCII equivalent for each character in this storage. The compiler automatically terminates this array of characters with an extra piece of storage containing the value 0 to indicate the end of the character array.

Inside a character array, you can insert special characters by using escape sequences. These consist of a backslash (\) followed by a special code. For example \n means newline. Your compiler manual or local C guide gives a complete set of escape sequences; others include \t (tab), \\ (backslash), and \b (backspace).

Notice that the statement can continue over multiple lines, and that the entire statement terminates with a semicolon.

Character array arguments and constant numbers are mixed together in the above cout statement. Because the operator << is overloaded with a variety of meanings when used with cout, you can send cout a variety of different arguments and it will “figure out what to do with the message.”

Throughout this book you’ll notice that the first line of each file will be a comment that starts with the characters that start a comment (typically //), followed by a colon, and the last line of the listing will end with a comment followed by ‘/:~’. This is a technique I use to allow easy extraction of information from code files (the program to do this can be found in volume two of this book, at www.BruceEckel.com). The first line also has the name and location of the file, so it can be referred to in text and in other files, and so you can easily locate it in the source code for this book (which is downloadable from www.BruceEckel.com).

Running the compiler

After downloading and unpacking the book’s source code, find the program in the subdirectory C02. Invoke the compiler with Hello.cpp as the argument. For simple, one-file programs like this one, most compilers will take you all the way through the process. For example, to use the GNU C++ compiler (which is freely available on the Internet), you write:

g++ Hello.cpp

Other compilers will have a similar syntax; consult your compiler’s documentation for details.

More about iostreams

So far you have seen only the most rudimentary aspect of the iostreams class. The output formatting available with iostreams also includes features such as number formatting in decimal, octal, and hexadecimal. Here’s another example of the use of iostreams:

//: C02:Stream2.cpp
// More streams features
#include <iostream>
using namespace std;

int main() {
  // Specifying formats with manipulators:
  cout << "a number in decimal: "
       << dec << 15 << endl;
  cout << "in octal: " << oct << 15 << endl;
  cout << "in hex: " << hex << 15 << endl;
  cout << "a floating-point number: "
       << 3.14159 << endl;
  cout << "non-printing char (escape): "
       << char(27) << endl;
} ///:~

This example shows the iostreams class printing numbers in decimal, octal, and hexadecimal using iostream manipulators (which don’t print anything, but change the state of the output stream). The formatting of floating-point numbers is determined automatically by the iostreams library. In addition, any character can be sent to a stream object using a cast to a char (a char is a data type that holds single characters). This cast looks like a function call: char( ), along with the character’s ASCII value. In the program above, the char(27) sends an “escape” to cout.

Character array concatenation

An important feature of the C preprocessor is character array concatenation. This feature is used in some of the examples in this book. If two quoted character arrays are adjacent, and no punctuation is between them, the compiler will paste the character arrays together into a single character array. This is particularly useful when code listings have width restrictions:

//: C02:Concat.cpp
// Character array Concatenation
#include <iostream>
using namespace std;

int main() {
  cout << "This is far too long to put on a "
    "single line but it can be broken up with "
    "no ill effects\nas long as there is no "
    "punctuation separating adjacent character "
    "arrays.\n";
} ///:~

At first, the code above can look like an error because there’s no familiar semicolon at the end of each line. Remember that C and C++ are free-form languages, and although you’ll usually see a semicolon at the end of each line, the actual requirement is for a semicolon at the end of each statement, and it’s possible for a statement to continue over several lines.

Reading input

The iostreams classes provide the ability to read input. The object used for standard input is cin (for “console input”). cin normally expects input from the console, but this input can be redirected from other sources. An example of redirection is shown later in this chapter.

The iostreams operator used with cin is >>. This operator waits for the same kind of input as its argument. For example, if you give it an integer argument, it waits for an integer from the console. Here’s an example:

//: C02:Numconv.cpp
// Converts decimal to octal and hex
#include <iostream>
using namespace std;

int main() {
  int number;
  cout << "Enter a decimal number: ";
  cin >> number;
  cout << "value in octal = 0" 
       << oct << number << endl;
  cout << "value in hex = 0x" 
       << hex << number << endl;
} ///:~

This program converts a number typed in by the user into octal and hexadecimal representations.

Calling other programs

While the typical way to use a program that reads from standard input and writes to standard output is within a Unix shell script or DOS batch file, any program can be called from inside a C or C++ program using the Standard C system( ) function, which is declared in the header file <cstdlib>:

//: C02:CallHello.cpp
// Call another program
#include <cstdlib> // Declare "system()"
using namespace std;

int main() {
  system("Hello");
} ///:~

To use the system( ) function, you give it a character array that you would normally type at the operating system command prompt. This can also include command-line arguments, and the character array can be one that you fabricate at run time (instead of just using a static character array as shown above). The command executes and control returns to the program.

This program shows you how easy it is to use plain C library functions in C++; just include the header file and call the function. This upward compatibility from C to C++ is a big advantage if you are learning the language starting from a background in C.

Introducing strings

While a character array can be fairly useful, it is quite limited. It’s simply a group of characters in memory, but if you want to do anything with it you must manage all the little details. For example, the size of a quoted character array is fixed at compile time. If you have a character array and you want to add some more characters to it, you’ll need to understand quite a lot (including dynamic memory management, character array copying, and concatenation) before you can get your wish. This is exactly the kind of thing we’d like to have an object do for us.

The Standard C++ string class is designed to take care of (and hide) all the low-level manipulations of character arrays that were previously required of the C programmer. These manipulations have been a constant source of time-wasting and errors since the inception of the C language. So, although an entire chapter is devoted to the string class in Volume 2 of this book, the string is so important and it makes life so much easier that it will be introduced here and used in much of the early part of the book.

To use strings you include the C++ header file <string>. The string class is in the namespace std so a using directive is necessary. Because of operator overloading, the syntax for using strings is quite intuitive:

//: C02:HelloStrings.cpp
// The basics of the Standard C++ string class
#include <string>
#include <iostream>
using namespace std;

int main() {
  string s1, s2; // Empty strings
  string s3 = "Hello, World."; // Initialized
  string s4("I am"); // Also initialized
  s2 = "Today"; // Assigning to a string
  s1 = s3 + " " + s4; // Combining strings
  s1 += " 8 "; // Appending to a string
  cout << s1 + s2 + "!" << endl;
} ///:~

The first two strings, s1 and s2, start out empty, while s3 and s4 show two equivalent ways to initialize string objects from character arrays (you can just as easily initialize string objects from other string objects).

You can assign to any string object using ‘=’. This replaces the previous contents of the string with whatever is on the right-hand side, and you don’t have to worry about what happens to the previous contents – that’s handled automatically for you. To combine strings you simply use the ‘+’ operator, which also allows you to combine character arrays with strings. If you want to append either a string or a character array to another string, you can use the operator ‘+=’. Finally, note that iostreams already know what to do with strings, so you can just send a string (or an expression that produces a string, which happens with s1 + s2 + "!") directly to cout in order to print it.

Reading and writing files

In C, the process of opening and manipulating files requires a lot of language background to prepare you for the complexity of the operations. However, the C++ iostream library provides a simple way to manipulate files, and so this functionality can be introduced much earlier than it would be in C.

To open files for reading and writing, you must include <fstream>. Although this will automatically include <iostream>, it’s generally prudent to explicitly include <iostream> if you’re planning to use cin, cout, etc.

To open a file for reading, you create an ifstream object, which then behaves like cin. To open a file for writing, you create an ofstream object, which then behaves like cout. Once you’ve opened the file, you can read from it or write to it just as you would with any other iostream object. It’s that simple (which is, of course, the whole point).

One of the most useful functions in the iostream library is getline( ), which allows you to read one line (terminated by a newline) into a string object[28]. The first argument is the ifstream object you’re reading from and the second argument is the string object. When the function call is finished, the string object will contain the line.

Here’s a simple example, which copies the contents of one file into another:

//: C02:Scopy.cpp
// Copy one file to another, a line at a time
#include <string>
#include <fstream>
using namespace std;

int main() {
  ifstream in("Scopy.cpp"); // Open for reading
  ofstream out("Scopy2.cpp"); // Open for writing
  string s;
  while(getline(in, s)) // Discards newline char
    out << s << "\n"; // ... must add it back
} ///:~

To open the files, you just hand the ifstream and ofstream objects the names of the files you want to read and write, as seen above.

There is a new concept introduced here, which is the while loop. Although this will be explained in detail in the next chapter, the basic idea is that the expression in parentheses following the while controls the execution of the subsequent statement (which can also be multiple statements, wrapped inside curly braces). As long as the expression in parentheses (in this case, getline(in, s)) produces a “true” result, then the statement controlled by the while will continue to execute. It turns out that getline( ) will return a value that can be interpreted as “true” if another line has been read successfully, and “false” upon reaching the end of the input. Thus, the above while loop reads every line in the input file and sends each line to the output file.

getline( ) reads in the characters of each line until it discovers a newline (the termination character can be changed, but that won’t be an issue until the iostreams chapter in Volume 2). However, it discards the newline and doesn’t store it in the resulting string object. Thus, if we want the copied file to look just like the source file, we must add the newline back in, as shown.

Another interesting example is to copy the entire file into a single string object:

//: C02:FillString.cpp
// Read an entire file into a single string
#include <string>
#include <iostream>
#include <fstream>
using namespace std;

int main() {
  ifstream in("FillString.cpp");
  string s, line;
  while(getline(in, line))
    s += line + "\n";
  cout << s;
} ///:~

Because of the dynamic nature of strings, you don’t have to worry about how much storage to allocate for a string; you can just keep adding things and the string will keep expanding to hold whatever you put into it.

One of the nice things about putting an entire file into a string is that the string class has many functions for searching and manipulation that would then allow you to modify the file as a single string. However, this has its limitations. For one thing, it is often convenient to treat a file as a collection of lines instead of just a big blob of text. For example, if you want to add line numbering it’s much easier if you have each line as a separate string object. To accomplish this, we’ll need another approach.

Introducing vector

With strings, we can fill up a string object without knowing how much storage we’re going to need. The problem with reading lines from a file into individual string objects is that you don’t know up front how many strings you’re going to need – you only know after you’ve read the entire file. To solve this problem, we need some sort of holder that will automatically expand to contain as many string objects as we care to put into it.

In fact, why limit ourselves to holding string objects? It turns out that this kind of problem – not knowing how many of something you have while you’re writing a program – happens a lot. And this “container” object sounds like it would be more useful if it would hold any kind of object at all! Fortunately, the Standard C++ Library has a ready-made solution: the standard container classes. The container classes are one of the real powerhouses of Standard C++.

There is often a bit of confusion between the containers and algorithms in the Standard C++ Library, and the entity known as the STL. The Standard Template Library was the name Alex Stepanov (who was working at Hewlett-Packard at the time) used when he presented his library to the C++ Standards Committee at the meeting in San Diego, California in Spring 1994. The name stuck, especially after HP decided to make it available for public downloads. Meanwhile, the committee integrated it into the Standard C++ Library, making a large number of changes. STL's development continues at Silicon Graphics (SGI; see http://www.sgi.com/Technology/STL). The SGI STL diverges from the Standard C++ Library on many subtle points. So although it's a popular misconception, the C++ Standard does not “include” the STL. It can be a bit confusing since the containers and algorithms in the Standard C++ Library have the same root (and usually the same names) as the SGI STL. In this book, I will say “The Standard C++ Library” or “The Standard Library containers,” or something similar and will avoid the term “STL.”

Even though the implementation of the Standard C++ Library containers and algorithms uses some advanced concepts and the full coverage takes two large chapters in Volume 2 of this book, this library can also be potent without knowing a lot about it. It’s so useful that the most basic of the standard containers, the vector, is introduced in this early chapter and used throughout the book. You’ll find that you can do a tremendous amount just by using the basics of vector and not worrying about the underlying implementation (again, an important goal of OOP). Since you’ll learn much more about this and the other containers when you reach the Standard Library chapters in Volume 2, it seems forgivable if the programs that use vector in the early portion of the book aren’t exactly what an experienced C++ programmer would do. You’ll find that in most cases, the usage shown here is adequate.

The vector class is a template, which means that it can be efficiently applied to different types. That is, we can create a vector of shapes, a vector of cats, a vector of strings, etc. Basically, with a template you can create a “class of anything.” To tell the compiler what it is that the class will work with (in this case, what the vector will hold), you put the name of the desired type in “angle brackets,” which means ‘<’ and ‘>’. So a vector of string would be denoted vector<string>. When you do this, you end up with a customized vector that will hold only string objects, and you’ll get an error message from the compiler if you try to put anything else into it.

Since vector expresses the concept of a “container,” there must be a way to put things into the container and get things back out of the container. To add a brand-new element on the end of a vector, you use the member function push_back( ). (Remember that, since it’s a member function, you use a ‘.’ to call it for a particular object.) The reason the name of this member function might seem a bit verbose – push_back( ) instead of something simpler like “put” – is because there are other containers and other member functions for putting new elements into containers. For example, there is an insert( ) member function to put something in the middle of a container. vector supports this but its use is more complicated and we won’t need to explore it until Volume 2 of the book. There’s also a push_front( ) (not part of vector) to put things at the beginning. There are many more member functions in vector and many more containers in the Standard C++ Library, but you’ll be surprised at how much you can do just knowing about a few simple features.

So you can put new elements into a vector with push_back( ), but how do you get these elements back out again? This solution is more clever and elegant – operator overloading is used to make the vector look like an array. The array (which will be described more fully in the next chapter) is a data type that is available in virtually every programming language so you should already be somewhat familiar with it. Arrays are aggregates, which means they consist of a number of elements clumped together. The distinguishing characteristic of an array is that these elements are the same size and are arranged to be one right after the other. Most importantly, these elements can be selected by “indexing,” which means you can say “I want element number n” and that element will be produced, usually quickly. Although there are exceptions in programming languages, the indexing is normally achieved using square brackets, so if you have an array a and you want to produce element five, you say a[4] (note that indexing always starts at zero).

This very compact and powerful indexing notation is incorporated into the vector using operator overloading, just like ‘<<’ and ‘>>’ were incorporated into iostreams. Again, you don’t need to know how the overloading was implemented – that’s saved for a later chapter – but it’s helpful if you’re aware that there’s some magic going on under the covers in order to make the [ ] work with vector.

With that in mind, you can now see a program that uses vector. To use a vector, you include the header file <vector>:

//: C02:Fillvector.cpp
// Copy an entire file into a vector of string
#include <string>
#include <iostream>
#include <fstream>
#include <vector>
using namespace std;

int main() {
  vector<string> v;
  ifstream in("Fillvector.cpp");
  string line;
  while(getline(in, line))
    v.push_back(line); // Add the line to the end
  // Add line numbers:
  for(int i = 0; i < v.size(); i++)
    cout << i << ": " << v[i] << endl;
} ///:~

Much of this program is similar to the previous one; a file is opened and lines are read into string objects one at a time. However, these string objects are pushed onto the back of the vector v. Once the while loop completes, the entire file is resident in memory, inside v.

The next statement in the program is called a for loop. It is similar to a while loop except that it adds some extra control. After the for, there is a “control expression” inside of parentheses, just like the while loop. However, this control expression is in three parts: a part which initializes, one that tests to see if we should exit the loop, and one that changes something, typically to step through a sequence of items. This program shows the for loop in the way you’ll see it most commonly used: the initialization part int i = 0 creates an integer i to use as a loop counter and gives it an initial value of zero. The testing portion says that to stay in the loop, i should be less than the number of elements in the vector v. (This is produced using the member function size( ), which I just sort of slipped in here, but you must admit it has a fairly obvious meaning.) The final portion uses a shorthand for C and C++, the “auto-increment” operator, to add one to the value of i. Effectively, i++ says “get the value of i, add one to it, and put the result back into i.” Thus, the total effect of the for loop is to take a variable i and march it through the values from zero to one less than the size of the vector. For each value of i, the cout statement is executed and this builds a line that consists of the value of i (magically converted to a character array by cout), a colon and a space, the line from the file, and a newline provided by endl. When you compile and run it you’ll see the effect is to add line numbers to the file.

Because of the way that the ‘>>’ operator works with iostreams, you can easily modify Fillvector.cpp so that it breaks up the input into whitespace-separated words instead of lines:

//: C02:GetWords.cpp
// Break a file into whitespace-separated words
#include <string>
#include <iostream>
#include <fstream>
#include <vector>
using namespace std;

int main() {
  vector<string> words;
  ifstream in("GetWords.cpp");
  string word;
  while(in >> word)
    words.push_back(word); 
  for(int i = 0; i < words.size(); i++)
    cout << words[i] << endl;
} ///:~

The expression

while(in >> word)

is what gets the input one “word” at a time, and when this expression evaluates to “false” it means the end of the file has been reached. Of course, delimiting words by whitespace is quite crude, but it makes for a simple example. Later in the book you’ll see more sophisticated examples that let you break up input just about any way you’d like.

To demonstrate how easy it is to use a vector with any type, here’s an example that creates a vector<int>:

//: C02:Intvector.cpp
// Creating a vector that holds integers
#include <iostream>
#include <vector>
using namespace std;

int main() {
  vector<int> v;
  for(int i = 0; i < 10; i++)
    v.push_back(i);
  for(int i = 0; i < v.size(); i++)
    cout << v[i] << ", ";
  cout << endl;
  for(int i = 0; i < v.size(); i++)
    v[i] = v[i] * 10; // Assignment  
  for(int i = 0; i < v.size(); i++)
    cout << v[i] << ", ";
  cout << endl;
} ///:~

To create a vector that holds a different type, you just put that type in as the template argument (the argument in angle brackets). Templates and well-designed template libraries are intended to be exactly this easy to use.

This example goes on to demonstrate another essential feature of vector. In the expression

v[i] = v[i] * 10;

you can see that the vector is not limited to only putting things in and getting things out. You also have the ability to assign to (and thus to change) any element of a vector, also through the use of the square-brackets indexing operator. This means that vector is a general-purpose, flexible “scratchpad” for working with collections of objects, and we will definitely make use of it in coming chapters.

Summary

The intent of this chapter is to show you how easy object-oriented programming can be – if someone else has gone to the work of defining the objects for you. In that case, you include a header file, create the objects, and send messages to them. If the types you are using are powerful and well-designed, then you won’t have to do much work and your resulting program will also be powerful.

In the process of showing the ease of OOP when using library classes, this chapter also introduced some of the most basic and useful types in the Standard C++ library: the family of iostreams (in particular, those that read from and write to the console and files), the string class, and the vector template. You’ve seen how straightforward it is to use these and can now probably imagine many things you can accomplish with them, but there’s actually a lot more that they’re capable of[29]. Even though we’ll only be using a limited subset of the functionality of these tools in the early part of the book, they nonetheless provide a large step up from the primitiveness of learning a low-level language like C. And while learning the low-level aspects of C is educational, it’s also time consuming. In the end, you’ll be much more productive if you’ve got objects to manage the low-level issues. After all, the whole point of OOP is to hide the details so you can “paint with a bigger brush.”

However, as high-level as OOP tries to be, there are some fundamental aspects of C that you can’t avoid knowing, and these will be covered in the next chapter.

Exercises

Solutions to selected exercises can be found in the electronic document The Thinking in C++ Annotated Solution Guide, available for a small fee from http://www.BruceEckel.com

1. Modify Hello.cpp so that it prints out your name and age (or shoe size, or your dog’s age, if that makes you feel better). Compile and run the program.
2. Using Stream2.cpp and Numconv.cpp as guidelines, create a program that asks for the radius of a circle and prints the area of that circle. You can just use the ‘*’ operator to square the radius. Do not try to print out the value as octal or hex (these only work with integral types).
3. Create a program that opens a file and counts the whitespace-separated words in that file.
4. Create a program that counts the occurrence of a particular word in a file (use the string class’ operator ‘==’ to find the word).
5. Change Fillvector.cpp so that it prints the lines (backwards) from last to first.
6. Change Fillvector.cpp so that it concatenates all the elements in the vector into a single string before printing it out, but don’t try to add line numbering.
7. Display a file a line at a time, waiting for the user to press the “Enter” key after each line.
8. Create a vector<float> and put 25 floating-point numbers into it using a for loop. Display the vector.
9. Create three vector<float> objects and fill the first two as in the previous exercise. Write a for loop that adds each corresponding element in the first two vectors and puts the result in the corresponding element of the third vector. Display all three vectors.
10. Create a vector<float> and put 25 numbers into it as in the previous exercises. Now square each number and put the result back into the same location in the vector. Display the vector before and after the multiplication.

[25] The boundary between compilers and interpreters can tend to become a bit fuzzy, especially with Python, which has many of the features and power of a compiled language but the quick turnaround of an interpreted language.

[26] Python is again an exception, since it also provides separate compilation.

[27] I would recommend using Perl or Python to automate this task as part of your library-packaging process (see www.Perl.org or www.Python.org).

[28] There are actually a number of variants of getline( ), which will be discussed thoroughly in the iostreams chapter in Volume 2.

[29] If you’re particularly eager to see all the things that can be done with these and other Standard library components, see Volume 2 of this book at www.BruceEckel.com, and also www.dinkumware.com.

Last Update:09/27/2001