
Lesson 1.4: Identifiers



Up to this point, we have looked at a single program that, when compiled and run, prints Hello world! to the console, and we have described strings: arbitrary sequences of characters surrounded by double quotes that store text or messages to be printed to the screen.

In the Hello world! program, however, there are other names, that is, sequences of alphabetic characters that seem to carry some form of information, including:

int, main, std, cout, endl, and return.

For now, we will ignore the rather peculiar first line that is prefixed by a pound symbol #.

These names are called identifiers: they are used by the programmer to refer to variables, functions, namespaces, and many other entities. An identifier always starts with either an alphabetic character or an underscore (_), followed by an arbitrary number of alphanumeric characters or underscores.

Definition: alphanumeric characters
Any letter of the alphabet (A, B, ..., Z, a, b, ..., z) or a numeric digit (0, 1, ..., 9). That is, [A-Za-z0-9].

A very small set of identifiers is so important to the C++ programming language that they are explicitly reserved by the language for one purpose, and one purpose only. These are said to be keywords. The two keywords in our program so far are int and return:

  1. The first, int, is a way of saying integer, and in this case, the function main is declared to return an integer.
  2. Related to this, the very last line of the block of statements associated with the main function is a line that says return 0;. This says that when the main function finishes executing, it returns the value 0 (which is an integer).

The regular expression representation of an identifier is

                 [_a-zA-Z][_a-zA-Z0-9]*
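
As a minimal sketch, the regular expression above can be tested with the standard <regex> library. The function name is_identifier is hypothetical and chosen only for this illustration. Note that the pattern describes only the shape of an identifier: keywords such as int also match it, even though they cannot be used as names.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns true if 'text' has the shape of an identifier,
// that is, if the entire string matches [_a-zA-Z][_a-zA-Z0-9]*.
bool is_identifier( std::string const &text ) {
    // The same pattern as given in the lesson
    static std::regex const pattern( "[_a-zA-Z][_a-zA-Z0-9]*" );

    // std::regex_match succeeds only if the whole string matches the pattern
    return std::regex_match( text, pattern );
}
```

With this sketch, is_identifier( "a0" ) and is_identifier( "_name" ) are true, while is_identifier( "0a" ) is false because an identifier cannot start with a digit.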

Two identifiers are equal if all the characters in the identifiers are exactly the same.

Identifiers are always case sensitive, meaning that if the case of a letter differs, then the identifiers are different, so a0 and A0 are two unique identifiers.
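
To illustrate case sensitivity, here is a short sketch in which a0 and A0 name two separate variables. The function name sum_of_distinct_names and the values used are hypothetical.

```cpp
// Hypothetical example: 'a0' and 'A0' differ only in the case of one letter,
// so they are two different identifiers naming two different variables.
int sum_of_distinct_names() {
    int a0{ 1 };    // lowercase 'a'
    int A0{ 10 };   // uppercase 'A'

    A0 = 20;        // changes A0 only; a0 is still 1

    return a0 + A0; // 1 + 20
}
```

Because the two variables are independent, assigning to A0 leaves a0 unchanged and the function returns 21.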

Naming conventions

Identifiers are often created by stringing together words that describe what the identifier is supposed to represent. There are different ways of stringing words together, and different projects or teams will use different approaches:

Description          snake-case       camel-case                       juxtaposition
linked list          linked_list      LinkedList or linkedList         linkedlist
is an array sorted   is_sorted        IsSorted or isSorted             issorted
array capacity       array_capacity   ArrayCapacity or arrayCapacity   arraycapacity

Each of these is described as a naming convention, and different employers will choose different naming conventions. As an employee, it is your responsibility to adopt the naming convention of the company for which you are working.
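
As a small sketch, the three conventions can be compared side by side in code. The three functions below are hypothetical, as is the value 16; they exist only to show that each spelling is a valid, and different, identifier.

```cpp
// Snake case: words are separated by underscores.
int array_capacity() {
    return 16;
}

// Camel case: each word after the first starts with a capital letter.
int arrayCapacity() {
    return 16;
}

// Juxtaposition: the words are simply run together.
int arraycapacity() {
    return 16;
}
```

All three functions can exist in the same program because their names differ, but in practice a project would pick one convention and use it consistently.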

Reserved identifiers

In order to create a functioning compiler, it is sometimes necessary for the compiler to use identifiers behind the scenes. As different compilers are produced by different companies or groups, a convention has been adopted for identifiers reserved for use by the compiler:

  1. any identifier starting with an underscore, and
  2. any identifier with two adjacent underscores.

Thus, _name and ECE__150 are reserved identifiers. You can use these identifiers, and your code may still compile, but if there is an update to the compiler or if you attempt to compile your code on a different compiler, your program may fail and it may be very difficult to determine the cause of the failure.

General rule: reserved identifiers
Never start an identifier with an underscore, and never have two adjacent underscores in an identifier.
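
The general rule above can be expressed as a short check. This is a sketch of the rule exactly as stated in this lesson; the function name is_reserved is hypothetical.

```cpp
#include <string>

// Hypothetical helper: returns true if 'name' breaks the general rule above,
// that is, if it starts with an underscore or has two adjacent underscores.
bool is_reserved( std::string const &name ) {
    bool const starts_with_underscore{ !name.empty() && (name[0] == '_') };
    bool const has_adjacent_underscores{ name.find( "__" ) != std::string::npos };

    return starts_with_underscore || has_adjacent_underscores;
}
```

With this sketch, is_reserved( "_name" ) and is_reserved( "ECE__150" ) are both true, while is_reserved( "linked_list" ) is false, since a single underscore in the middle of an identifier is acceptable.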

Keywords

There are some identifiers that are used by the programming language as part of the language description or grammar. These identifiers can never be used for any purpose other than the one specified by the programming language. In general, most programming languages attempt to minimize the number of keywords.

To date, we have seen two keywords: int and return. The first is used to specify the type that is returned (in this case, main() is a function that returns an integer), and the second is used to specify the exiting of a function and what is being returned (in the case we have looked at, 0).

You can see a complete list of keywords, but these are the keywords that are also English words:

and          auto         break        case         catch        class        concept
continue     default      delete       do           double       else         explicit
export       false        for          friend       if           long         mutable
new          not          operator     or           private      protected    public
register     requires     return       short        signed       static       switch
synchronized template     this         throw        true         try          union
unsigned     using        virtual      void         volatile     while

If you find yourself using one of these keywords as an identifier, your code will not compile, but it may be difficult to determine what exactly the problem is.
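
As a sketch of how this can come up, the function below needs a name for a new value. The obvious choice, new, is a keyword, so a slightly different name is used instead. The function name doubled and the variable name new_value are hypothetical.

```cpp
// Hypothetical example: returns twice its argument.
// This function could not be named 'double', because 'double' is a keyword.
int doubled( int value ) {
    // int new{ 2*value };    // would not compile: 'new' is a keyword
    int new_value{ 2*value }; // a keyword may appear as part of a longer identifier

    return new_value;
}
```

If the commented-out line is restored, the compiler reports an error on that line, although the message may not mention that new is a keyword.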

Additional keywords that we may or will see in this course include:

bool         char         const        float        goto         inline
int          namespace    nullptr      sizeof       struct       typename

Important reminder
Do not attempt to memorize all of these keywords: you would be wasting your time. Instead, as we go through the course, you will learn those keywords that are relevant to this course. Neither course author has memorized all keywords. Over time, you will learn the most relevant keywords in the context that they are used.

A comment on main: The identifier main is not a keyword, but it is an identifier that has special significance in the C++ programming language. If a source file contains a function declared as int main(), then compiling that file results in an executable, and when that executable is run, execution starts with the main function. Later we will see that we can define other functions, as well.

Review

In your own words, describe each of these concepts:

Questions and practice:

1. What error do you get if you misspell return?

2. What happens if you misspell the function name main, say with min?

3. If you misspelled include, you get a cascade of subsequent errors. If there are multiple errors, always try to fix the first error first; this may eliminate subsequent errors.

4. If you write int main(] { instead of using a closing parenthesis, what are the errors? How is the error worded?

Solutions.

