Up to this point, we have looked at a single program that, once compiled and run, prints Hello world! to the console, and we have described strings: arbitrary sequences of characters surrounded by double quotes that store text or messages to be printed to the screen.
In the Hello world! program, however, there are other names, that is, sequences of alphabetic characters that seem to carry some form of information, including:
int, main, std, cout, endl, and return.
For now, we will ignore the rather peculiar first line that is prefixed by a pound symbol #.
These names are called identifiers. Programmers use identifiers to name variables, functions, namespaces, and many other entities. An identifier always starts with either an alphabetic character or an underscore (_), followed by any number of alphanumeric characters or underscores.
Definition: alphanumeric characters
Any letter of the alphabet (A, B, ..., Z, a, b, ..., z) or a numeric digit (0, 1, ..., 9). That is, [A-Za-z0-9].
A very small set of identifiers is so important to the C++ programming language that they are explicitly reserved by the language for one purpose, and one purpose only. These are called keywords. The two keywords in our function so far are int and return.
The regular expression representation of an identifier is
[_a-zA-Z][_a-zA-Z0-9]*
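As a rough illustration, the pattern above can be tried out with the standard library's regular expression support. This is only a sketch: the function name matches_identifier_pattern is hypothetical, and it checks the spelling rule alone, so it will also accept keywords such as return and reserved names such as _name.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns true if the whole of 'text' matches
// the pattern [_a-zA-Z][_a-zA-Z0-9]*.
// It does not reject keywords or reserved identifiers.
bool matches_identifier_pattern( std::string const &text ) {
    static std::regex const pattern{ "[_a-zA-Z][_a-zA-Z0-9]*" };

    // regex_match succeeds only if the entire string matches
    return std::regex_match( text, pattern );
}
```

With this helper, array_capacity and _name match the pattern, while 0a (starts with a digit), is-sorted (contains a hyphen) and the empty string do not.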
Two identifiers are equal if all the characters in the identifiers are exactly the same.
Identifiers are always case sensitive, meaning that if the case of any letter differs, then the identifiers are different, so a0 and A0 are two different identifiers.
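A minimal sketch of case sensitivity, assuming a hypothetical function named sum_of_two_identifiers: because a0 and A0 are different identifiers, they can name two separate variables in the same function without any conflict.

```cpp
// Hypothetical example: a0 and A0 differ only in case,
// yet they refer to two separate variables.
int sum_of_two_identifiers() {
    int a0 = 3;
    int A0 = 4;

    return a0 + A0;   // 3 + 4
}
```

If the two names were the same identifier, the second declaration would be rejected by the compiler as a redeclaration.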
Identifiers are often created by stringing together words that describe what the identifier is supposed to represent. There are different ways of stringing words together, and different projects or teams will use different approaches:
Description | snake-case | camel-case | juxtaposition |
---|---|---|---|
linked list | linked_list | LinkedList or linkedList | linkedlist |
is an array sorted | is_sorted | IsSorted or isSorted | issorted |
array capacity | array_capacity | ArrayCapacity or arrayCapacity | arraycapacity |
Each of these is a naming convention, and different employers will choose different naming conventions. As an employee, it is your responsibility to adopt the naming convention of the company for whom you are working.
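To make the table more concrete, here is a sketch in which the same idea is named four ways. The functions are hypothetical and simply report whether two integers are in non-decreasing order. Because the four names differ in spelling or case, they are four different identifiers and can all exist in one file; a real project would pick one convention and use it throughout.

```cpp
// Snake case: words separated by underscores
bool is_sorted( int first, int second ) {
    return first <= second;
}

// Camel case, first letter lower case
bool isSorted( int first, int second ) {
    return first <= second;
}

// Camel case, first letter upper case
bool IsSorted( int first, int second ) {
    return first <= second;
}

// Juxtaposition: words run together
bool issorted( int first, int second ) {
    return first <= second;
}
```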
In order to create a functioning compiler, it is sometimes necessary for the compiler to use identifiers behind the scenes. As different compilers are produced by different companies or groups, a convention has been adopted for identifiers reserved for use by the compiler: any identifier that begins with an underscore or that contains two adjacent underscores is reserved.
Thus, _name and ECE__150 are reserved identifiers. You can use these identifiers, and your code may still compile, but if there is an update to the compiler or if you attempt to compile your code on a different compiler, your program may fail and it may be very difficult to determine the cause of the failure.
General rule: reserved identifiers
Never start an identifier with an underscore, and never have two
adjacent underscores in an identifier.
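The general rule can be written down as a short check. This is a sketch of the simplified rule as stated here, not of everything a compiler reserves, and the function name is_reserved_by_convention is hypothetical.

```cpp
#include <string>

// Hypothetical helper: returns true if 'name' breaks the general rule,
// that is, if it starts with an underscore or contains two adjacent
// underscores.
bool is_reserved_by_convention( std::string const &name ) {
    bool starts_with_underscore = !name.empty() && (name[0] == '_');
    bool has_adjacent_underscores = (name.find( "__" ) != std::string::npos);

    return starts_with_underscore || has_adjacent_underscores;
}
```

Under this check, _name and ECE__150 are flagged, while ECE_150 and array_capacity are not.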
There are some identifiers that are used by the programming language as part of the language description or grammar. These identifiers can never be used for any purpose other than the one specified by the programming language. In general, most programming languages attempt to minimize the number of keywords.
To date, we have seen two keywords: int and return. The first is used to specify the type that is returned (in this case, main() is a function that returns an integer), and the second is used to specify the exiting of a function and what is being returned (in the case we have looked at, 0).
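The same two keywords play the same roles in any function that returns an integer. As a small sketch, assuming a hypothetical function named course_number (functions other than main are covered later):

```cpp
// 'int' states that this function returns an integer;
// 'return' exits the function and gives back the value 150.
int course_number() {
    return 150;
}
```

Here int and return are keywords, while course_number is an identifier chosen by the programmer.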
You can see a complete list of keywords, but these are the keywords that are also English words:
and auto break case catch class concept continue default delete do double else explicit export false for friend if long mutable new not operator or private protected public register requires return short signed static switch synchronized template this throw true try union unsigned using virtual void volatile while
If you find yourself using one of these keywords as an identifier, your code will not compile, but it may be difficult to determine what exactly the problem is.
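For instance, class and new are tempting names when counting students. The sketch below keeps the problematic declarations as comments so that the rest compiles; the function and variable names are hypothetical.

```cpp
// int class = 150;   // would not compile: 'class' is a keyword
// int new = 3;       // would not compile: 'new' is a keyword

// Descriptive identifiers that are not keywords work as expected.
int total_students() {
    int class_size = 150;
    int new_students = 3;

    return class_size + new_students;
}
```

A keyword may still appear as part of a longer identifier, as in class_size, because identifiers are compared in their entirety.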
Additional keywords that we may or will see in this course include:
bool char const float goto inline int namespace nullptr sizeof struct typename
Important reminder
Do not attempt to memorize all of these keywords: you would be wasting your time.
Instead, as we go through the course, you will learn those keywords that are
relevant to this course. Neither course author has memorized all keywords.
Over time, you will learn the most relevant keywords in the context
that they are used.
A comment on main: The identifier main is not a keyword, but it is an identifier that has special significance in the C++ programming language. If a source file contains a function declared as int main(), then compiling that file results in an executable, and when that executable is run, execution starts at the main function. Later we will see that we can define other functions as well.
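One way to see that main is an identifier rather than a keyword is that the name can be reused inside a namespace. The sketch below assumes a hypothetical namespace called demo; defining your own namespaces is a later topic, so treat this as an illustration only.

```cpp
namespace demo {
    // An ordinary function that happens to be named main.
    // It is not where the executable starts; only the main
    // function outside of any namespace has that role.
    int main() {
        return 42;
    }
}
```

The function is then referred to as demo::main, in the same way that cout is referred to as std::cout. Trying the same thing with a keyword, for example naming a function return, would not compile.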
In your own words, describe each of these concepts:
1. What error do you get if you misspell return?
2. What happens if you misspell the function name main, say with min?
3. If you misspell include, you get a cascade of subsequent errors. If there are multiple errors, always try to fix the first error first; this may eliminate subsequent errors.
4. If you write int main(] { instead of using a closing parenthesis, what are the errors? How is the error worded?