Introduction to Programming and C++


Almost every book on programming will give you a list of ASCII characters, and almost none of them will bother giving you any clue as to why. Hopefully, a little bit of colour helps clear this up.

These days, the logical unit for a small amount of information is the byte: each byte stores 8 = 2^3 bits and is capable of storing a number between 0 and 255. In the 1960s, when space was at much more of a premium, it was recognized that most communication in English could be performed with 128 characters, which require only seven bits. Thus, the ASCII characters occupy the first 128 values of a byte (0 through 127). The remaining 128 values vary from one encoding to another, but often encode special symbols and accented letters.

We will break the ASCII characters into four groups:

  • 0-31 Control characters
  • 32-63 Printable characters (with embedded decimal digits)
  • 64-95 Printable characters (with embedded capital letters)
  • 96-127 Printable characters (with embedded lower-case letters)

Special room is made for each of three sets: the decimal digits 0-9, the capital letters A-Z, and the lower-case letters a-z. All other symbols fill in the gaps around these. The characters on the keyboard today are a direct consequence of those characters which can be printed using ASCII.
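
Because each of the four groups holds exactly 32 = 2^5 characters, the group a character belongs to can be read from the bits above the low five. The following is a minimal sketch of that idea; the function name ascii_group is hypothetical and chosen only for illustration, and the code assumes the argument holds a character code stored in one byte.

```cpp
// ascii_group is a hypothetical helper, not part of any library.
// Returns 0 for control characters (0-31), 1 for the group containing
// the decimal digits (32-63), 2 for the group containing the capital
// letters (64-95), and 3 for the group containing the lower-case
// letters (96-127). Returns -1 if the value is not in the ASCII range.
int ascii_group(unsigned char c) {
    if (c > 127) {
        return -1;   // outside the 128 ASCII values
    }

    return c >> 5;   // discard the low five bits; what is left is the group
}
```

On a system that uses ASCII, ascii_group('7') is 1, ascii_group('Q') is 2 and ascii_group('q') is 3.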

Each of these is examined in the next four sections, followed by a summary.

Control Characters (0-31)

The first 32 ASCII characters are control characters; that is, they are used to control printing devices. They include instructions to, for example, indicate the end of a line, signal a tab, or ring a bell.

If the first three bits are 000, the ASCII character is a control character.

Note: There is one more control character: delete is 0111 1111 or 7F.

Binary     Dec  Hex  Glyph
0000 0000    0  00   Null character (NUL)
0000 0001    1  01   Start of Header (SOH)
0000 0010    2  02   Start of Text (STX)
0000 0011    3  03   End of Text (ETX)
0000 0100    4  04   End of Transmission (EOT)
0000 0101    5  05   Enquiry (ENQ)
0000 0110    6  06   Acknowledgment (ACK)
0000 0111    7  07   Bell (BEL)
0000 1000    8  08   Backspace (BS)
0000 1001    9  09   Horizontal Tab (HT)
0000 1010   10  0A   Line feed (LF)
0000 1011   11  0B   Vertical Tab (VT)
0000 1100   12  0C   Form feed (FF)
0000 1101   13  0D   Carriage return (CR)
0000 1110   14  0E   Shift Out (SO)
0000 1111   15  0F   Shift In (SI)
0001 0000   16  10   Data Link Escape (DLE)
0001 0001   17  11   Device Control 1 (DC1)
0001 0010   18  12   Device Control 2 (DC2)
0001 0011   19  13   Device Control 3 (DC3)
0001 0100   20  14   Device Control 4 (DC4)
0001 0101   21  15   Negative Acknowledgement (NAK)
0001 0110   22  16   Synchronous Idle (SYN)
0001 0111   23  17   End of Transmission Block (ETB)
0001 1000   24  18   Cancel (CAN)
0001 1001   25  19   End of Medium (EM)
0001 1010   26  1A   Substitute (SUB)
0001 1011   27  1B   Escape (ESC)
0001 1100   28  1C   File Separator (FS)
0001 1101   29  1D   Group Separator (GS)
0001 1110   30  1E   Record Separator (RS)
0001 1111   31  1F   Unit Separator (US)
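
The test for a control character described in this section (leading three bits 000, plus the one exception for delete) could be sketched as follows. The name is_ascii_control is hypothetical and used only for illustration, and the code assumes the argument holds an ASCII code stored in an 8-bit byte.

```cpp
// is_ascii_control is a hypothetical helper, not part of any library.
bool is_ascii_control(unsigned char c) {
    const unsigned char leading_three_bits = 0xE0;   // binary 1110 0000

    return (c & leading_three_bits) == 0   // 0000 0000 through 0001 1111
        || c == 0x7F;                      // delete, 0111 1111
}
```

In practice, the standard library function std::iscntrl from <cctype> answers the same question; the sketch is only meant to make the bit pattern visible.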



Printable Characters I (32-63)

All but one of the remaining characters are printable. Embedded within the first 32 of these are the decimal digits 0 through 9. The codes for these digits begin with the four bits 0011; if a character is a decimal digit (0011 0000 through 0011 1001), it can be converted to the integer it represents by replacing the leading 0011 with 0000.

Binary     Dec  Hex  Glyph
0010 0000   32  20   (blank)
0010 0001   33  21   !
0010 0010   34  22   "
0010 0011   35  23   #
0010 0100   36  24   $
0010 0101   37  25   %
0010 0110   38  26   &
0010 0111   39  27   '
0010 1000   40  28   (
0010 1001   41  29   )
0010 1010   42  2A   *
0010 1011   43  2B   +
0010 1100   44  2C   ,
0010 1101   45  2D   -
0010 1110   46  2E   .
0010 1111   47  2F   /
0011 0000   48  30   0
0011 0001   49  31   1
0011 0010   50  32   2
0011 0011   51  33   3
0011 0100   52  34   4
0011 0101   53  35   5
0011 0110   54  36   6
0011 0111   55  37   7
0011 1000   56  38   8
0011 1001   57  39   9
0011 1010   58  3A   :
0011 1011   59  3B   ;
0011 1100   60  3C   <
0011 1101   61  3D   =
0011 1110   62  3E   >
0011 1111   63  3F   ?
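
The conversion from a digit character to an integer described at the start of this section amounts to clearing the leading four bits. A minimal sketch, assuming the argument holds an ASCII code; the name digit_value is hypothetical and used only for illustration:

```cpp
// digit_value is a hypothetical helper, not part of any library.
// Returns 0 through 9 for the characters '0' through '9',
// and -1 for any other character.
int digit_value(unsigned char c) {
    if (c < 0x30 || c > 0x39) {
        return -1;      // not 0011 0000 through 0011 1001
    }

    return c & 0x0F;    // replace the leading 0011 with 0000
}
```

For example, the character 7 is stored as 0011 0111; masking with 0000 1111 leaves 0000 0111, which is the integer 7.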



Printable Characters II (64-95)

Embedded within the next 32 characters are the capital letters A through Z. These letters begin with the three bits 010; if a character is a capital letter (0100 0001 through 0101 1010), it can be converted into the integer representing the position of that letter in the alphabet by replacing the leading 010 with 000.

For example, you will note that W is encoded as 0101 0111: its last five bits are 10111 in binary, which is 23 in decimal, and W is the 23rd letter in the alphabet.

Binary     Dec  Hex  Glyph
0100 0000   64  40   @
0100 0001   65  41   A
0100 0010   66  42   B
0100 0011   67  43   C
0100 0100   68  44   D
0100 0101   69  45   E
0100 0110   70  46   F
0100 0111   71  47   G
0100 1000   72  48   H
0100 1001   73  49   I
0100 1010   74  4A   J
0100 1011   75  4B   K
0100 1100   76  4C   L
0100 1101   77  4D   M
0100 1110   78  4E   N
0100 1111   79  4F   O
0101 0000   80  50   P
0101 0001   81  51   Q
0101 0010   82  52   R
0101 0011   83  53   S
0101 0100   84  54   T
0101 0101   85  55   U
0101 0110   86  56   V
0101 0111   87  57   W
0101 1000   88  58   X
0101 1001   89  59   Y
0101 1010   90  5A   Z
0101 1011   91  5B   [
0101 1100   92  5C   \
0101 1101   93  5D   ]
0101 1110   94  5E   ^
0101 1111   95  5F   _
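
The position of a capital letter in the alphabet can be found in the same way, by keeping only the last five bits. A minimal sketch, assuming the argument holds an ASCII code; the name capital_position is hypothetical and used only for illustration:

```cpp
// capital_position is a hypothetical helper, not part of any library.
// Returns 1 through 26 for the characters 'A' through 'Z',
// and -1 for any other character.
int capital_position(unsigned char c) {
    if (c < 0x41 || c > 0x5A) {
        return -1;      // not 0100 0001 through 0101 1010
    }

    return c & 0x1F;    // replace the leading 010 with 000
}
```

With this sketch, capital_position('W') is 23 on a system that uses ASCII, and the neighbouring symbols @ and [ are rejected.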



Printable Characters III (96-127)

Embedded within the last 32 characters are the lower-case letters a through z. These letters begin with the three bits 011; if a character is a lower-case letter (0110 0001 through 0111 1010), it can be converted into the integer representing the position of that letter in the alphabet by replacing the leading 011 with 000.

For example, you will note that w is encoded as 0111 0111: its last five bits are 10111 in binary, which is 23 in decimal, and w is the 23rd letter in the alphabet.

Binary     Dec  Hex  Glyph
0110 0000   96  60   `
0110 0001   97  61   a
0110 0010   98  62   b
0110 0011   99  63   c
0110 0100  100  64   d
0110 0101  101  65   e
0110 0110  102  66   f
0110 0111  103  67   g
0110 1000  104  68   h
0110 1001  105  69   i
0110 1010  106  6A   j
0110 1011  107  6B   k
0110 1100  108  6C   l
0110 1101  109  6D   m
0110 1110  110  6E   n
0110 1111  111  6F   o
0111 0000  112  70   p
0111 0001  113  71   q
0111 0010  114  72   r
0111 0011  115  73   s
0111 0100  116  74   t
0111 0101  117  75   u
0111 0110  118  76   v
0111 0111  119  77   w
0111 1000  120  78   x
0111 1001  121  79   y
0111 1010  122  7A   z
0111 1011  123  7B   {
0111 1100  124  7C   |
0111 1101  125  7D   }
0111 1110  126  7E   ~
0111 1111  127  7F   Delete (DEL)
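
Since capital letters begin with 010 and lower-case letters begin with 011, a letter and its other case differ in exactly one bit and share the same last five bits. The sketch below illustrates both consequences; the names letter_position and to_capital are hypothetical, and the code assumes the argument holds an ASCII code.

```cpp
// letter_position and to_capital are hypothetical helpers,
// not part of any library.

// Returns 1 through 26 for a letter of either case, and -1 otherwise.
int letter_position(unsigned char c) {
    const bool capital    = (c >= 0x41 && c <= 0x5A);   // A through Z
    const bool lower_case = (c >= 0x61 && c <= 0x7A);   // a through z

    if (!capital && !lower_case) {
        return -1;
    }

    return c & 0x1F;    // keep the last five bits
}

// Converts a lower-case letter to the matching capital letter and
// returns any other character unchanged.
unsigned char to_capital(unsigned char c) {
    if (c < 0x61 || c > 0x7A) {
        return c;
    }

    // Clear the bit 0010 0000 so that the leading 011 becomes 010.
    return static_cast<unsigned char>(c & 0xDF);
}
```

Both w and W give the position 23, and to_capital applied to w (0111 0111) gives W (0101 0111). The standard library provides std::toupper in <cctype> for everyday use.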



Summary

The general positioning of the various control and printable characters is summarized in Figure 1.

Figure 1. The general layout of ASCII characters.

