Almost every book on programming will give you a list of ASCII characters, and almost none of them will bother giving you any clue as to why. Hopefully, a little bit of colour helps clear this up.
These days, the logical unit for a small amount of information is the byte: each byte stores $8 = 2^3$ bits and is capable of storing a number between 0 and 255. In the 1960s, when space was at much more of a premium, it was recognized that most communication in English could be performed with 128 characters, which fit in seven bits. Thus, the ASCII characters occupy the first 128 values, 0 through 127. The remaining 128 values vary from one encoding to another, but often encode special symbols and accented letters.
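
As a small illustration of the seven-bit claim, the following Python sketch checks whether every byte in a sequence has its top bit clear, which is the case exactly when the value is below 128. The helper name `fits_in_seven_bits` is made up for this example.

```python
def fits_in_seven_bits(data: bytes) -> bool:
    """Return True if every byte has its most significant bit (0x80) clear."""
    return all((byte & 0x80) == 0 for byte in data)


# Encoding as ASCII stores each character in one byte.
encoded = "Hello, world!".encode("ascii")
print(fits_in_seven_bits(encoded))        # True: only values 0 through 127

# 0xE9 lies in the upper 128 values, outside of ASCII.
print(fits_in_seven_bits(bytes([0xE9])))  # False
```

The mask `0x80` is 1000 0000 in binary, so the test looks only at the eighth bit that ASCII leaves unused.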
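We will break the ASCII characters into four groups of 32 characters each: the control characters, the group containing the decimal digits, the group containing the capital letters, and the group containing the lower-case letters.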
Special room is made for each of three sets: the decimal digits 0-9, the capital letters A-Z, and the lower-case letters a-z. All other symbols fill in the gaps around these. The characters on the keyboard today are a direct consequence of those characters which can be printed using ASCII.
Each of these is examined in the next four sections, followed by a summary.
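
Because each group spans 32 consecutive codes, the group a character belongs to can be read from the two bits that follow the leading 0 of its byte. A minimal sketch, assuming Python; the names `GROUP_NAMES` and `ascii_group` are invented for this illustration and the group labels are informal.

```python
GROUP_NAMES = (
    "control characters",
    "punctuation and decimal digits",
    "capital letters and symbols",
    "lower-case letters and symbols",
)


def ascii_group(ch: str) -> int:
    """Return 0, 1, 2 or 3: which block of 32 codes the character falls in."""
    code = ord(ch)
    if code > 127:
        raise ValueError(f"{ch!r} is not an ASCII character")
    # Dropping the low five bits leaves the two bits that select the group.
    return code >> 5


for ch in ("\n", "7", "Q", "q"):
    print(repr(ch), "->", GROUP_NAMES[ascii_group(ch)])
```

Shifting right by five is the same as integer division by 32, which is why the four groups come out numbered 0 through 3.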
The first 32 ASCII characters are control characters; that is, they are used to control printing devices, with instructions such as indicating the end of a line, signalling a tab, or ringing a bell.
If the first three bits are 000, the ASCII character is a control character.
Note: There is one more control character: delete is 0111 1111 or 7F.
Binary | Dec | Hex | Character |
---|---|---|---|
0000 0000 | 0 | 00 | Null character (NUL) |
0000 0001 | 1 | 01 | Start of Header (SOH) |
0000 0010 | 2 | 02 | Start of Text (STX) |
0000 0011 | 3 | 03 | End of Text (ETX) |
0000 0100 | 4 | 04 | End of Transmission (EOT) |
0000 0101 | 5 | 05 | Enquiry (ENQ) |
0000 0110 | 6 | 06 | Acknowledgment (ACK) |
0000 0111 | 7 | 07 | Bell (BEL) |
0000 1000 | 8 | 08 | Backspace (BS) |
0000 1001 | 9 | 09 | Horizontal Tab (HT) |
0000 1010 | 10 | 0A | Line feed (LF) |
0000 1011 | 11 | 0B | Vertical Tab (VT) |
0000 1100 | 12 | 0C | Form feed (FF) |
0000 1101 | 13 | 0D | Carriage return (CR) |
0000 1110 | 14 | 0E | Shift Out (SO) |
0000 1111 | 15 | 0F | Shift In (SI) |
0001 0000 | 16 | 10 | Data Link Escape (DLE) |
0001 0001 | 17 | 11 | Device Control 1 (DC1) |
0001 0010 | 18 | 12 | Device Control 2 (DC2) |
0001 0011 | 19 | 13 | Device Control 3 (DC3) |
0001 0100 | 20 | 14 | Device Control 4 (DC4) |
0001 0101 | 21 | 15 | Negative Acknowledgement (NAK) |
0001 0110 | 22 | 16 | Synchronous Idle (SYN) |
0001 0111 | 23 | 17 | End of Transmission Block (ETB) |
0001 1000 | 24 | 18 | Cancel (CAN) |
0001 1001 | 25 | 19 | End of Medium (EM) |
0001 1010 | 26 | 1A | Substitute (SUB) |
0001 1011 | 27 | 1B | Escape (ESC) |
0001 1100 | 28 | 1C | File Separator (FS) |
0001 1101 | 29 | 1D | Group Separator (GS) |
0001 1110 | 30 | 1E | Record Separator (RS) |
0001 1111 | 31 | 1F | Unit Separator (US) |
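
The rule for control characters (leading bits 000, plus the single exception of delete at 7F) can be sketched in Python as follows. The function name `is_control` is an assumption made for this example.

```python
def is_control(code: int) -> bool:
    """Return True if the 7-bit ASCII code is a control character."""
    if not 0 <= code <= 0x7F:
        raise ValueError("expected an ASCII code between 0 and 127")
    # Codes 0 through 31 have 000 as their first three bits;
    # delete (0111 1111) is the one control character outside that range.
    return (code >> 5) == 0 or code == 0x7F


print(is_control(0x07))      # True: bell
print(is_control(ord("A")))  # False: printable
print(is_control(0x7F))      # True: delete
```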
All but one of the remaining characters are printable. Embedded within the first 32 of these are the decimal digits 0 through 9. These decimal digits begin with the four bits 0011 and if the character is a decimal digit (0011 0000 through 0011 1001), it can be converted to an integer by replacing the leading 0011 with 0000.
Binary | Dec | Hex | Glyph |
---|---|---|---|
0010 0000 | 32 | 20 | (blank) |
0010 0001 | 33 | 21 | ! |
0010 0010 | 34 | 22 | " |
0010 0011 | 35 | 23 | # |
0010 0100 | 36 | 24 | $ |
0010 0101 | 37 | 25 | % |
0010 0110 | 38 | 26 | & |
0010 0111 | 39 | 27 | ' |
0010 1000 | 40 | 28 | ( |
0010 1001 | 41 | 29 | ) |
0010 1010 | 42 | 2A | * |
0010 1011 | 43 | 2B | + |
0010 1100 | 44 | 2C | , |
0010 1101 | 45 | 2D | - |
0010 1110 | 46 | 2E | . |
0010 1111 | 47 | 2F | / |
0011 0000 | 48 | 30 | 0 |
0011 0001 | 49 | 31 | 1 |
0011 0010 | 50 | 32 | 2 |
0011 0011 | 51 | 33 | 3 |
0011 0100 | 52 | 34 | 4 |
0011 0101 | 53 | 35 | 5 |
0011 0110 | 54 | 36 | 6 |
0011 0111 | 55 | 37 | 7 |
0011 1000 | 56 | 38 | 8 |
0011 1001 | 57 | 39 | 9 |
0011 1010 | 58 | 3A | : |
0011 1011 | 59 | 3B | ; |
0011 1100 | 60 | 3C | < |
0011 1101 | 61 | 3D | = |
0011 1110 | 62 | 3E | > |
0011 1111 | 63 | 3F | ? |
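
The conversion from a digit character to its integer value, replacing the leading 0011 with 0000, amounts to keeping only the low four bits. A minimal Python sketch; the name `digit_value` is hypothetical.

```python
def digit_value(ch: str) -> int:
    """Convert one ASCII decimal digit to the integer it represents."""
    code = ord(ch)
    high, low = code >> 4, code & 0x0F
    # A decimal digit has 0011 as its high four bits and a low nibble of at most 9.
    if high != 0b0011 or low > 9:
        raise ValueError(f"{ch!r} is not a decimal digit")
    return low


print(digit_value("7"))  # 7, since '7' is 0011 0111
```

The check on the low nibble matters because the characters : ; < = > ? also begin with 0011 but are not digits.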
The next 32 characters contain the capital letters A through Z. These letters begin with the three bits 010, and if the character is a capital letter (0100 0001 through 0101 1010), it can be converted into the integer representing the position of that letter in the alphabet by replacing the leading 010 with 000.
For example, W is encoded as 0101 0111; its last five bits are $10111_2 = 23_{10}$, and W is the 23rd letter in the alphabet.
Binary | Dec | Hex | Glyph |
---|---|---|---|
0100 0000 | 64 | 40 | @ |
0100 0001 | 65 | 41 | A |
0100 0010 | 66 | 42 | B |
0100 0011 | 67 | 43 | C |
0100 0100 | 68 | 44 | D |
0100 0101 | 69 | 45 | E |
0100 0110 | 70 | 46 | F |
0100 0111 | 71 | 47 | G |
0100 1000 | 72 | 48 | H |
0100 1001 | 73 | 49 | I |
0100 1010 | 74 | 4A | J |
0100 1011 | 75 | 4B | K |
0100 1100 | 76 | 4C | L |
0100 1101 | 77 | 4D | M |
0100 1110 | 78 | 4E | N |
0100 1111 | 79 | 4F | O |
0101 0000 | 80 | 50 | P |
0101 0001 | 81 | 51 | Q |
0101 0010 | 82 | 52 | R |
0101 0011 | 83 | 53 | S |
0101 0100 | 84 | 54 | T |
0101 0101 | 85 | 55 | U |
0101 0110 | 86 | 56 | V |
0101 0111 | 87 | 57 | W |
0101 1000 | 88 | 58 | X |
0101 1001 | 89 | 59 | Y |
0101 1010 | 90 | 5A | Z |
0101 1011 | 91 | 5B | [ |
0101 1100 | 92 | 5C | \ |
0101 1101 | 93 | 5D | ] |
0101 1110 | 94 | 5E | ^ |
0101 1111 | 95 | 5F | _ |
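
Replacing the leading 010 with 000 is the same as masking off everything except the low five bits. The following Python sketch applies this to capital letters; the function name `capital_position` is an assumption for illustration.

```python
def capital_position(ch: str) -> int:
    """Return the position in the alphabet (1 to 26) of a capital ASCII letter."""
    code = ord(ch)
    if not 0x41 <= code <= 0x5A:  # 0100 0001 (A) through 0101 1010 (Z)
        raise ValueError(f"{ch!r} is not a capital letter")
    # Keep the low five bits, which clears the leading 010.
    return code & 0b0001_1111


print(capital_position("W"))  # 23
```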
The last 32 characters contain the lower-case letters a through z. These letters begin with the three bits 011, and if the character is a lower-case letter (0110 0001 through 0111 1010), it can be converted into the integer representing the position of that letter in the alphabet by replacing the leading 011 with 000.
For example, w is encoded as 0111 0111; its last five bits are $10111_2 = 23_{10}$, and w is the 23rd letter in the alphabet.
Binary | Dec | Hex | Glyph |
---|---|---|---|
0110 0000 | 96 | 60 | ` |
0110 0001 | 97 | 61 | a |
0110 0010 | 98 | 62 | b |
0110 0011 | 99 | 63 | c |
0110 0100 | 100 | 64 | d |
0110 0101 | 101 | 65 | e |
0110 0110 | 102 | 66 | f |
0110 0111 | 103 | 67 | g |
0110 1000 | 104 | 68 | h |
0110 1001 | 105 | 69 | i |
0110 1010 | 106 | 6A | j |
0110 1011 | 107 | 6B | k |
0110 1100 | 108 | 6C | l |
0110 1101 | 109 | 6D | m |
0110 1110 | 110 | 6E | n |
0110 1111 | 111 | 6F | o |
0111 0000 | 112 | 70 | p |
0111 0001 | 113 | 71 | q |
0111 0010 | 114 | 72 | r |
0111 0011 | 115 | 73 | s |
0111 0100 | 116 | 74 | t |
0111 0101 | 117 | 75 | u |
0111 0110 | 118 | 76 | v |
0111 0111 | 119 | 77 | w |
0111 1000 | 120 | 78 | x |
0111 1001 | 121 | 79 | y |
0111 1010 | 122 | 7A | z |
0111 1011 | 123 | 7B | { |
0111 1100 | 124 | 7C | \| |
0111 1101 | 125 | 7D | } |
0111 1110 | 126 | 7E | ~ |
0111 1111 | 127 | 7F | Delete (DEL) |
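
The same five-bit mask works for lower-case letters. Comparing the two letter tables also shows that a capital letter and its lower-case counterpart differ in exactly one bit, the one with value 0x20. A short Python sketch, with the invented names `lower_position` and `CASE_BIT`:

```python
CASE_BIT = 0b0010_0000  # the single bit that separates prefix 010 from 011


def lower_position(ch: str) -> int:
    """Return the position in the alphabet (1 to 26) of a lower-case ASCII letter."""
    code = ord(ch)
    if not 0x61 <= code <= 0x7A:  # 0110 0001 (a) through 0111 1010 (z)
        raise ValueError(f"{ch!r} is not a lower-case letter")
    # Keep the low five bits, which clears the leading 011.
    return code & 0b0001_1111


print(lower_position("w"))               # 23
print(ord("w") ^ ord("W") == CASE_BIT)   # True: only the case bit differs
```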
The general positioning of the various control and printable characters is summarized in Figure 1.
Figure 1. The general layout of ASCII characters.