ASCII

ASCII is a very well-known coding system. It is a 7-bit code, usually stored one character per 8-bit byte, and is used in computer systems to allow characters and symbols to be represented as numbers. At the fundamental level a computer deals only in lists of numbers.

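Computers operate on a binary system; in other words, they have only two digits, 'zero' and 'one'.

To count in binary, we begin as usual: 0, 1 ... and then we run out of digits. When counting in decimal we run out of digits at 9 and then move to the next column; in binary we do exactly the same thing after 1.

So we have: 0, 1, 10, 11, 100, 101, 110, 111, 1000. In the table below, the left-hand column is the number in decimal and the middle column is the same number in binary.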

 0   0000   0
 1   0001   1
 2   0010   2
 3   0011   3
 4   0100   4
 5   0101   5
 6   0110   6
 7   0111   7
 8   1000   8
 9   1001   9
10   1010   A
11   1011   B
12   1100   C
13   1101   D
14   1110   E
15   1111   F
16  10000  10

This all leads to the old joke that there are 10 kinds of people in the world: those who understand binary and those who don't.

Note that the columns aren't 'units', 'tens', 'hundreds', but are instead '1', '2', '4', '8', '16' etc., with each column worth twice the one to its right.

Using this we can decode a binary number: 01001010 is 1 lot of '64', 1 lot of '8' and 1 lot of '2'. So 01001010 is 64 + 8 + 2 = 74 in decimal.
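The column-by-column decoding can be sketched in a few lines of Python. This is only an illustration: the function name decode_binary is made up for this example, and Python's built-in int(text, 2) does the same job.

```python
def decode_binary(bits: str) -> int:
    """Add up the column weights (1, 2, 4, 8, ...) wherever a '1' appears."""
    total = 0
    weight = 1  # the rightmost column is worth 1
    for bit in reversed(bits):
        if bit == "1":
            total += weight
        weight *= 2  # each column to the left is worth twice as much
    return total


print(decode_binary("01001010"))  # 74
print(int("01001010", 2))         # 74, using the built-in conversion
```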

The right-hand column above is 'hexadecimal'. It's a counting system with 16 digits, 0 through to F; when we run out of digits we move to the next column. This is convenient when working with computers, as strings of bits can be broken up into lumps of 4 digits, and each lump becomes exactly one hex digit. So 01001010 becomes 0100 1010, and in hex this is written as 4A (which means 4 lots of 16, plus 10, since 'A' is 10 in decimal). Hex is most commonly seen today when specifying colours in webpages, or when using painting packages. To differentiate these systems, people often write binary like this: %10 (binary for '2') and hex like this: $10 (hex for '16').
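The 'lumps of 4' trick can be shown with a short Python sketch. The function name binary_to_hex is hypothetical, chosen just for this example; it pads the bit string to a multiple of 4 and translates each lump separately.

```python
def binary_to_hex(bits: str) -> str:
    """Convert a bit string to hex by translating each lump of 4 bits."""
    digits = "0123456789ABCDEF"
    # Pad with zeros on the left so the length is a multiple of 4.
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    lumps = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(digits[int(lump, 2)] for lump in lumps)


print(binary_to_hex("01001010"))  # 4A
print(int("4A", 16))              # 74, i.e. 4 lots of 16 plus 10
```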

You should be able to see that with 8 bits we can represent 256 different numbers (0 to 255 in decimal, $00 to $FF in hex).

The ASCII code assigns a different item to each number. ASCII only defines 0 through to 127 (%00000000 to %01111111), which leaves the remaining codes free for other purposes. Several 'extended ASCII' codes were used, and these include other characters such as 'é'.
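As a rough illustration, a code belongs to ASCII if it fits in the lower 7 bits. The sketch below uses a made-up helper, is_ascii_code, and picks Latin-1 (ISO 8859-1) as one example of an 8-bit extension; other extended sets place 'é' at different numbers.

```python
def is_ascii_code(value: int) -> bool:
    """True when the code is one of the 128 values that ASCII defines."""
    return 0 <= value <= 0x7F


print(is_ascii_code(ord("A")))  # True

# Assumption for this example: Latin-1 as the extended set.
# "\u00e9" is the character 'é'.
e_acute = "\u00e9".encode("latin-1")[0]
print(hex(e_acute), is_ascii_code(e_acute))  # 0xe9 False
```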

The ASCII chart below is laid out in hex. The number down the side is the first hex digit, and the number along the top is the second hex digit. The 'low' codes ($00 to $1F) are reserved for 'control' codes, as is $7F (DEL).

Thus $20 (32 in decimal) is a space, and 'A' is $41 (65 in decimal).

    0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
0  NUL SOH STX ETX EOT ENQ ACK BEL BS  HT  LF  VT  FF  CR  SO  SI
1  DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM  SUB ESC FS  GS  RS  US
2   SP  !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
3   0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?
4   @   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O
5   P   Q   R   S   T   U   V   W   X   Y   Z   [   \   ]   ^   _
6   `   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o
7   p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~ DEL
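Looking a character up in the chart amounts to splitting its code into two hex digits. A minimal Python sketch follows; chart_position is a name invented for this example, while ord and chr are Python built-ins that convert between characters and their codes.

```python
def chart_position(ch: str) -> tuple[str, str]:
    """Return the (row, column) hex digits of an ASCII character in the chart."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not an ASCII character")
    # The top 4 bits pick the row, the bottom 4 bits pick the column.
    return format(code >> 4, "X"), format(code & 0x0F, "X")


print(chart_position(" "))  # ('2', '0')
print(chart_position("A"))  # ('4', '1')
print(chr(0x41), ord("A"))  # A 65
```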

Some of the control codes are quite fun: $07 is a 'bell'. When I was at school we used Lynx computers with a single printer in the room. We discovered that sending an ASCII $07 to the printer would make it beep. We were soon using it to send Morse messages to each other. A primitive instant messaging system!

Note that whilst 'A' is $41, 'a' is $61. In binary, these are %01000001 and %01100001: the lower case and upper case characters differ by a single bit (the '32' column, $20). This is quite intentional.
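Because only one bit differs, switching case is a matter of flipping that bit. Here is a small Python sketch; CASE_BIT and toggle_case are names made up for this example, and it deliberately leaves anything that isn't an ASCII letter alone.

```python
CASE_BIT = 0x20  # %00100000, the one bit that differs between 'A' and 'a'


def toggle_case(ch: str) -> str:
    """Swap the case of a single ASCII letter by flipping one bit."""
    if not (ch.isascii() and ch.isalpha()):
        return ch  # digits, punctuation and non-ASCII are left untouched
    return chr(ord(ch) ^ CASE_BIT)


print(format(ord("A"), "08b"))             # 01000001
print(format(ord("a"), "08b"))             # 01100001
print(toggle_case("A"), toggle_case("a"))  # a A
```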

Of course, ASCII is rather limited, especially when we consider the many languages around the world. This is why so much work has gone into Unicode.