Three common code sets are:
ASCII (used in UNIX and DOS/Windows-based computers)
EBCDIC (for IBM System 390 mainframes)
Unicode (for Windows NT and recent browsers)
The ASCII code set uses 7 bits per character, allowing 128 different characters. This
is enough for the alphabet in upper case and lower case, the symbols on a regular
English typewriter, and some codes reserved for control purposes, like marking the end of a line. An extended ASCII code set
uses 8 bits per character, which adds another 128 possible characters. This larger code
set allows for symbols from other languages and several graphical symbols.
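The arithmetic behind those numbers can be checked with a few lines of Python. This is only a small sketch: the names `code_count` and `fits_in_ascii` are made up for this illustration, while `ord()` and `format()` are standard Python built-ins.

```python
# Each extra bit doubles the number of different codes available.
def code_count(bits: int) -> int:
    return 2 ** bits

ascii_codes = code_count(7)       # 128 codes with 7 bits
extended_codes = code_count(8)    # 256 codes with 8 bits

# ord() gives the numeric code of a character.
code_for_A = ord("A")                          # 65
seven_bit_pattern = format(code_for_A, "07b")  # '1000001'

# Hypothetical helper: True if every character fits in 7-bit ASCII.
def fits_in_ascii(text: str) -> bool:
    return all(ord(ch) < ascii_codes for ch in text)
```

For example, `fits_in_ascii("Hello")` is true, but a word containing an accented letter fails the check because its code is 128 or higher.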
ASCII has been superseded by other coding schemes in modern
computing. It is still used for transferring plain text data between
different programs or computers that use different coding schemes.
If you're curious to see the table of ASCII and EBCDIC codes, see Character Codes.
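To get a feel for why text has to be translated when it moves between coding schemes, here is a rough sketch in Python. It assumes Python's built-in `cp037` codec, which is just one of several EBCDIC code pages, as a stand-in for EBCDIC in general. The function name `ebcdic_to_ascii` is invented for this example.

```python
letter = "A"

# The same letter gets a different number in each code set.
ascii_byte = letter.encode("ascii")    # code 0x41 in ASCII
ebcdic_byte = letter.encode("cp037")   # code 0xC1 in this EBCDIC code page

# Hypothetical helper: translate EBCDIC bytes into ASCII bytes.
# Assumption: the data uses code page 037 and contains only
# characters that also exist in ASCII.
def ebcdic_to_ascii(data: bytes) -> bytes:
    return data.decode("cp037").encode("ascii")

mainframe_data = "Hello".encode("cp037")
plain_text = ebcdic_to_ascii(mainframe_data)
```

After the translation, `plain_text` holds the ASCII bytes for "Hello", which a program expecting ASCII can read directly.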
Unicode, as originally designed, uses 16 bits per character,
so it takes twice the storage space that 8-bit extended ASCII coding
would take for the same characters. But Unicode can handle many more characters. The goal of Unicode is to represent every element used in every script for writing every language on the planet. Whew! Quite a task!
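The storage difference is easy to see in a short Python sketch. It assumes Python's `utf-16-be` codec as a stand-in for a 16-bit Unicode encoding. For the characters used here it writes exactly two bytes per character. The variable names are made up for this example.

```python
text = "Hello"

# ASCII stores each of these characters in one byte.
ascii_bytes = text.encode("ascii")

# Assumption: "utf-16-be" plays the role of 16-bit Unicode here.
# It uses two bytes for each of these characters and adds no extra marker.
unicode_bytes = text.encode("utf-16-be")

ascii_size = len(ascii_bytes)       # 5 bytes
unicode_size = len(unicode_bytes)   # 10 bytes

# A Greek capital omega has no ASCII code, but it fits in 16 bits.
omega = "\u03a9"
omega_code = ord(omega)                   # 937
omega_bytes = omega.encode("utf-16-be")   # 2 bytes
```

The five-letter word needs 5 bytes in ASCII and 10 bytes in the 16-bit encoding, which is the "twice the storage" trade-off in exchange for being able to store characters like the omega.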
Version 3 of Unicode has 49,194 characters instead of the wimpy few hundred for ASCII and EBCDIC. All of the current major languages in the world can be written with Unicode, including their special punctuation, and it also has symbols for math and geometry.
At the Unicode site you can view sections of the Unicode code charts. The complete list is too long to put on one page!