Unicode
Unicode is a standard for representing characters as integers. Another character representation, the American Standard Code for Information Interchange (ASCII), is more commonly used. ASCII uses only seven bits for each character (usually stored in an eight-bit byte), whereas Unicode's basic encoding uses 16 bits for each character. This means that Unicode can represent more than 65,000 unique characters (2^16 = 65,536). By comparison, ASCII's capacity is only 128 characters (2^7). English and the Western European languages do not use the full character capability of Unicode. However, languages such as Greek, Chinese and Japanese cannot be represented in ASCII and need the larger range that Unicode provides.
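The idea of characters as integers can be illustrated with a short Python sketch. It is only an illustration, not part of the Unicode Standard: the sample characters and the variable names are chosen here for the example, and the 16-bit limit is the one described above.

```python
# Sample characters, written as escapes so the exact code points are unambiguous.
samples = [
    "A",       # Latin capital letter A
    "\u03a9",  # Greek capital letter omega
    "\u4e2d",  # a common Chinese character
    "\u3042",  # Japanese hiragana letter a
]

# ord() returns the integer (code point) assigned to a character.
code_points = {ch: ord(ch) for ch in samples}

for ch, cp in code_points.items():
    fits_ascii = cp < 2 ** 7     # ASCII covers 0..127
    fits_16_bits = cp < 2 ** 16  # 16 bits cover 0..65535
    print(f"{ch!r}: {cp} (U+{cp:04X}) ascii={fits_ascii} 16-bit={fits_16_bits}")
```

Only the letter A falls within the ASCII range; the Greek, Chinese and Japanese characters need the larger range, although all four fit in 16 bits.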
The Unicode standard was created in 1991 by a team of computer professionals, linguists and scholars. Since the first version (1.0), several further versions have been released (1.1, 2.0, 2.1, and 3.0). Version 4.0 of Unicode is expected within the next several years.
From its inception, Unicode has made provision for every character, punctuation mark, and symbol of every written language. There are currently over 29,000 unused codes, which leaves room for Unicode to expand to include new characters such as hieroglyphics.
Fundamentally, computers deal with numbers. They store letters and other characters by assigning a number to each one. Before Unicode was invented, there were hundreds of different encoding systems for assigning these numbers, because no single encoding, ASCII included, could contain enough characters. Even for English, no single encoding was adequate for all the letters, punctuation, and technical symbols in common use. Having many encoding systems also causes problems, because they conflict with one another: two encodings can use the same number for different characters, or different numbers for the same character.
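A small Python sketch shows how such a conflict looks in practice. The three encodings used here (Latin-1, Windows code page 1251 and ISO 8859-7) are picked only as examples of pre-Unicode encodings; the article itself does not name them.

```python
# One stored number (byte value 0xE9), read under three older encodings.
raw = bytes([0xE9])
encodings = ["latin-1", "cp1251", "iso8859_7"]

# Decode the same byte with each encoding and keep the resulting character.
decoded = {name: raw.decode(name) for name in encodings}

for name, ch in decoded.items():
    # Show the character and the single Unicode number that identifies it.
    print(f"{name}: {ch!r} U+{ord(ch):04X}")
```

The same number yields a Western European letter, a Cyrillic letter and a Greek letter, depending on which encoding the reader assumes. Under Unicode, each of those three characters has its own distinct number.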
Unicode provides a unique number for every character, independent of the operating system or programming language being used. The design of Unicode is currently controlled by two co-operating organizations. The first is the Unicode Consortium, a nonprofit special interest group founded in 1991 to promote Unicode. The Consortium is composed mainly of American software manufacturers with an interest in Unicode. The second is a sub-committee of the International Organization for Standardization and the International Electrotechnical Commission.
Unicode is growing in popularity. The Unicode Standard, which specifies the design of Unicode and any modifications to it, has been adopted by prominent companies such as Apple, Hewlett-Packard, International Business Machines, JustSystem, Microsoft, Oracle, SAP, Sun, and Sybase. Furthermore, Unicode is required by modern standards including XML, Java, CORBA, and WML. The emergence of the Unicode Standard, together with the creation of tools that support it, such as software for the Arabic, Russian, Hebrew, Japanese Kana and Korean scripts, is one of the most significant global software trends of recent times.
As the software industry becomes more globally oriented, the need for Unicode will continue to grow. Some analysts have predicted that Unicode will someday supplant ASCII as the standard character-coding format.