Information Theory
While researching how to transmit information more efficiently over noisy communications channels, the electrical engineer Claude Shannon published "A Mathematical Theory of Communication" in 1948, a paper that spawned two disciplines--information theory and coding theory. Shannon's paper captured the basic mathematical principles relevant to transmitting, receiving, and processing information via unspecified communications media, and it fundamentally redefined how communications engineers and specialists perceive information. At its essence, information theory combines elements of communications theory, probability, and statistics. Beyond providing a means to numerically measure the quantity of information to be transmitted, information theory also encompasses how the information is to be represented (that is, coded) during transmission, as well as the capacity of the communications system to transmit, receive, process, and store the information.
At its most basic level, a communications system consists of a message source (such as a telegraph, broadcast television station, or radio antenna), a device for encoding the message (which may be hardware, software, or a combination of both and which Shannon termed the transmitter), a communications channel over which the message is transmitted, a device for decoding the message (which generally performs the inverse function of the encoder and which Shannon termed the receiver), and a destination (the person or thing for whom the message is intended). The message can take nearly any form imaginable, such as a sequence of alphanumeric characters (a document or electronic mail, for example), a sequence of numbers (a bit stream such as 1 0 1 1 1 0 0 1, for example, or a credit card number), or even ultra-high frequency (UHF) radio waves. Communication channels that attenuate or distort the message during transmission are called noisy, and the amount of noise added to the transmitted message varies with the construction and design of the communications channel and, in some cases, with atmospheric conditions. Generally speaking, less noise results in fewer errors when the message is decoded, and information theory provides a means to mathematically evaluate how much noise a particular communications channel is adding to the transmitted message, as well as how much noise can be accepted before the message becomes garbled beyond recognition. It also provides a mechanism for determining the channel capacity, and hence the bandwidth, required for transmitting a particular message.
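The chain of source, transmitter, channel, receiver, and destination can be sketched in a few lines of Python. This is a minimal illustration, not Shannon's own formulation: the function names, the 8-bits-per-byte encoding, the sample message, and the noise model (each bit flipped independently with a fixed probability) are all assumptions chosen for simplicity.

```python
import random


def encode(text):
    """Transmitter: turn text into a list of bits, 8 per UTF-8 byte, high bit first."""
    return [(byte >> shift) & 1
            for byte in text.encode("utf-8")
            for shift in range(7, -1, -1)]


def noisy_channel(bits, flip_probability, rng):
    """Channel (assumed model): flip each bit independently with flip_probability."""
    return [bit ^ 1 if rng.random() < flip_probability else bit for bit in bits]


def decode(bits):
    """Receiver: inverse of encode; bytes that are no longer valid text become U+FFFD."""
    data = bytearray()
    for start in range(0, len(bits) - len(bits) % 8, 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | bit
        data.append(byte)
    return data.decode("utf-8", errors="replace")


def count_bit_errors(sent, received):
    """Number of positions where the received bit differs from the sent bit."""
    return sum(s != r for s, r in zip(sent, received))


if __name__ == "__main__":
    rng = random.Random(0)            # fixed seed so runs are repeatable
    message = "ARRIVING NOON TRAIN"   # hypothetical message from the source
    sent = encode(message)
    for flip_probability in (0.0, 0.01, 0.1):
        received = noisy_channel(sent, flip_probability, rng)
        print(flip_probability, count_bit_errors(sent, received), repr(decode(received)))
```

With a flip probability of zero the destination receives the message unchanged; as the probability rises, more bits are corrupted and the decoded text degrades.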
Information theory rests on the idea that, of the entire message to be transmitted, only the parts that provide new, or non-redundant, data are relevant. Telegraphers, for example, often omitted the words "the," "a," and "an" from messages. Including these articles was both expensive and redundant--the message recipient could easily determine where they should be re-inserted into the message. Omitting redundant articles, then, is a rudimentary form of data compression. Since only the data that is not redundant is relevant and thus transmitted, the information in a message can be measured in terms of the probability that the actual message is selected from the set of all possible messages once the redundancies are removed: the less probable the message, the more information it conveys.
In basic digital communication systems, information is transmitted in bits, a contraction of "binary digits." Under a binary number system, each digit has two possible values--zero or one--and if both are equally likely, each is assigned a probability of one-half. This means that for any given message of length one bit, the chance that a zero was transmitted is 50 percent and the chance that a one was transmitted is also 50 percent. More generally, if there are N equally likely possibilities, the number of bits needed to transmit the information equals the base 2 logarithm of N (that is, log2 N). If the N possibilities have unequal probabilities, then the number of bits associated with a particular message is the base 2 logarithm of the reciprocal of that message's probability: a message with probability pi carries log2 (1 / pi) bits. The expected value of the number of bits of information (that is, the number of bits of information, on average, that each message will hold) is known as the entropy. If the N possibilities have probabilities p1, p2,..., pN, for example, the entropy is p1 * log2 (1 / p1) + p2 * log2 (1 / p2) + ... + pN * log2 (1 / pN), which reduces to log2 N when all N possibilities are equally likely.
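As a worked illustration, the information in a single outcome of probability p is log2 (1 / p) bits, and the entropy is the average of that quantity weighted by each outcome's probability. The short Python sketch below computes both; the function names and the example distributions are illustrative assumptions, not part of Shannon's paper.

```python
import math


def self_information(p):
    """Bits of information in one outcome of probability p, i.e. log2(1 / p)."""
    if not 0 < p <= 1:
        raise ValueError("probability must be in the interval (0, 1]")
    return math.log2(1 / p)


def entropy(probabilities):
    """Expected bits per message: the sum of p * log2(1 / p) over all outcomes.

    Outcomes with probability zero contribute nothing and are skipped.
    """
    if not math.isclose(sum(probabilities), 1.0, rel_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    return sum(p * self_information(p) for p in probabilities if p > 0)


if __name__ == "__main__":
    print(entropy([0.5, 0.5]))                  # one fair binary choice
    print(entropy([0.25] * 4))                  # four equally likely messages
    print(entropy([0.5, 0.25, 0.125, 0.125]))   # unequal probabilities
```

For N equally likely messages the result equals log2 N, so a fair binary choice gives 1 bit and four equally likely messages give 2 bits; the unequal distribution in the last line averages 1.75 bits per message, less than the 2 bits four equally likely messages would need.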
One of the most useful results of information theory is Shannon's relationship between information capacity I (the maximum amount of information--or number of bits--that can be transmitted over a given communications channel per unit of time), bandwidth B (the range of frequencies the communications channel can carry, measured in hertz), and the signal-to-noise ratio (S / N) (the strength of the message relative to the amount of noise introduced by the communications channel, with both signal power and noise power often measured in watts). Shannon concluded that I = B * log2 [1 + (S / N)], where I is expressed in bits per second and S / N is a plain ratio of powers rather than a value in decibels. This relationship provides an upper bound for the information capacity of communications channels that is still used today.
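The capacity formula is simple to evaluate numerically. The sketch below assumes bandwidth in hertz and signal and noise power in the same units (watts, for instance); the function names, the decibel helper, and the example figures of 3,000 Hz and 30 dB are hypothetical values chosen only for illustration.

```python
import math


def channel_capacity(bandwidth_hz, signal_power, noise_power):
    """Upper bound on bits per second: I = B * log2(1 + S / N)."""
    if bandwidth_hz < 0 or signal_power < 0 or noise_power <= 0:
        raise ValueError("bandwidth and signal must be >= 0, and noise must be > 0")
    return bandwidth_hz * math.log2(1 + signal_power / noise_power)


def db_to_power_ratio(decibels):
    """Convert a signal-to-noise ratio in decibels to a plain power ratio."""
    return 10 ** (decibels / 10)


if __name__ == "__main__":
    snr = db_to_power_ratio(30)   # 30 dB is a power ratio of 1000
    # Passing the ratio as the signal power with a noise power of 1 keeps S / N = snr.
    print(channel_capacity(3000, snr, 1))   # roughly 29,900 bits per second
```

Because the signal-to-noise ratio sits inside a logarithm, doubling the bandwidth doubles the bound, while doubling the signal power adds at most B bits per second.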