Neural Networks
A neural network is a biologically inspired computing device. Neural networks are parallel computing systems that, according to their advocates, hold the promise of someday performing common cognitive tasks, such as pattern and speech recognition, that are hard for conventional computational architectures yet are performed almost effortlessly by the human brain. The modest success neural networks have had in these areas so far is attributed to their parallel architecture, which mimics, at a far lower level of complexity, certain aspects of our own brains.
In a human brain, the basic information-processing unit is a cell called the neuron. A neuron consists of a cell body where the nucleus resides. Networks of branching fibers called dendrites communicate with other neurons across synaptic gaps, which are the routes chemical signals travel. An action potential is carried from the cell body along a long fiber called an axon, which in turn branches near its end and connects to the dendrites of as many as a few thousand other neurons. The transmission of signals from one neuron to another across the junction between axon and dendrite, the synapse, is a complex chemical process in which neurotransmitters (special chemicals floated like messages through the thin fluid layer between one cell and the next) lower the electrical potential (voltage) across the receiving cell's outer membrane. If a cell receives enough input from other neurons, its potential is lowered so far that it fires: it sends an electrical signal down its axon, which can in turn help induce other neurons to fire. The human brain consists of about 10^11 such neurons all acting in parallel (simultaneously), with synaptic connections between neurons numbering in the trillions. Even the most complex computers, conventional or neural, are thousands or millions of times less complex than the human brain.
Artificial neural networks (ANNs) abstract a few simple ingredients from the huge complexity of the human brain. An ANN consists of a set of N nodes n_i (1 ≤ i ≤ N) corresponding to neurons, and a set of connection weights w_ij (1 ≤ i ≤ N, 1 ≤ j ≤ N) representing the strength of the connection from node j to node i. At any given time step each node n_i has an output o_i. To determine its output value at the next time step, node n_i first computes a weighted sum of the outputs of all other nodes n_j that are connected to it, where each output o_j received by node n_i is weighted by the connection weight w_ij. This weighted sum of outputs is the net input to node n_i. The resulting output of node n_i is then some function f of this input, where f is an activation function characteristic of the node. There are many possible choices for the activation function, the simplest being a threshold function that yields 1 if the net input is positive and 0 otherwise. Other possibilities are smoothed threshold functions such as sigmoidal curves. This setup captures, though in a simplistic way, essential aspects of the workings of real neurons in the brain.
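The node update described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three-node network, its weight values, and the function names are hypothetical, and a weight of 0.0 is used here to mean "no connection".

```python
import math

def threshold(x):
    """Simple threshold activation: 1 if the net input is positive, else 0."""
    return 1.0 if x > 0 else 0.0

def sigmoid(x):
    """A smoothed threshold (the logistic sigmoid)."""
    return 1.0 / (1.0 + math.exp(-x))

def step(weights, outputs, f):
    """Compute every node's output at the next time step.

    weights[i][j] is the connection weight from node j to node i
    (0.0 means "not connected"); outputs[j] is the current output of node j.
    """
    new_outputs = []
    for w_row in weights:
        # Net input: weighted sum of the outputs feeding into this node.
        net_input = sum(w * o for w, o in zip(w_row, outputs))
        new_outputs.append(f(net_input))
    return new_outputs

# Hypothetical 3-node network: node 2 receives input from nodes 0 and 1.
weights = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [1.0, -2.0, 0.0],
]
outputs = [1.0, 0.0, 0.0]
next_outputs = step(weights, outputs, threshold)
```

With these values node 2 receives a net input of 1.0 and therefore fires, so next_outputs is [0.0, 0.0, 1.0]. Passing sigmoid instead of threshold gives graded outputs between 0 and 1 rather than an all-or-nothing response.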
The next important ingredient in the design of an ANN is the topology, or layout, of connections between neurons (nodes) in the network. Most neurons in an ANN can usually be categorized into three classes: input neurons, output neurons, and hidden neurons. The inputs to input neurons are provided by an outside source, whereas the values of the output neurons represent the ANN's response to that source. For example, if the ANN is a pattern classifier, then the input neurons would take on a set of values that encode the pattern to be classified, and the output would represent the ANN's answer as to which class the pattern falls into. If the ANN were part of a control system, then the input might be some stimulus (e.g., a video image of an approaching truck), while the output would encode some response or control action needed to influence the system as desired (e.g., get out of the way!). Hidden neurons, on the other hand, have no connections to the outside world. They exist only as intermediate neurons on pathways from input to output neurons, and their output values may not have a clear interpretation. Sometimes in ANN pattern recognition systems the values of hidden neurons can denote the presence or absence of interesting features in the input pattern to be recognized.
In an abstract sense an ANN does nothing more than compute a mapping between its input and output neurons. Once the network topology and activation functions are chosen for the ANN, this mapping is determined entirely by the connection weights w_ij. The knowledge implicit in a neural network's structure lies, then, in the connection weights and the topology of the net. A fundamental question is how one chooses these weights to get the neural network to accomplish its desired goals, that is, its desired mapping between inputs and outputs.
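To illustrate how the weights alone fix the mapping once the topology and activation function are chosen, here is a minimal sketch. The layout (two input neurons, two hidden neurons, one output neuron), the threshold activation, and both sets of weight values are assumptions chosen for the example.

```python
def threshold(x):
    """Threshold activation: 1 if the net input is positive, else 0."""
    return 1 if x > 0 else 0

def layer(weights, inputs):
    """Outputs of one layer: threshold of each node's weighted input sum."""
    return [threshold(sum(w * x for w, x in zip(row, inputs))) for row in weights]

def network(hidden_weights, output_weights, inputs):
    """Map input values to output values through one hidden layer."""
    hidden = layer(hidden_weights, inputs)
    return layer(output_weights, hidden)

patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Same topology and activation function, two different weight settings.
xor_hidden = [[1, -1], [-1, 1]]   # each hidden node detects "one input but not the other"
or_hidden = [[1, 0], [0, 1]]      # each hidden node simply copies one input
output_weights = [[1, 1]]         # the output fires if either hidden node fires

xor_map = [network(xor_hidden, output_weights, p)[0] for p in patterns]
or_map = [network(or_hidden, output_weights, p)[0] for p in patterns]
```

Here xor_map is [0, 1, 1, 0] and or_map is [0, 1, 1, 1]: the same network computes exclusive-or with one set of weights and ordinary or with the other, although nothing but the hidden-layer weights has changed.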
The notion of modifying the weights in order to improve network performance is called training or learning. There are two major modes of learning: supervised and unsupervised. In supervised learning a teacher is present to provide a set of examples of good input-output mappings. The network uses these examples to learn what it should do in the more general case. Unsupervised learning is simpler, slower, and involves no teacher. The simplest example of unsupervised learning is Hebbian learning. The principle behind this approach is that if one neuron tends to persistently cause another neuron to fire, then this activity should be encouraged and the weight between these two neurons should be increased. There is biological evidence that Hebbian learning goes on in the human brain, that is, that the brain learns by reinforcing correlations between firing neurons. For pattern classifiers that are trained using Hebbian learning, it is not always clear what exactly the outputs mean, but usually when a neural network is presented with a sequence of patterns that have some redundancy in them and its weights are modified in a Hebbian fashion, its outputs tend to categorize the input patterns. If the output neurons yield a set of distinct functions, for example, the neural network may be performing a clustering algorithm in which the network organizes the set of all input patterns into clusters of patterns that are more similar to each other than to those in other clusters. Hebbian learning can thus be a method of automatically identifying such regularities in input patterns. Another possible application of unsupervised learning is data compression, where the output represents a smaller, encoded version of the input that has done away with all redundancies.
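The Hebbian principle can be sketched as a weight update. The article gives no formula, so the rule below is an assumption: it is the common textbook form in which each weight grows by a small rate times the product of the two neurons' activities. The function name, the rate of 0.1, and the example activities are hypothetical.

```python
def hebbian_update(weights, pre, post, rate=0.1):
    """Return the weights after one Hebbian step.

    weights[i][j] connects sending node j to receiving node i.
    Each weight grows in proportion to the product of the two activities,
    so it increases only when both nodes are active together.
    """
    return [
        [w + rate * post_i * pre_j for w, pre_j in zip(row, pre)]
        for row, post_i in zip(weights, post)
    ]

# One receiving neuron with two inputs. Only the first input is active
# whenever the receiving neuron fires.
weights = [[0.0, 0.0]]
for _ in range(5):
    weights = hebbian_update(weights, pre=[1.0, 0.0], post=[1.0])
```

After the five presentations the first weight has grown to about 0.5 while the second is still 0.0, so the connection from the correlated input has been reinforced. Note that this plain rule lets weights grow without bound; practical variants add some form of normalization or decay.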
In supervised learning one can make use of training data, namely a set of examples of inputs and what the corresponding outputs should be, in order to force the network to encode certain behaviors. For example, in one military pattern recognition project, the examples consisted of pictures of tanks and pictures of trees. The desired outputs in such a case might be 1 for tanks and 0 for trees. In order to train the ANN to make this distinction one might use a simple "hill-climbing" procedure. Recall that the neural network's output is a function of both its weights and its inputs. One can create an error function that measures how poorly the network is doing by calculating, for each training-set input, the squared difference between the network's actual output and the desired output, and summing this squared difference over all examples in the training set. This error function is then viewed as a function of the weights. The best possible network would minimize this function with respect to the weights. This minimization problem is then solved by gradient descent, namely by starting with a random set of weights and then moving, at every point in weight space, in the direction that decreases the error function the most. This algorithm can be implemented quite efficiently in a feed-forward network, which consists of an input layer, a series of hidden layers, and finally an output layer, with connections only going forward from layer to layer, from input to output. In this type of topology weights can be modified layer by layer, starting from the output layer, in such a way that the modifications are propagated backwards. This implementation of the gradient-descent training procedure is called backpropagation and was a breakthrough in the field of neural networks when it was discovered by P. Werbos in 1974.
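The training procedure just described can be sketched for a very small feed-forward network. Everything specific here is an assumption made for illustration: the sigmoid activation, the single hidden layer of two nodes, the learning rate, and the two-number input vectors that stand in for the pictures (target 1 for "tank", 0 for "tree").

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(w_hidden, w_out, x):
    """Return (hidden outputs, network output) for one input vector."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    out = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, out

def total_error(w_hidden, w_out, examples):
    """Squared difference between actual and desired output, summed over examples."""
    return sum((forward(w_hidden, w_out, x)[1] - t) ** 2 for x, t in examples)

def backprop_step(w_hidden, w_out, examples, rate=0.5):
    """One gradient-descent step; returns the updated (w_hidden, w_out)."""
    grad_hidden = [[0.0] * len(row) for row in w_hidden]
    grad_out = [0.0] * len(w_out)
    for x, target in examples:
        hidden, out = forward(w_hidden, w_out, x)
        # Error signal at the output node (derivative w.r.t. its net input).
        delta_out = 2.0 * (out - target) * out * (1.0 - out)
        for k, h in enumerate(hidden):
            grad_out[k] += delta_out * h
            # Propagate the error signal backwards to hidden node k.
            delta_hidden = delta_out * w_out[k] * h * (1.0 - h)
            for j, xj in enumerate(x):
                grad_hidden[k][j] += delta_hidden * xj
    new_hidden = [[w - rate * g for w, g in zip(row, g_row)]
                  for row, g_row in zip(w_hidden, grad_hidden)]
    new_out = [w - rate * g for w, g in zip(w_out, grad_out)]
    return new_hidden, new_out

# Hypothetical training set: (input features, desired output).
examples = [([1.0, 0.0], 1.0), ([1.0, 0.2], 1.0),
            ([0.0, 1.0], 0.0), ([0.1, 1.0], 0.0)]

random.seed(0)  # start from a random, but reproducible, set of weights
w_hidden = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
w_out = [random.uniform(-1, 1) for _ in range(2)]

initial_error = total_error(w_hidden, w_out, examples)
for _ in range(2000):
    w_hidden, w_out = backprop_step(w_hidden, w_out, examples)
final_error = total_error(w_hidden, w_out, examples)
```

Comparing initial_error with final_error shows how far the repeated downhill steps have reduced the error on the training set. The error signal for the hidden layer is computed from the output node's error signal, which is the backward propagation the procedure is named after.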
Mathematically, the whole procedure can be viewed as a least-squares fit of the neural network's input-to-output mapping function to the training data, carried out with respect to the weights. An interesting aspect of this approach is that it is not always clear what the neural network will learn from the training. For example, in the military application, it just so happened that all the pictures of tanks were taken on a sunny day whereas the pictures of trees were taken on a cloudy day. When trained on this data set the neural network failed to perform adequately on new pictures of tanks and trees; it turned out that all the network had really learned to do was distinguish sunny weather from cloudy weather.
Copyrights
Neural Networks from World of Computer Science. ©2005-2006 Thomson Gale, a part of the Thomson Corporation. All rights reserved.