Neural networks are computing machines whose internal organization partially resembles that of the brain. Often termed ARTIFICIAL NEURAL NETWORKS (ANNs) to distinguish them from the real thing, they may be constructed from special-purpose silicon chips, but at present usually exist as computer programs. They can be visualized as a number of NETWORK NODES, analogous to NEURONS, connected together in a particular way. The connections form the architecture of the net. The point of this organization is that connections between nodes have ‘weights’, similar to SYNAPSES, that can be altered in value. The calculation performed by a neural network therefore depends both on its present inputs and on the way its weights have been altered by past inputs. In this regard its behaviour resembles human LEARNING, and procedures for altering weights are often referred to as learning rules.
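The dependence of a node's calculation on both its inputs and its weights can be illustrated with a minimal sketch. It assumes, purely for illustration, that a node sums its weighted inputs and passes the total through a logistic squashing function; the function name `node_output`, the `bias` term and the numerical values are hypothetical and are not taken from any particular model.

```python
import math

def node_output(inputs, weights, bias=0.0):
    """Output of one node: logistic squashing of the weighted input sum.

    The logistic function is an assumed choice of activation.
    """
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-total))

inputs = [1.0, 0.0, 1.0]

# The same inputs give different outputs once the weights have been altered.
before = node_output(inputs, [0.2, -0.4, 0.1])
after = node_output(inputs, [0.9, -0.4, 0.8])
```

Here `after` exceeds `before` although the inputs are identical, because the weights on the two active connections have been increased.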
There are two main classes of learning rule. In UNSUPERVISED LEARNING, the weights in a net are altered according to the pattern of its inputs. The net, which may consist of a single layer of interconnected nodes, learns a method for classifying its inputs that depends on the precise nature of the learning rule, but does not require continuous external prompting. Such nets are described as self-organizing; an example is the KOHONEN NETWORK, which learns input mappings resembling those observed in SENSORY CORTEX. In SUPERVISED LEARNING, by contrast, the net is trained to produce correct answers by using the difference between its output and an externally specified target output. The DELTA RULE is a procedure for computing the desired change in weights from this difference, called the error, for nets that have two layers. The nodes in the input layer receive input from an external source, and are connected only to nodes in the output layer, which communicate the results of the network’s calculations. However, a more powerful and popular architecture, the MULTI-LAYER PERCEPTRON, has at least one additional layer of ‘hidden’ nodes. These receive input from the input nodes, and in turn are connected only to nodes in the output layer. For a multi-layer perceptron the delta rule must be generalized, because the error signal needs to be propagated back to the hidden layer so that the weights from the input layer can be altered correctly (see BACKPROPAGATION).
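The delta rule for a two-layer net can be sketched as follows. This is an illustrative version only: it assumes a single linear output node, a fixed learning rate, and one weight update per training example, and the function name `train_delta_rule` and the training data are hypothetical. Each weight is changed in proportion to the error (target minus output) multiplied by the input on that connection.

```python
def train_delta_rule(samples, n_inputs, rate=0.1, epochs=200):
    """Train one linear output node connected directly to the input nodes."""
    weights = [0.0] * n_inputs
    for _ in range(epochs):
        for inputs, target in samples:
            output = sum(w * x for w, x in zip(weights, inputs))
            error = target - output  # externally specified target minus output
            # Delta rule: change each weight by rate * error * its input.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
    return weights

# Hypothetical training set whose targets follow target = 2*x1 - x2.
samples = [
    ([1.0, 0.0], 2.0),
    ([0.0, 1.0], -1.0),
    ([1.0, 1.0], 1.0),
    ([2.0, 1.0], 3.0),
]
weights = train_delta_rule(samples, n_inputs=2)
```

With these settings the weights settle close to 2 and -1, the values that reduce the error on every training example to zero. Nothing in this rule says how to alter weights leading into a hidden layer, which is why a generalized rule is needed for nets with hidden nodes.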
Supervised learning would be of limited use if it only reproduced already known correct answers. However, appropriately trained neural networks can generalize their behaviour to give correct answers to inputs never previously encountered. The power to generalize is a critical feature of ANNs and underlies both their applied and theoretical importance. It has been proved mathematically that multi-layer perceptrons are able to approximate a very wide range of input-output functions. Yet further capabilities are added, in recurrent nets, by allowing connections between nodes in the same layer, and from one layer to a preceding one. It is not only the computational power of ANNs that is relevant to psychology, though: the distinctive way in which the computations are done is also of interest. In contrast to traditional computing procedures, whereby a single stream of instructions is executed one step at a time, processing in ANNs is distributed over the nodes and (in theory) carried out in parallel. PARALLEL DISTRIBUTED PROCESSING has intriguing resemblances to human learning and performance. ANNs can carry out tasks requiring complex statistical calculation with no explicit mathematical knowledge. They may continue to produce approximate answers when damaged, or when their inputs are corrupted. They are particularly suited to ‘real-world’ problems, such as PATTERN PERCEPTION and ASSOCIATIVE LEARNING, which involve many constraints, no single one of which is decisive (see ARTIFICIAL INTELLIGENCE).
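Tolerance of corrupted input can be illustrated with a small recurrent net in which every node is connected to every other node in the same layer. The sketch below is one possible construction, not the only one: it assumes nodes that take the values +1 or -1, a Hopfield-style outer-product rule for setting the weights, and simultaneous updating of all nodes. The function names `store` and `recall` and the two patterns are hypothetical.

```python
def store(patterns):
    """Set symmetric within-layer weights from +1/-1 patterns (outer-product rule)."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no node is connected to itself
                    weights[i][j] += p[i] * p[j]
    return weights

def recall(weights, state, steps=5):
    """Update all nodes together from the weighted states of the other nodes."""
    for _ in range(steps):
        state = [
            1 if sum(w * s for w, s in zip(row, state)) >= 0 else -1
            for row in weights
        ]
    return state

pattern_a = [1, 1, 1, 1, -1, -1, -1, -1]
pattern_b = [1, -1, 1, -1, 1, -1, 1, -1]
weights = store([pattern_a, pattern_b])

corrupted = list(pattern_a)
corrupted[0] = -corrupted[0]  # corrupt two of the eight input values
corrupted[1] = -corrupted[1]

restored = recall(weights, corrupted)
```

In this example `restored` equals `pattern_a`: the stored pattern is spread across all the connections, so the uncorrupted nodes jointly outvote the two corrupted ones.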
As models of human processing, ANNs are used at different levels of fidelity. In COMPUTATIONAL NEUROSCIENCE, network nodes can be made similar to real neurons in intrinsic structure, the connections between them made to conform to known neuroanatomy, and the weight modification rules made to resemble those governing the efficacy of real synapses (the equivalent of, for instance, LONG-TERM POTENTIATION). Such networks may be useful for understanding the detailed organization of particular regions of the brain, such as CEREBRAL CORTEX, HIPPOCAMPUS and CEREBELLUM. In CONNECTIONISM, more abstract neural networks are used as existence proofs that a biologically plausible organization can be used to perform certain classes of computation. Areas studied include VISUAL PERCEPTION, MOTOR CONTROL, MEMORY and LANGUAGE. The networks used for these studies typically consist of highly simplified nodes in architectures whose relation to those used by the brain is unknown. Debates about their processing capacities are wide-ranging, including topics such as the nature of human DECISION-MAKING and CONSCIOUSNESS. One prominent area of dispute is whether neural networks can generate precise symbolic representations whose behaviour is governed by formal logical, grammatical or mathematical rules (see ARTIFICIAL INTELLIGENCE again).