
Backpropagation

Backpropagation, short for "backwards propagation of errors", is a common method of teaching artificial neural networks how to perform a given task. It was first described by Paul Werbos in 1974, but it only gained recognition in 1986 through the work of David E. Rumelhart, Geoffrey E. Hinton and Ronald J. Williams, which led to a “renaissance” in the field of artificial neural network research. It is a supervised learning method and a generalization of the delta rule to multilayer networks. It requires a teacher that knows, or can calculate, the desired output for any given training input. It is most useful for feed-forward networks (networks that have no feedback, that is, no connections that loop). Backpropagation requires that the transfer function used by the artificial neurons (or "nodes") be differentiable.
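The differentiability requirement can be illustrated with the logistic function, one commonly used transfer function. The sketch below is only an illustration: the function names are chosen here and do not come from the article. It shows the closed-form derivative that backpropagation relies on, checked against a finite-difference estimate.

```python
import math

def logistic(x: float) -> float:
    """Logistic (sigmoid) transfer function."""
    return 1.0 / (1.0 + math.exp(-x))

def logistic_derivative(x: float) -> float:
    """Closed-form derivative: s(x) * (1 - s(x))."""
    s = logistic(x)
    return s * (1.0 - s)

def numeric_derivative(f, x: float, h: float = 1e-6) -> float:
    """Central finite difference, used only as a sanity check."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

if __name__ == "__main__":
    # The two columns should agree to several decimal places.
    for x in (-2.0, 0.0, 1.5):
        print(x, logistic_derivative(x), numeric_derivative(logistic, x))
```

A step function, by contrast, has no usable derivative at its jump and a zero derivative elsewhere, so it gives the backward pass nothing to work with.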

Summary

Summary of the technique:

  1. Present a training sample to the neural network.
  2. Compare the network's output to the desired output from that sample. Calculate the error in each output neuron.
  3. For each neuron, calculate what its output should have been, and a scaling factor describing how much lower or higher the output must be to match the desired output. This is the local error.
  4. Adjust the weights of each neuron to lower the local error.
  5. Assign "blame" for the local error to neurons at the previous level, giving greater responsibility to neurons connected by stronger weights.
  6. Repeat the steps above on the neurons at the previous level, using each one's "blame" as its error.
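The steps above can be sketched for a single output neuron fed by two neurons at the previous level. This is a minimal illustration under stated assumptions: every neuron uses the logistic function, the numeric values are invented for the example, and the names (`summary_steps`, `blame`, `rate`) are hypothetical.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def summary_steps(prev_out, weights, target, rate):
    """One pass of the summarized steps for a single output neuron.

    prev_out: outputs of the previous-level neurons (assumed given)
    weights:  weights from those neurons to the output neuron
    """
    # Steps 1-2: forward pass, then the error at the output neuron.
    out = logistic(sum(w * p for w, p in zip(weights, prev_out)))
    error = target - out
    # Step 3: local error = raw error scaled by the slope of the logistic.
    local_error = error * out * (1.0 - out)
    # Step 5 (computed first so that it uses the weights before the update):
    # blame is proportional to the connecting weight.
    blame = [local_error * w for w in weights]
    # Step 4: adjust the weights to lower the local error.
    new_weights = [w + rate * local_error * p for w, p in zip(weights, prev_out)]
    # Step 6: each previous-level neuron turns its blame into its own local error.
    prev_local_error = [b * p * (1.0 - p) for b, p in zip(blame, prev_out)]
    return out, new_weights, blame, prev_local_error

prev_out = [0.6, 0.9]      # assumed activations of the previous level
weights = [0.2, -1.5]      # assumed weights into the output neuron
out, new_weights, blame, prev_local_error = summary_steps(
    prev_out, weights, target=1.0, rate=0.5)
new_out = logistic(sum(w * p for w, p in zip(new_weights, prev_out)))
print(out, new_out)        # the output moves toward the target of 1.0
print(blame)               # the neuron behind the larger weight gets more blame
```

The blame is computed before the weights change so that it reflects the weights that actually produced the error.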

Algorithm

Backpropagation Algorithm:

  1. Initialize each weight W[i] of the network N (often to small random values)
  2. repeat, for each example E in the training set:
         1. O = neural-net-output(N, E) ; forward pass
         2. T = teacher output for E
         3. Calculate the error (T - O) at the output units
         4. Compute delta W[i] for all weights between the hidden layer and the output layer (start of backward pass)
         5. Compute delta W[i] for all weights between the input layer and the hidden layer (backward pass continued)
         6. W[i] = W[i] - delta W[i] (update the weights; delta W[i] is the learning rate times the gradient of the error with respect to W[i])
         7. stop when any of:
              a. the training examples are classified correctly
              b. delta W[i] approaches zero
              c. another stopping criterion is reached (timeout, iteration limit, bounds)
  3. Return the trained network N
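The algorithm above can be sketched in plain Python. This is one possible reading, not a reference implementation. It assumes a single hidden layer, logistic units, a squared-error measure, bias inputs fixed at 1, and the XOR task as training set. None of these choices, and none of the names and defaults (`train`, `rate`, `tol`, `seed`), come from the article.

```python
import math
import random

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(net, x):
    """Forward pass: return (hidden activations, output) for input list x."""
    w_hidden, w_out = net
    xb = x + [1.0]                                  # append bias input
    h = [logistic(sum(w * v for w, v in zip(row, xb))) for row in w_hidden]
    o = logistic(sum(w * v for w, v in zip(w_out, h + [1.0])))
    return h, o

def total_error(net, examples):
    """Half the summed squared error over the training set."""
    return 0.5 * sum((t - forward(net, x)[1]) ** 2 for x, t in examples)

def train(examples, n_hidden=3, rate=0.5, max_epochs=5000, tol=1e-3, seed=0):
    rng = random.Random(seed)
    n_in = len(examples[0][0])
    # Step 1: initialize the weights randomly.
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    net = (w_hidden, w_out)
    for _ in range(max_epochs):                     # stopping criterion (c)
        for x, t in examples:
            h, o = forward(net, x)                  # forward pass
            # delta_o = -dE/d(net input of the output unit)
            delta_o = (t - o) * o * (1.0 - o)
            # Hidden deltas use the output weights from before the update.
            delta_h = [delta_o * w_out[j] * h[j] * (1.0 - h[j])
                       for j in range(n_hidden)]
            # Adding rate * delta * input equals subtracting rate * gradient.
            for j, v in enumerate(h + [1.0]):
                w_out[j] += rate * delta_o * v
            for j in range(n_hidden):
                for i, v in enumerate(x + [1.0]):
                    w_hidden[j][i] += rate * delta_h[j] * v
        if total_error(net, examples) < tol:        # error small enough
            break
    return net

XOR = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
       ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]

if __name__ == "__main__":
    net = train(XOR)
    for x, t in XOR:
        print(x, t, forward(net, x)[1])
```

Weights are updated after every example, which corresponds to the stochastic (online) form of gradient descent. Convergence on XOR depends on the random initialization, so a given seed may settle in a poor local minimum.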

As the algorithm's name implies, the errors (and therefore the learning) propagate backwards from the output nodes to the inner nodes. Technically speaking, backpropagation calculates the gradient of the network's error with respect to its modifiable weights. This gradient is almost always then used in a simple stochastic gradient descent algorithm to find weights that minimize the error. The term "backpropagation" is often used in a more general sense, to refer to the entire procedure encompassing both the calculation of the gradient and its use in stochastic gradient descent. Backpropagation usually allows quick convergence on satisfactory local minima of the error in the kinds of networks to which it is suited.

Backpropagation networks are necessarily multilayer perceptrons (usually with one input, one hidden, and one output layer). For the hidden layer to serve any useful function, multilayer networks must use non-linear activation functions: a multilayer network using only linear activation functions is equivalent to some single-layer linear network. Commonly used non-linear activation functions include the logistic function, the softmax function, and the Gaussian function.

The backpropagation algorithm for calculating a gradient has been rediscovered a number of times, and is a special case of a more general technique called automatic differentiation in the reverse accumulation mode. It is also closely related to the Gauss-Newton algorithm, and is part of continuing research in neural backpropagation.
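The point about linear activation functions can be checked directly: two stacked linear layers compute the same function as one layer whose weight matrix is the product of the two. The sketch below uses invented weights and hypothetical helper names (`matvec`, `matmul`), and omits bias terms for brevity.

```python
def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def matmul(a, b):
    """Multiply matrix a by matrix b (both lists of rows)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

# Assumed example weights: 2 inputs -> 3 hidden units -> 1 output, all linear.
W1 = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]
W2 = [[1.0, 0.0, 2.0]]
x = [4.0, -2.0]

two_layer = matvec(W2, matvec(W1, x))   # pass through both layers
W = matmul(W2, W1)                      # collapse them into one matrix
one_layer = matvec(W, x)                # single-layer equivalent

print(W)                     # [[7.0, 4.0]]
print(two_layer, one_layer)  # [20.0] [20.0]
```

Inserting a non-linear function such as the logistic between the two layers breaks this equivalence, which is what lets the hidden layer contribute something a single layer cannot.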

Backpropagation from Wikipedia. ©2006 by Wikipedia. Licensed under the GNU Free Documentation License.
