Normal Curve
In statistics, the term "normal curve" refers to any member of a family of curves that share a number of important characteristics. First, they are curves that model probability distributions for random variables. Second, they are all symmetric about a vertical line with an equation of the form $x = \mu$, where $\mu$ is the mean of the probability distribution. Third, all normal curves are shaped like the profile of a bell and are, therefore, sometimes called "bell curves." Fourth, the edges of the bell extend infinitely far in both directions and approach the horizontal axis (the line $y = 0$) asymptotically from above. Fifth, since the curves represent probability distributions, the total area under each normal curve is 1. Finally, members of the family of normal curves have equations of the form $y = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$, where $\mu$ is the mean of the given distribution and $\sigma$ is the standard deviation. These formulas are derived in advanced courses in mathematical statistics, but the typical researcher uses statistical tables generated from these formulas without needing to understand the derivations.
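The properties listed above can be checked numerically. The following is a minimal Python sketch, not a definitive implementation: the function names `normal_pdf` and `area_under_curve`, the trapezoid-rule integration, and the choice of mean 500 and standard deviation 100 are all assumptions made for illustration.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def area_under_curve(mu, sigma, width=10.0, steps=20000):
    """Trapezoid-rule estimate of the area from mu - width*sigma to mu + width*sigma.

    The tails beyond ten standard deviations are ignored; they are far too
    small to affect the result at this precision.
    """
    a = mu - width * sigma
    b = mu + width * sigma
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(a + i * h, mu, sigma)
    return total * h


if __name__ == "__main__":
    mu, sigma = 500.0, 100.0  # illustrative values only
    # Symmetry: points equally far from the mean have the same height.
    print(normal_pdf(mu - 150.0, mu, sigma), normal_pdf(mu + 150.0, mu, sigma))
    # Total area: should be very close to 1.
    print(area_under_curve(mu, sigma))
```

The two heights printed first should agree, reflecting the symmetry about the mean, and the estimated area should differ from 1 only by a tiny numerical error.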
Normal curves are important in statistics because many kinds of natural phenomena and human-generated experiments produce data sets that can be very accurately modeled by them. For example, heights, weights, sizes of heads, arm lengths, shoe sizes, and other such physical measurements tend to be normally distributed for random samples of certain groups of people and animals. The scores on standardized achievement tests for nationwide samples of students also tend to be normally distributed. The great German mathematician Carl Friedrich Gauss observed that the distribution of errors in certain astronomical measurements could be modeled using a normal curve. For this reason, normal curves are often called "Gaussian" curves to honor Gauss's pioneering work in this area of statistics.
The most important and useful member of the family of normal curves is the one having a mean of 0 and a standard deviation of 1. This curve is called the "standard normal curve" and has the equation $y = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2}$. All other normal curves may be mapped onto the standard normal curve by using mathematical transformations. When a normal distribution is transformed in this way, it is said to have been "standardized." The importance of standardization is that a single probability distribution table, called the standard normal table, may then be used to calculate probabilities for any normal distribution. Probabilities are represented by areas of regions under a normal curve. Since the normal curve is symmetric, exactly 50% of the population will fall below the mean and 50% of the population will fall above the mean. For the standard normal distribution, this means that 50% of the area under the curve lies to the right of 0 and 50% lies to the left of 0. Also, it can be shown that approximately 68% of the area under the standard normal curve lies between -1 and 1. Similarly, about 95% of the area falls between -2 and 2, and about 99.7% lies between -3 and 3.
To illustrate the process of standardizing a set of data, consider the case of an achievement test given to students across the nation, resulting in a normal distribution of scores with a mean of 500 and a standard deviation of 100. Since the mean of this data is 500 units to the right of the origin, we translate the data 500 units to the left so that the center of the distribution is now 0, which is the mean of the standard normal distribution. There is still a problem, though. The standard deviation of the original data is 100, not 1. Therefore, we must also scale all the translated values by a factor of 1/100 (or, equivalently, divide each one by 100) in order to complete the mapping of the data onto the standard normal curve.
The composite of these two transformations may be represented by the mathematical expression $(x - 500)/100$, where $x$ represents any score on the test. Thus a score of 600 on the test would correspond to $(600 - 500)/100 = 1$ on the standard normal table. The table gives areas under the standard normal curve. It tells us that approximately 16% of the area lies to the right of 1. We can interpret this to mean that approximately 16% of the students who took this test scored 600 or above.
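The table lookups described above can also be reproduced directly. The sketch below is one possible approach rather than the method a printed table uses: it relies on the error function from Python's standard `math` module, and the helper names (`standard_normal_cdf`, `z_score`, `area_between`, `area_right_of`) are invented for this illustration.

```python
import math


def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def z_score(x, mean, sd):
    """Translate by the mean, then scale by 1/sd, to standardize a value."""
    return (x - mean) / sd


def area_between(z_low, z_high):
    """Area under the standard normal curve between two z values."""
    return standard_normal_cdf(z_high) - standard_normal_cdf(z_low)


def area_right_of(z):
    """Area under the standard normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    # Central areas within 1, 2, and 3 standard deviations of the mean.
    for k in (1, 2, 3):
        print(k, round(area_between(-k, k), 4))

    # The achievement-test example: mean 500, standard deviation 100.
    z = z_score(600, 500, 100)
    print(z, round(area_right_of(z), 4))
```

For the test-score example, the standardized value is 1 and the area to its right comes out at about 0.159, that is, roughly 16% of the students.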
Using the method described in the preceding paragraph, researchers, statisticians, scientists, and other individuals use areas under the standard normal curve to determine the percentage of a population that falls within a particular interval. Typically, a sample of the desired population is collected. A normal distribution, based on the sample mean and sample standard deviation, may then be used to estimate the information desired about the population represented by the sample. The science of creating representative samples of a desired population in order to model it is a major underpinning of the science of statistics.
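As a rough sketch of this workflow, the Python standard library's `statistics.NormalDist` class can fit a normal distribution to a sample and report the area over an interval. The sample values below are invented purely for illustration, and the helper name `estimate_share_between` is an assumption, not an established function.

```python
from statistics import NormalDist

# Hypothetical sample of heights in centimetres; invented for illustration only.
sample_heights_cm = [162.0, 171.5, 168.2, 175.0, 159.8,
                     166.4, 180.1, 173.3, 169.9, 164.7]


def estimate_share_between(sample, low, high):
    """Estimate the share of the population lying between low and high.

    NormalDist.from_samples fits a normal distribution using the sample
    mean and the sample standard deviation.
    """
    fitted = NormalDist.from_samples(sample)
    return fitted.cdf(high) - fitted.cdf(low)


if __name__ == "__main__":
    share = estimate_share_between(sample_heights_cm, 165.0, 175.0)
    print(f"Estimated share between 165 cm and 175 cm: {share:.1%}")
```

With only ten values the estimate is crude; the quality of the result depends on how well the sample represents the population and on whether a normal model is reasonable for the data.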