The standard deviation is a statistical measure of the dispersion or uncertainty in a random variable. It is the square root of the variance, a measure of how spread out a distribution is, and is written for a random variable x as $\sigma = \sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2 / (N - 1)}$, where $\sigma$ is the standard deviation, $x_i$ is each individual value of x, $\bar{x}$ is the mean of all of the data points, and N is the number of data points. In this form the square of the standard deviation is an unbiased estimate of the variance, and the N - 1 denominator is sometimes said to correct for the loss of a degree of freedom. On occasion the standard deviation is computed with a denominator of just N rather than N - 1, which gives a biased estimate of the variance. The value of $x_i - \bar{x}$ is called the residual for each x value. The standard deviation is the most commonly used measure of the spread of a set of data. If the standard deviation is small, the data points are tightly clustered around the mean value. If it is large, they are widely scattered relative to the mean value.
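The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names sample_std and population_std and the data values are assumptions made up for the example.

```python
import math
import statistics

def sample_std(values):
    """Standard deviation with the N - 1 denominator."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points")
    mean = sum(values) / n
    # Residual of each value: its distance from the mean.
    residuals = [x - mean for x in values]
    return math.sqrt(sum(r * r for r in residuals) / (n - 1))

def population_std(values):
    """Standard deviation with the N denominator (the biased form)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

# Hypothetical data set, chosen only for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print(sample_std(data))
print(population_std(data))
# The standard library's statistics.stdev also uses the N - 1 denominator.
print(statistics.stdev(data))
```

The N - 1 version is always slightly larger than the N version for the same data, and the gap shrinks as N grows.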
The variance is a somewhat abstract measure of the variability in a set of data because it is expressed in squared units. Unlike the variance, the standard deviation is in the same units as the data, so it can be easily conceptualized by plotting it along with the individual points in the set.
An important use of the standard deviation is in determining the percentile rank. If a set of data follows a normal distribution, a symmetric distribution with the points concentrated in the middle, and both the mean and the standard deviation are known, then it is possible to compute the percentile rank associated with any particular data value in that set. The percentile rank is the percentage of values in a distribution that one specific value is greater than or equal to.
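As a rough sketch of the percentile rank calculation, the following Python uses the standard library's NormalDist class (Python 3.8 or later). The function name percentile_rank and the example mean of 100 and standard deviation of 15 are assumptions chosen for illustration, and the result is only meaningful if the data really are approximately normal.

```python
from statistics import NormalDist

def percentile_rank(value, mean, std_dev):
    """Percentage of a normal distribution at or below the given value."""
    return 100.0 * NormalDist(mu=mean, sigma=std_dev).cdf(value)

# Hypothetical scores with mean 100 and standard deviation 15.
print(percentile_rank(100, 100, 15))  # a value at the mean
print(percentile_rank(115, 100, 15))  # one standard deviation above the mean
```

A value equal to the mean sits at the 50th percentile, and a value one standard deviation above the mean sits at roughly the 84th.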
Because of its mathematical tractability, the standard deviation is a useful measure of the spread in a set of data and is often employed in inferential statistics. Inferential statistics is the branch of statistics used to draw inferences about a population from only a sample of that population. Where extreme values may be present in a set of data, the standard deviation should be supplemented by the semi-interquartile range, which is half the difference between the third and first quartiles. This is because the standard deviation is more sensitive than the semi-interquartile range to extreme values: a single extreme value in a set of data can change it substantially.
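The difference in sensitivity can be seen in a small Python sketch. The helper semi_iqr and both data sets are hypothetical, and the quartiles come from statistics.quantiles with its default method; other quartile conventions give slightly different numbers.

```python
import statistics

def semi_iqr(values):
    """Semi-interquartile range: half the distance between Q1 and Q3."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return (q3 - q1) / 2.0

# Hypothetical data: the second set replaces the largest value with an extreme one.
clean = [10, 11, 12, 13, 14, 15, 16, 17, 18]
with_outlier = [10, 11, 12, 13, 14, 15, 16, 17, 100]

print(statistics.stdev(clean), semi_iqr(clean))
print(statistics.stdev(with_outlier), semi_iqr(with_outlier))
```

In this example the single extreme value inflates the standard deviation many times over, while the semi-interquartile range does not change because the quartiles depend only on the middle of the ordered data.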
There are special ways to calculate the standard deviation for particular cases. If f(x) is a linear transformation of x such that f(x) = bx + a, then the standard deviation of f(x) is $|b|\sigma_x$, where $\sigma_x^2$ is the variance of x. In this case the variance of f(x) is $b^2\sigma_x^2$. The constant a shifts every value by the same amount and so has no effect on the spread.
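The linear transformation rule can be checked numerically. The sketch below uses made-up values for a, b, and the data, and a negative b to show that the standard deviation scales by the absolute value of b while the variance scales by b squared.

```python
import statistics

# Hypothetical coefficients and data, chosen only for illustration.
a, b = 7.0, -3.0
x = [1.0, 2.0, 4.0, 7.0]
fx = [b * value + a for value in x]

sd_x = statistics.stdev(x)
sd_fx = statistics.stdev(fx)

# The two printed values in each pair should agree up to rounding error.
print(sd_fx, abs(b) * sd_x)
print(statistics.variance(fx), b ** 2 * statistics.variance(x))
```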
Standard deviation has many uses in the statistical analysis of data sets. Beyond mathematics, it is also widely used in finance, where it often serves as a measure of the volatility of a random financial variable: a highly volatile variable has a high standard deviation. Standard deviation is also widely used by the United States Census Bureau in calculating and interpreting census results from year to year.