What do measures of variation describe
The intermediate results are not rounded. This is done for accuracy. Use your calculator or computer to find the mean and standard deviation. Then find the value that is two standard deviations above the mean. The deviations show how spread out the data are about the mean. The data value A positive deviation occurs when the data value is greater than the mean, whereas a negative deviation occurs when the data value is less than the mean.
The deviation is —1. If you add the deviations, the sum is always zero. So you cannot simply add the deviations to get the spread of the data. By squaring the deviations, you make them positive numbers, and the sum will also be positive. The variance, then, is the average squared deviation. The variance is a squared measure and does not have the same units as the data.
Taking the square root solves the problem. The standard deviation measures the spread in the same units as the data. The answer has to do with the population variance. The sample variance is an estimate of the population variance. Your concentration should be on what the standard deviation tells us about the data. The standard deviation is a number which measures how far the data are spread from the mean.
Let a calculator or computer do the arithmetic. When the standard deviation is zero, there is no spread; that is, all the data values are equal to each other. The standard deviation is small when the data are all concentrated close to the mean, and is larger when the data values show more variation from the mean. The standard deviation, when first presented, can seem unclear. By graphing your data, you can get a better "feel" for the deviations and the standard deviation.
You will find that in symmetrical distributions, the standard deviation can be very helpful but in skewed distributions, the standard deviation may not be much help. The reason is that the two sides of a skewed distribution have different spreads. In a skewed distribution, it is better to look at the first quartile, the median, the third quartile, the smallest value, and the largest value.
Because numbers can be confusing, always graph your data. Display your data in a histogram or a box plot. Use the following data first exam scores from Susan Dean's spring pre-calculus class:. The long left whisker in the box plot is reflected in the left side of the histogram. The histogram, box plot, and chart all reflect this. There are a substantial number of A and B grades 80s, 90s, and The histogram clearly shows this.
The following data show the different types of pet food stores in the area carry. Recall that for grouped data we do not know individual data values, so we cannot describe the typical value of the data with precision. In other words, we cannot find the exact mean, median, or mode. We can, however, determine the best estimate of the measures of center by finding the mean of the grouped data with the formula:. Just as we could not find the exact mean, neither can we find the exact standard deviation.
Remember that standard deviation describes numerically the expected deviation a data value has from the mean.
This means that a randomly selected data value would be expected to be 3. If we look at the first class, we see that the class midpoint is equal to one. This is almost two full standard deviations from the mean since 7. It is usually best to use technology when performing the calculations. For the previous example, we can use the spreadsheet to calculate the values in the table above, then plug the appropriate sums into the formula for sample standard deviation.
Input the midpoint values into L1 and the frequencies into L2. Select 2 nd then 1 then , 2 nd then 2 Enter. The standard deviation is useful when comparing data values that come from different data sets. If the data sets have different means and standard deviations, then comparing the data values directly can be misleading.
In symbols, the formulas become:. Two students, John and Ali, from different high schools, wanted to find out who had the highest GPA when compared to his school.
Which student had the highest GPA when compared to his school? Pay careful attention to signs when comparing and interpreting the answer. John's z -score of —0. Two swimmers, Angie and Beth, from different teams, wanted to find out who had the fastest time for the 50 meter freestyle when compared to her team.
Which swimmer had the fastest time when compared to her team? The following lists give a few facts that provide a little more insight into what the standard deviation tells us about the distribution of the data. The standard deviation can help you calculate the spread of data.
However, suppose we add a fourth section, Section D, with scores 0 5 5 5 5 5 5 5 5 This section also has a mean and median of 5. The range is 10, yet this data set is quite different than Section B. The standard deviation is a measure of variation based on measuring how far each data value deviates, or is different, from the mean. A few important characteristics:. Using the data from section D, we could compute for each data value the difference between the data value and the mean:.
Ordinarily we would then divide by the number of scores, n , in this case, 10 to find the mean of the deviations.
These values 5 and 5. Variance can be a useful statistical concept, but note that the units of variance in this instance would be points-squared since we squared all of the deviations. What are points-squared?
Good question. We would rather deal with the units we started with points in this case , so to convert back we take the square root and get:. If we are unsure whether the data set is a sample or a population, we will usually assume it is a sample, and we will round answers to one more decimal place than the original data, as we have done above. Find the deviation of each data from the mean. In other words, subtract the mean from the data value.
Divide by n , the number of data values, if the data represents a whole population; divide by n — 1 if the data is from a sample. Computing the standard deviation for Section B above, we first calculate that the mean is 5. Using a table can help keep track of your computations for the standard deviation:. Assuming this data represents a population, we will add the squared deviations, divide by 10, the number of data values, and compute the square root:. Notice that the standard deviation of this data set is much larger than that of section D since the data in this set is more spread out.
See editing example. The interquartile range gives you the spread of the middle of your distribution. The interquartile range is the third quartile Q3 minus the first quartile Q1. This gives us the range of the middle half of a data set. Multiply the number of values in the data set 8 by 0.
Q1 is the value in the 2nd position, which is Q3 is the value in the 6th position, which is The interquartile range of your data is minutes. Just like the range, the interquartile range uses only 2 values in its calculation. But the IQR is less affected by outliers: the 2 values come from the middle half of the data set, so they are unlikely to be extreme scores.
Standard deviation The standard deviation is the average amount of variability in your dataset. It tells you, on average, how far each score lies from the mean. The larger the standard deviation, the more variable the data set is.
The standard deviation of your data is This means that on average, each score deviates from the mean by Samples are used to make statistical inferences about the population that they came from. When you have population data, you can get an exact value for population standard deviation. Since you collect data from every population member, the standard deviation reflects the precise amount of variability in your distribution, the population. But when you use sample data, your sample standard deviation is always used as an estimate of the population standard deviation.
Using n in this formula tends to give you a biased estimate that consistently underestimates variability. Reducing the sample n to n — 1 makes the standard deviation artificially large, giving you a conservative estimate of variability. While this is not an unbiased estimate, it is a less biased estimate of standard deviation: it is better to overestimate rather than underestimate variability in samples.
As you compare prices of various brands, some offer price per roll while others offer price per sheet. You are interested in determining which pricing method has less variability so you sample several of each and calculate the mean and standard deviation for the sampled items that are priced per roll, and the mean and standard deviation for the sampled items that are priced per sheet.
The table below summarizes your results. Comparing the standard deviations the Per Sheet appears to have much less variability in pricing. However, the mean is also much smaller. The coefficient of variation allows us to make a relative comparison of the variability of these two pricing schemes:. Relatively speaking, the variation for Price per Sheet is greater than the variability for Price per Roll. Breadcrumb Home 1 1. Font size. Font family A A. Content Preview Arcu felis bibendum ut tristique et egestas quis: Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris Duis aute irure dolor in reprehenderit in voluptate Excepteur sint occaecat cupidatat non proident.
Lorem ipsum dolor sit amet, consectetur adipisicing elit. Odit molestiae mollitia laudantium assumenda nam eaque, excepturi, soluta, perspiciatis cupiditate sapiente, adipisci quaerat odio voluptates consectetur nulla eveniet iure vitae quibusdam? Excepturi aliquam in iure, repellat, fugiat illum voluptate repellendus blanditiis veritatis ducimus ad ipsa quisquam, commodi vel necessitatibus, harum quos a dignissimos. Close Save changes. Help F1 or?
Range The range is the difference in the maximum and minimum values of a data set.
0コメント