Data Description & Population Variance, Median, Mean, Mode

These statistics chapters cover data description, population variance, median, mean, and mode.


Midrange

The mean of the highest and lowest values: (Max + Min) / 2.
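
As a quick illustration of the formula above, here is a minimal Python sketch. The list `values` is made-up example data, not taken from the course.

```python
# Hypothetical example data (an assumption for illustration only).
values = [3, 7, 8, 12, 20]

# Midrange: the mean of the highest and lowest values.
midrange = (max(values) + min(values)) / 2

print(midrange)  # 11.5
```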


Parameter

A characteristic or measure obtained from a population.


Mean

The sum of all the values divided by the number of values. This can be either a population mean (denoted by μ, "mu") or a sample mean (denoted by x̄, "x bar").
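
The calculation is the same for a population or a sample; only the symbol differs. A minimal sketch, where the function name `mean` and the list `sample` are illustrative assumptions:

```python
def mean(values):
    """Sum of all the values divided by the number of values."""
    return sum(values) / len(values)

# Hypothetical example data.
sample = [2, 4, 4, 5, 10]

print(mean(sample))  # 5.0
```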


Median

The midpoint of the data after being ranked (sorted in ascending order). There are as many numbers below the median as above the median. When the number of values is even, the median is the mean of the two middle values.
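
One way to sketch this in Python is shown below. It assumes the usual convention of averaging the two middle values when the count is even; the function name `median` is chosen here for illustration.

```python
def median(values):
    """Midpoint of the ranked data."""
    ranked = sorted(values)          # rank the data first
    n = len(ranked)
    mid = n // 2
    if n % 2 == 1:
        return ranked[mid]           # odd count: the single middle value
    # even count: mean of the two middle values (assumed convention)
    return (ranked[mid - 1] + ranked[mid]) / 2

print(median([5, 1, 3]))     # 3
print(median([4, 1, 3, 2]))  # 2.5
```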


Mode

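The most frequently occurring value. A data set can have more than one mode, and many texts say there is no mode when no value repeats.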
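
A possible sketch using the standard library's `collections.Counter`. The helper name `modes` is an assumption; it returns every value tied for the highest count, so a data set with no repeated value returns all of its values, which some texts would instead describe as having no mode.

```python
from collections import Counter

def modes(values):
    """Return all values that occur most often, in ascending order."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([4, 4, 1]))        # [4]
print(modes([1, 2, 2, 3, 3]))  # [2, 3]
```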


Skewed Distribution

The majority of the values lie together on one side, with a few values (the tail) on the other side. In a positively skewed distribution, the tail is to the right and the mean is typically larger than the median. In a negatively skewed distribution, the tail is to the left and the mean is typically smaller than the median.
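
The effect of a tail on the mean can be seen with two small made-up data sets. This is only an illustration of the usual pattern, using `mean` and `median` from Python's `statistics` module; the lists are hypothetical.

```python
from statistics import mean, median

# Hypothetical data: one large value forms a tail to the right.
right_skewed = [1, 2, 2, 3, 3, 4, 30]
# Mirror image: one very small value forms a tail to the left.
left_skewed = [-30, -4, -3, -3, -2, -2, -1]

# The tail pulls the mean toward it; the median barely moves.
print(mean(right_skewed) > median(right_skewed))  # True
print(mean(left_skewed) < median(left_skewed))    # True
```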


Symmetric Distribution

The data values are evenly distributed on both sides of the mean. In a symmetric distribution, the mean equals the median.


Weighted Mean

The mean when each value is multiplied by its weight and summed. This sum is divided by the total of the weights.
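
A minimal sketch of the definition above. The function name `weighted_mean` and the example scores and weights are assumptions made for illustration.

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the total of the weights."""
    weighted_sum = sum(v * w for v, w in zip(values, weights))
    return weighted_sum / sum(weights)

# Hypothetical scores, with the first one counting double.
scores = [90, 80, 70]
weights = [2, 1, 1]

print(weighted_mean(scores, weights))  # 82.5
```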


Range

The difference between the highest and lowest values: Max – Min.


Statistic

A characteristic or measure obtained from a sample.


Population Variance

The average of the squares of the distances from the population mean. It is the sum of the squares of the deviations from the mean divided by the population size. The units of the variance are the square of the units of the data.
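
The definition can be sketched directly in Python. The function name `population_variance` and the list `data` are illustrative assumptions, and the data are treated as a complete population.

```python
def population_variance(values):
    """Mean of the squared deviations from the population mean."""
    n = len(values)
    mu = sum(values) / n                            # population mean
    return sum((x - mu) ** 2 for x in values) / n   # divide by population size

# Hypothetical population; its mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_variance(data))  # 4.0
```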


Sample Variance

An unbiased estimator of the population variance. Instead of dividing by the population size, the sum of the squares of the deviations from the sample mean is divided by one less than the sample size. The units of the variance are the square of the units of the data.
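
The only change from the population formula is the divisor. A minimal sketch, with the function name `sample_variance` and the list `data` as illustrative assumptions:

```python
def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n                                  # sample mean
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)   # one less than sample size

# Hypothetical sample; the squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(round(sample_variance(data), 3))  # 4.571
```

Python's standard library offers the same calculation as `statistics.variance`.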


Standard Deviation

The square root of the variance. The population standard deviation is the square root of the population variance, and the sample standard deviation is the square root of the sample variance. The sample standard deviation is not an unbiased estimator of the population standard deviation. The units of the standard deviation are the same as the units of the data.
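
Both versions are available in Python's `statistics` module as `pstdev` (population) and `stdev` (sample). The sketch below uses a made-up list `data` to show that they differ only in the divisor under the square root.

```python
import math
from statistics import pstdev, stdev

# Hypothetical data; mean 5, squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

population_sd = pstdev(data)  # sqrt(32 / 8)
sample_sd = stdev(data)       # sqrt(32 / 7)

print(round(population_sd, 3))  # 2.0
print(round(sample_sd, 3))      # 2.138
```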


Coefficient of Variation

Standard deviation divided by the mean, expressed as a percentage. We won’t work with the Coefficient of Variation in this course.


Chebyshev’s Theorem

The proportion of the values that fall within k standard deviations of the mean is at least 1 – 1/k^2, where k > 1. Chebyshev’s theorem can be applied to any distribution regardless of its shape.
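
To get a feel for the bound, it can be evaluated for a few values of k. The function name `chebyshev_lower_bound` is an assumption made for this sketch.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k**2

print(chebyshev_lower_bound(2))            # 0.75
print(round(chebyshev_lower_bound(3), 4))  # 0.8889
```

So at least 75% of the values in any distribution lie within 2 standard deviations of the mean, and at least about 88.9% lie within 3.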


Empirical or Normal Rule

Only valid when a distribution is bell-shaped (normal). Approximately 68% of the values lie within 1 standard deviation of the mean, 95% within 2 standard deviations, and 99.7% within 3 standard deviations.
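
These percentages can be checked against the standard normal curve. The sketch below uses `statistics.NormalDist` (available in Python 3.8 and later); the helper name `proportion_within` is an assumption.

```python
from statistics import NormalDist

def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    # Prints values close to 68%, 95%, and 99.7%.
    print(k, f"{proportion_within(k):.1%}")
```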


Standard Score or Z-Score

The value obtained by subtracting the mean and dividing by the standard deviation. When all values are transformed to their standard scores, the new mean (for Z) will be zero and the standard deviation will be one.
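
A minimal sketch of the transformation, treating the made-up list `data` as a population so that the population standard deviation (`pstdev`) is used. The function name `z_scores` is an assumption.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Subtract the mean from each value and divide by the standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: values form a whole population
    return [(x - mu) / sigma for x in values]

# Hypothetical data with mean 5 and standard deviation 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
z = z_scores(data)

print(z)  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

The transformed values have a mean of zero and a standard deviation of one, as the definition states.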


Percentile

The percent of the population which lies below that value. The data must be ranked to find percentiles.
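
Read literally, the definition gives the percentile of a value as the percent of the data strictly below it. The sketch below follows that reading; other conventions exist (for example, counting half of any tied values), and the function name `percent_below` and the list `scores` are assumptions.

```python
def percent_below(values, x):
    """Percent of the values that lie strictly below x."""
    ranked = sorted(values)  # ranking is not required for counting, but mirrors the definition
    below = sum(1 for v in ranked if v < x)
    return 100 * below / len(ranked)

# Hypothetical data.
scores = [10, 20, 30, 40, 50]

print(percent_below(scores, 40))  # 60.0
```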


Quartile

Either the 25th, 50th, or 75th percentiles. The 50th percentile is also called the median.


Decile

Either the 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th, or 90th percentiles.


Lower Hinge

The median of the lower half of the numbers (up to and including the median). The lower hinge is the first Quartile unless the remainder when dividing the sample size by four is 3.


Upper Hinge

The median of the upper half of the numbers (including the median). The upper hinge is the 3rd Quartile unless the remainder when dividing the sample size by four is 3.
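
The lower and upper hinges can be sketched together. This assumes that when the number of values is odd, the median belongs to both halves, and that when it is even, the data split cleanly in two. The function name `hinges` is an assumption.

```python
from statistics import median

def hinges(values):
    """Return (lower hinge, upper hinge) of the data."""
    ranked = sorted(values)
    n = len(ranked)
    half = (n + 1) // 2         # size of each half; includes the median when n is odd
    lower = ranked[:half]       # values up to and including the median
    upper = ranked[n - half:]   # values from the median upward
    return median(lower), median(upper)

print(hinges([1, 2, 3, 4, 5, 6, 7, 8, 9]))  # (3, 7)
print(hinges([1, 2, 3, 4, 5, 6, 7, 8]))     # (2.5, 6.5)
```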


Box and Whiskers Plot (Box Plot)

A graphical representation of the minimum value, lower hinge, median, upper hinge, and maximum. Some textbooks, and the TI-82 calculator, define the five values as the minimum, first Quartile, median, third Quartile, and maximum.


Five Number Summary

Minimum value, lower hinge, median, upper hinge, and maximum.


InterQuartile Range (IQR)

The difference between the 3rd and 1st Quartiles.


Outlier

An extremely high or low value when compared to the rest of the values.


Mild Outliers

Values which lie between 1.5 and 3.0 times the InterQuartile Range below the 1st Quartile or above the 3rd Quartile. Note, some texts use hinges instead of Quartiles.


Extreme Outliers

Values which lie more than 3.0 times the InterQuartile Range below the 1st Quartile or above the 3rd Quartile. Note, some texts use hinges instead of Quartiles.
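
The mild and extreme outlier rules can be sketched as a single classification. To stay neutral on how quartiles (or hinges) are computed, the hypothetical function `classify` takes the 1st and 3rd quartiles as inputs. How to treat a value lying exactly on a boundary is an assumption here.

```python
def classify(x, q1, q3):
    """Label x as 'extreme', 'mild', or 'not an outlier' using the IQR."""
    iqr = q3 - q1
    # More than 3.0 * IQR beyond a quartile: extreme outlier.
    if x < q1 - 3.0 * iqr or x > q3 + 3.0 * iqr:
        return "extreme"
    # More than 1.5 * IQR (but not more than 3.0 * IQR) beyond a quartile: mild.
    # Assumption: a value exactly 1.5 * IQR away is not counted as an outlier.
    if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr:
        return "mild"
    return "not an outlier"

# Hypothetical quartiles: Q1 = 10, Q3 = 20, so IQR = 10.
print(classify(15, 10, 20))  # not an outlier
print(classify(40, 10, 20))  # mild
print(classify(60, 10, 20))  # extreme
```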

