PAGE TWO
A Basic Guide to Statistics
by Jonathan Dolhenty, Ph.D.
Measures of Central Tendency
The average is a measure of central
location. Measures of average allow us to
refine our analysis of group data in order to
describe the group performance as a whole.
The word "average" is commonly used to refer to
a value obtained by adding together a set of
measurements and then dividing by the number of
measurements in the set. You have probably done
this yourself and referred to it as the "mean." But
this is only one type of average as you shall
see.
The Arithmetic Mean
This is the most common measure of central
tendency and, as noted above, the one you have
probably calculated yourself. The formula is
simple: add up all the observations and divide
the sum by the number of observations.
Although this is the most common measure of
central tendency, there is a problem connected with
it. It is subject to the influence of extreme
observations.
If we add 2, 4, 5, 6, 7, and 6, we get
the sum of 30. Divide 30 by the number of
observations, which is 6, and we get 5.
This number is the arithmetic mean for the
series of numbers. When we look at the
group of numbers as a whole, the average
of 5 looks pretty good.
But let's build in an extreme number in
our set of numbers. If we add 2, 4, 5, 6,
7, and 42, we get the sum of 66. Divide 66
by the number of observations, which is 6,
and we get 11. The "extreme" number, 42,
has bumped the average up to 11. This
tendency of extreme numbers to influence
the average in one direction or another
can generate problems with certain
statistical analyses.
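The mathematical formula for calculating the
mean is $\bar{X} = \frac{\sum X}{N}$, where $\sum X$ is
the sum of all the observations and $N$ is the
number of observations.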
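The two calculations above can be checked with a short Python sketch. The function name `arithmetic_mean` and the variable names are assumptions chosen for illustration; the numbers are the two series used in the text.

```python
def arithmetic_mean(observations):
    """Return the sum of the observations divided by how many there are."""
    if not observations:
        raise ValueError("the mean of an empty set is undefined")
    return sum(observations) / len(observations)


# The two series from the text: one ordinary, one with an extreme value.
typical = [2, 4, 5, 6, 7, 6]
with_extreme = [2, 4, 5, 6, 7, 42]

print(arithmetic_mean(typical))       # 5.0
print(arithmetic_mean(with_extreme))  # 11.0
```

Changing a single observation from 6 to 42 is enough to move the mean from 5 to 11.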
The Median
The median is also an average. It is that
point in a distribution where half of the
observations fall above it and half of the
observations fall below it.
Consider this series of numbers: 2, 10, 16, 20,
and 28. The median is the middle point, which is
16. There are two observations above it and two
observations below it. This, of course, is easy to
figure when you have a small group of numbers and
an odd number of observations.
Consider this series of numbers: 2, 10, 16, 18,
20, and 28. Here we have an even number of
observations. What is the median? To determine
this, find the two middle observations and insert a
mid-point. The mid-point in this case would be
between 16 and 18, which are the two middle
observations. The median, therefore, would be
17.
Now consider this series of numbers: 2,
10, 16, 19, 20, and 28. We still have an
even number of observations. The two
middle observations are 16 and 19. What is
the median? The mid-point between these
two would be 17.5 or 17 1/2, which is the
median of the set of numbers.
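The procedure for odd and even numbers of observations can be sketched in Python as follows. The function name `median` is an assumption made for illustration, and the sketch assumes the observations are plain numbers; the three series are the ones used above.

```python
def median(observations):
    """Return the middle value, or the mid-point of the two middle values."""
    if not observations:
        raise ValueError("the median of an empty set is undefined")
    ordered = sorted(observations)  # the median is defined on ordered data
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of observations: a single middle value exists.
        return ordered[middle]
    # Even number of observations: take the mid-point of the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([2, 10, 16, 20, 28]))      # 16
print(median([2, 10, 16, 18, 20, 28]))  # 17.0
print(median([2, 10, 16, 19, 20, 28]))  # 17.5
```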
This works well with small groups of
observations. What do you do with a large
number of observations? If the
distribution of observations has been
organized in a frequency distribution,
there is an easy way to determine the
median. See the box.
The median is useful for describing
certain kinds of data because it is not
affected by extreme scores.
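To see this resistance to extreme scores, we can compare the mean and the median on the two series used earlier for the arithmetic mean. This is only an illustrative sketch; it relies on the `mean` and `median` functions from Python's standard `statistics` module, and the variable names are assumptions.

```python
from statistics import mean, median

# The two series from the arithmetic mean example.
typical = [2, 4, 5, 6, 7, 6]
with_extreme = [2, 4, 5, 6, 7, 42]

for label, data in (("typical", typical), ("with extreme", with_extreme)):
    # Print both averages side by side for each series.
    print(label, "mean:", mean(data), "median:", median(data))
```

The mean moves from 5 to 11 when the extreme value is introduced, while the median is 5.5 for both series.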
The Mode
The mode refers to the observation
occurring most often in a distribution. It is
determined by simple observation.
Consider these observations: 11, 11, 12, 12, 12,
13, 13, 13, 13, 13, 14, 14, 14, 15, 15, 15, 16, 16,
17, 17, and 18. Which value occurs most frequently?
The value 13 occurs five times, more than any other
value, and, therefore, 13 is the mode.
It is possible that a distribution may have more
than one mode. A distribution with two modes is called
bimodal. For instance, if we added two more 12s to
the distribution above, it would have 12 appearing
five times and 13 appearing five times. Both 12 and
13 would be modes and the distribution would be
bimodal.
It is also possible that no mode can be
calculated in a distribution. In a situation where
all values occur with equal frequency, no modal
value can be calculated. Consider this
distribution: 2, 7, 16, 19, 20, 25, and 27. Which
value occurs more frequently than any other value?
Of course, none does. In this case, no mode can be
calculated. The same thing occurs with this
distribution: 2, 2, 2, 5, 5, 5, 7, 7, 7, 12, 12,
and 12. No one value occurs more frequently than
the others. No mode can be calculated.
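The three cases above (one mode, two modes, and no mode) can be sketched in Python. The function name `modes` is an assumption made for illustration. The sketch follows the convention used in this guide: when several distinct values all occur equally often, it reports no mode and returns an empty list.

```python
from collections import Counter


def modes(observations):
    """Return the most frequent value(s), or [] when all values tie."""
    counts = Counter(observations)
    if not counts:
        return []
    highest = max(counts.values())
    # Several distinct values, all equally frequent: no mode by this convention.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(value for value, c in counts.items() if c == highest)


scores = [11, 11, 12, 12, 12, 13, 13, 13, 13, 13, 14, 14, 14,
          15, 15, 15, 16, 16, 17, 17, 18]

print(modes(scores))                                        # [13]
print(modes(scores + [12, 12]))                             # [12, 13]
print(modes([2, 7, 16, 19, 20, 25, 27]))                    # []
print(modes([2, 2, 2, 5, 5, 5, 7, 7, 7, 12, 12, 12]))       # []
```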
The mode is a statistic of limited practical
value. It has little meaning unless the number of
measurements under consideration is fairly
large.
Measures of Variability
The measures of central tendency have some
limitations with regard to describing group
performance. We need to know how the data are
distributed in order to get a more accurate
picture. We need to know how compact or how
scattered the data are from a specified point.
Measures of variability provide a numerical
index that measures the amount of spread
(also called dispersion) of a set of data.
This makes it possible to judge the amount of
"sameness" or "dissimilarity" of the observations.
When we need to compare one set of data with
another, we need to know the amount and nature of
variability contained in each set of data.
The Range
The simplest and most rudimentary measure of
variability is the range. This is the difference
between the lowest and the highest observation in
the distribution. The purpose here is to provide a
numerical value indicating the overall spread (or
dispersion) of a set of observations.
Consider this set of measurements: 10, 12, 15,
18, and 20. The lowest measurement is 10 and the
highest is 20. The range is 20 minus 10, or 10.
The range, however, has two disadvantages.
First, for large sets of observations it is an
unstable descriptive measure. Second, the range is
not independent of the size of the set of
observations. But the range can be effectively used
with small numbers of observations.
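As a minimal sketch, the range can be computed in Python as shown below. The function name `value_range` is an assumption, and the extra observation of 60 in the second call is a hypothetical value added only to show that the range depends entirely on the two most extreme observations.

```python
def value_range(observations):
    """Return the highest observation minus the lowest observation."""
    if not observations:
        raise ValueError("the range of an empty set is undefined")
    return max(observations) - min(observations)


measurements = [10, 12, 15, 18, 20]   # the set used in the text

print(value_range(measurements))         # 10
print(value_range(measurements + [60]))  # 50 (60 is a hypothetical extra value)
```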
The Deviation Score
This is also a simple and useful measure of
variability. The deviation score provides a means
for determining the distance of an individual
observation from the mean.
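The formula for determining the deviation score
is: deviation score $= X - \bar{X}$, where $X$ is an
individual observation and $\bar{X}$ is the mean of
the set of observations.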
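A short Python sketch shows the deviation score for each observation in a set. The function name `deviation_scores` is an assumption; the data are the measurements used in the range example, whose mean is 15.

```python
def deviation_scores(observations):
    """Return each observation's distance from the mean (observation - mean)."""
    if not observations:
        raise ValueError("deviation scores need at least one observation")
    mean = sum(observations) / len(observations)
    return [x - mean for x in observations]


print(deviation_scores([10, 12, 15, 18, 20]))  # [-5.0, -3.0, 0.0, 3.0, 5.0]
```

A negative score means the observation lies below the mean and a positive score means it lies above. For this set the deviation scores add up to zero.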
The Standard Deviation
The standard deviation is the most precise
measure of variability to be included here. It
takes into account the variability of all the
observations in a distribution.
For those of you who want to learn more about
the advantages obtained by using the standard
deviation and want to know how to calculate the
standard deviation, it is suggested that you
consult an introductory text in statistics.
The big advantage of the standard deviation is
that, once it has been calculated for two sets of
data, we can interpret the means of the two sets
more accurately, since we have a common basis for
comparison.
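As a rough illustration of this kind of comparison, the sketch below computes a standard deviation for two sets with the same mean. Several things here are assumptions rather than part of this guide: the function name `population_std_dev`, the choice of the population form (dividing by the number of observations; many texts divide by one less for sample data), and the second data set, which is hypothetical.

```python
import math


def population_std_dev(observations):
    """Square root of the average squared deviation from the mean."""
    if not observations:
        raise ValueError("the standard deviation of an empty set is undefined")
    n = len(observations)
    mean = sum(observations) / n
    # Assumption: population form, dividing by n rather than n - 1.
    variance = sum((x - mean) ** 2 for x in observations) / n
    return math.sqrt(variance)


spread_out = [10, 12, 15, 18, 20]  # the set from the range example, mean 15
compact = [14, 15, 15, 15, 16]     # hypothetical set, also with mean 15

print(round(population_std_dev(spread_out), 2))  # 3.69
print(round(population_std_dev(compact), 2))     # 0.63
```

Both sets have a mean of 15, but the larger standard deviation of the first set shows that its observations are scattered much more widely around that mean.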