PAGE ONE
A Basic Guide to Statistics
by Jonathan Dolhenty, Ph.D.
At least an elementary knowledge of statistics
is necessary for every critical thinker in today's
world. Statistical information of all sorts is
presented almost daily in the media.
Statistical method is a technique used to
obtain, analyze and present numerical data. The
elements of statistical technique include the
following:
- Collection and assembling of data;
- Classification and condensation of
data;
- Presentation of data in textual form,
tabular form, or graphic form;
- Analysis of data.
The characteristics and limitations of
statistical methods should be kept in mind.
Statistical method is the only means for handling
large masses of numerical data. Statistical
technique applies only to data which are reducible
to quantitative form, that is, the data must be
capable of being expressed as measurements of some
sort. Statistical technique is objective; the
results, however, are inevitably affected by the
necessarily subjective interpretation placed on
them.
Statistical technique is the same for the social
sciences as for the physical sciences. Statistical
technique is commonly used in economics, education,
sociology, psychology, biology, chemistry,
astronomy, and so on. Method and technique apply
alike to these divergent fields.
Variables and Constants
A basic understanding of statistics requires an
understanding of what are called variables and
what are called constants.
A statistical study is based on some observation
of behavior wherein the objects of study change
from time to time and, in the case of human
subjects, from person to person. Traits which are
capable of variation are called variables. Examples
of such traits include intelligence, height,
actions, opinions, political party membership,
church attendance, hair color, and so forth.
Statistical studies are based on the relationships
between variables.
Population is a statistical term that
refers to the entire group of observations under
study. (Sometimes the word universe is used.) Any
study of a population will describe that population
in terms of characteristics that the members have
in common as well as those that vary. Those
characteristics which do not vary from individual
to individual within the population being studied
are called constants.
A distinction may be made between continuous
variables and discrete variables. A
continuous variable may take any value within a
defined range of values. Between any two values of
the variable an indefinitely large number of
in-between values may occur. Examples of continuous
variables include intelligence, height, weight, and
chronological time.
A discrete variable can take specific values
only. For instance, size of family is a discrete
variable. A family may include 2, 3, or 4 children,
but values between these are not possible. A family
cannot have 2 and 1/2 children! Other examples of
discrete variables are the population of a city and
the score of a baseball game. Have you ever heard
of a baseball game with the score of 5 1/2 to 7
1/4?
Some statistical procedures are appropriate for
use with continuous variables, while others are
appropriate only for use with discrete
variables.
A distinction may also be made between variables
which vary in quality and variables which vary in
quantity. Eye color and degree of aggressiveness,
for example, are qualitative variables, while
intelligence and temperature are quantitative
variables.
If a survey of males between the ages of 21 and 60
residing in Oregon were taken to find out their
opinions about income taxes, the constants would
be being male, being between the ages of 21 and
60, and residing in the state of Oregon. The
variable would be their opinion about income
taxes.
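The distinction can be sketched in a few lines of Python. The records below are invented for illustration (the field names and answers are assumptions, not data from any real survey). The function simply checks which traits take a single value across the whole group and which do not.

```python
# Hypothetical survey records; field names and values are invented.
respondents = [
    {"sex": "male", "state": "Oregon", "age_bracket": "21-60", "opinion": "too high"},
    {"sex": "male", "state": "Oregon", "age_bracket": "21-60", "opinion": "about right"},
    {"sex": "male", "state": "Oregon", "age_bracket": "21-60", "opinion": "too high"},
]

def split_constants_and_variables(records):
    """Return (constants, variables): traits with one value vs. several."""
    constants, variables = [], []
    for trait in records[0]:
        distinct_values = {record[trait] for record in records}
        if len(distinct_values) == 1:
            constants.append(trait)   # same for every member of the group
        else:
            variables.append(trait)   # differs from member to member
    return constants, variables

constants, variables = split_constants_and_variables(respondents)
print("constants:", constants)
print("variables:", variables)
```

In this made-up group, sex, state, and age bracket come out as constants and opinion as the only variable.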
Measurement Scales
Measurement is the assignment of numbers
to objects or events according to certain
prescribed rules. It is common practice to use four
kinds of scales to describe the varying levels of
measurement. Each type of scale has a different set
of rules.
Nominal Scales
This is the simplest form of measurement; an
object is simply placed into a category according
to some means of classification. The object or
event either is or is not a member of the category
being considered.
For example, individuals may be classified
according to the color of their eyes. Each
individual is either in or out of a specific
category of eye color. Dogs may be classified
according to the categories of hunting, working,
herding, and so forth. Each dog is in a specific
category and out of any other category.
A nominal variable is a characteristic of
the members of a group defined by an operation
which permits the making of statements only of
equality or difference. We may state that one
member is the same as or different from another
member with respect to the characteristic under
consideration. A specific dog, for instance, is
either the same as or different from some other
dog as to the category of hunting dogs.
Ordinal Scales
Ordinal scales permit the establishment of
orders among categories. While nominal scales show
that things are different, ordinal scales show the
direction of the difference.
Statements about ordinal variables
include not only statements about "same as" and
"different from," but also statements of the kind
"greater than" or "less than."
Thus, individuals can be ranked, for instance,
according to the characteristic of weight. We
cannot, however, make a statement regarding the
amount of difference between the rankings. Four
people may be ranked according to their
cooperativeness, but the rank numbers say nothing
about how much more cooperative one person is than
another: the individual ranked 2 is not twice (or
half) as cooperative as the individual ranked 1.
Again, we can rank individuals according to
height, but individual number 2 is not twice (or
half) as tall as individual number 1!
Nominal and ordinal scales have important uses,
particularly in studies about human behavior. The
statistical techniques which have been developed
for use with these scales are called
nonparametric statistics.
Interval Scales
Interval scales have equal intervals between the
units of measure. For example, the temperature
scale on a thermometer is an interval scale. A
temperature of 60 degrees F. is halfway between 50
degrees F. and 70 degrees F.
It should be noted that interval scales do not
have a true "zero" point. A zero point, however,
may be arbitrarily defined as a convenience, as
as we do with thermometers and calendars and
intelligence scores (it is meaningless to think
about an individual with zero intelligence).
A word of warning about interval scales should
be given here. Let's consider three temperatures:
"A" = 12 degrees, "B" = 24 degrees, and "C" = 36
degrees. It is proper to say that the difference
between temperatures "A" and "B" is equal to the
difference between temperatures "B" and "C." It is
also proper to say that the difference between "A"
and "C" is twice the difference between "A" and "B"
or "B" and "C." But it is NOT proper to say
that "B" has twice the temperature of "A" or that
"C" has three times the temperature of "A." If the
temperature outside today was 70 degrees, and the
temperature yesterday was 35 degrees, we would not
ordinarily say that today it is twice as hot as
yesterday.
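The warning above can be checked with a little arithmetic. The sketch below assumes the three temperatures are in degrees Fahrenheit (as in the 70 and 35 degree example) and converts them to Celsius with the standard formula; the function and variable names are only illustrative. Ratios of differences survive the change of scale, but ratios of the readings themselves do not.

```python
def fahrenheit_to_celsius(deg_f):
    """Standard conversion: C = (F - 32) * 5/9."""
    return (deg_f - 32) * 5 / 9

# Temperatures "A", "B", and "C", assumed here to be in degrees Fahrenheit.
a_f, b_f, c_f = 12, 24, 36
a_c, b_c, c_c = (fahrenheit_to_celsius(t) for t in (a_f, b_f, c_f))

# Ratio of differences: (C - A) / (B - A) is the same on both scales.
diff_ratio_f = (c_f - a_f) / (b_f - a_f)
diff_ratio_c = (c_c - a_c) / (b_c - a_c)

# Ratio of readings: B / A depends on where the scale places its zero.
reading_ratio_f = b_f / a_f
reading_ratio_c = b_c / a_c

print("difference ratio:", diff_ratio_f, round(diff_ratio_c, 6))
print("reading ratio:   ", reading_ratio_f, round(reading_ratio_c, 6))
```

The difference ratio is 2 on either scale. The reading ratio is 2 in Fahrenheit but 0.4 in Celsius, so "B is twice A" describes the choice of zero point rather than the temperatures themselves.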
An interval variable, then, is a
characteristic defined by an operation which
permits the making of statements of equality of
intervals, in addition to statements of sameness or
difference or greater than or less than. Other
examples of interval variables are calendar time
and the scores on intelligence tests.
Ratio Scales
Ratio scales have the same qualities as interval
scales, plus they have the property of an absolute
zero point. Therefore, a ratio variable is a
property defined by an operation which permits the
making of statements of equality of ratios in
addition to all other kinds of statements already
discussed.
This means that one variate measurement may be
said to be double or triple another, and so on. An
absolute zero is always implied.
Variables such as height, weight, and age are
ratio variables and can be expressed on
ratio scales. A person of 50 years of age is twice
the age of a person of 25. A person weighing 300
pounds is three times the weight of a person
weighing only 100 pounds.
A word of warning, however. This cannot be said
of IQ scores! We cannot say that a person with an
IQ of 100 is twice as intelligent as a person with
a score of 50. "Zero intelligence" cannot be
defined!
The essential difference between a ratio and an
interval variable is that for the ratio variable
the measurements are made from a true zero point,
whereas for the interval variable the measurements
are made from an arbitrarily defined zero point.
Statistical procedures used with ratio scales will
usually be the same as those used with interval
scales. These techniques are referred to as
parametric statistics.
Measures of Position
Once the data from a study or survey are
collected, they must be organized. The information
as first collected, still unorganized, is
sometimes referred to as the raw data. Only when
the data are organized can they be analyzed.
We will consider only three ways of classifying
data: frequency distributions, simple ranking, and
percentile ranking. These are the ones most
ordinary people are somewhat acquainted with and
these are commonly used in presenting the results
of school test scores and opinion surveys.
Frequency Distribution
A frequency distribution arranges a
collection of measures in graphic form to indicate
the frequency of occurrence of each value. The
number of times a particular score value occurs is
a frequency.
The procedures for constructing a
frequency distribution are:
1. Make a score column where you list
the raw scores from high to low.
2. Tally the number of times each score
appears in the distribution.
3. Check the number of tallies against
the number of raw scores.
4. Make a frequency column showing the
number of tallies for each score.
(Statisticians usually use the symbol f
to represent frequency.)
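The four steps above can be sketched in Python. The raw scores here are invented for illustration, and the standard library's Counter does the tallying; this is one convenient way to follow the procedure, not the only one.

```python
from collections import Counter

# Hypothetical raw test scores, invented for illustration.
raw_scores = [88, 92, 75, 88, 92, 88, 60, 75, 88, 92]

# Step 2: tally how many times each score appears.
tallies = Counter(raw_scores)

# Step 3: the tallies must add up to the number of raw scores.
assert sum(tallies.values()) == len(raw_scores)

# Steps 1 and 4: score column from high to low, with a frequency (f) column.
frequency_table = sorted(tallies.items(), reverse=True)

print("score   f")
for score, f in frequency_table:
    print(f"{score:>5} {f:>3}")
```

Each row of the printed table pairs a score value with its frequency f, and the f column sums to the number of raw scores.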
Simple Ranking
Using a frequency distribution, additional
analysis may be performed through ranking the data.
Each score is given a position from high to low,
with 1 indicating the highest rank.
Sometimes there will be more than one occurrence
of the same score, that is, two or more scores may
be tied for one position. In such a case, the
positions the tied scores occupy are averaged to
determine their rank.
In the example at the right, only one
score occupies the highest rank and that
is designated "1." Two scores occupy
positions 2 and 3. So we add 2 and 3 and
get 5. This is divided by 2 (averaged) and
we get the rank of 2.5.
Four scores occupy the positions 4, 5,
6, and 7. So we add 4+5+6+7 and get 22. We
divide 22 by 4 (averaged) and we get the
rank of 5.5. And so on.
The rank of the last score should equal
the total number (N) of cases, which
is 40 students, unless the lowest score is
itself tied.
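The averaging rule can be sketched as follows. The eight scores are hypothetical, chosen only so that the ties fall on the same positions as in the description above (one score at position 1, two at positions 2 and 3, four at positions 4 through 7); they are not the 40 student scores of the original example.

```python
def average_ranks(scores):
    """Rank scores from high to low (1 = highest).

    Tied scores share the mean of the positions they occupy.
    """
    ordered = sorted(scores, reverse=True)
    ranks = {}
    position = 1
    while position <= len(ordered):
        score = ordered[position - 1]
        tie_count = ordered.count(score)
        last_position = position + tie_count - 1
        # Mean of consecutive positions equals (first + last) / 2.
        ranks[score] = (position + last_position) / 2
        position = last_position + 1
    return ranks

# Hypothetical scores, invented to mirror the tie pattern described above.
scores = [98, 95, 95, 90, 90, 90, 90, 85]
ranks = average_ranks(scores)
for score, rank in ranks.items():
    print(score, rank)
```

With these scores the two 95s share rank 2.5, the four 90s share rank 5.5, and the lowest score, which is not tied, gets rank 8, the number of cases.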
Percentile Ranking
Parents often see their child's standardized
test score given as a percentile rank. It is
important to understand what this means. A score on
a test, or any other measuring instrument for that
matter, has little meaning unless it is related to
other scores.
The percentile rank of a raw score is the
percentage of scores below the specific score being
considered. The percentile rank is a relative rank,
a rank order score based on a scale of 100. Thus,
the percentile rank is a point on a scale ranging
from 1 to 100.
An example may help. If 75 percent of a group of
students score below a certain student, that
student's percentile rank is 75.
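As a minimal sketch of this definition, the function below counts the scores that fall strictly below a given score and expresses the count as a percentage of the group. The 20 scores are invented for illustration. Some published tests handle tied scores differently (for example, by counting half of the ties), so this follows only the definition given here.

```python
def percentile_rank(scores, score):
    """Percentage of scores in the group that fall strictly below `score`."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)

# Hypothetical group of 20 test scores, invented for illustration.
group = [52, 55, 58, 60, 61, 63, 65, 66, 68, 70,
         71, 73, 74, 76, 78, 80, 83, 85, 90, 95]

rank_of_80 = percentile_rank(group, 80)
print(rank_of_80)
```

Fifteen of the 20 scores are below 80, so in this made-up group a score of 80 has a percentile rank of 75.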
The centile is the point on a raw score
scale which corresponds to a given percentile rank.
An IQ score of 100, for example, is normally the
fiftieth centile. The fiftieth centile is 100. The
percentile rank of an IQ score of 100 is normally
50.
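Going the other way, from a percentile rank back to a raw score, can be sketched as below. The convention used (take the lowest observed score whose percentile rank reaches the target) is an assumption made for simplicity; standardized tests usually publish their own conversion tables and may interpolate between scores. The ten scores are invented for illustration.

```python
def percentile_rank(scores, score):
    """Percentage of scores in the group that fall strictly below `score`."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)

def centile(scores, target_rank):
    """Lowest observed raw score whose percentile rank is at least `target_rank`.

    Returns None when no observed score reaches the requested rank.
    """
    for score in sorted(set(scores)):
        if percentile_rank(scores, score) >= target_rank:
            return score
    return None

# Hypothetical scores, invented for illustration.
group = [70, 80, 85, 90, 95, 100, 105, 110, 115, 120]

fiftieth = centile(group, 50)
print(fiftieth)
```

In this made-up group five of the ten scores fall below 100, so the fiftieth centile is 100 and the percentile rank of 100 is 50.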