PAGE SIX
A Basic Guide to Statistics
by Jonathan Dolhenty, Ph.D.
Statistical Sampling
A population (or universe) is any
defined aggregate of objects, persons, or events,
where the variables used as the basis for
classification or measurement have been specified.
For instance, we
may want to survey the attitude of adults over the
age of 21 in the state of California towards the
legalization of marijuana. Our population (or
universe) for our survey would be all adults in the
state of California over the age of 21.
Of course, if we tried to survey each and every
adult over the age of 21 in California, our task
would be virtually impossible. How would we allow
for adults over 21 who left the state just before
or after we took our survey? What would we do about
the adults who are moving to the state on a daily
basis? How would we ever complete our survey?
This is precisely where statistical sampling
comes into the picture. A sample is any
subaggregate drawn from the population. It is a
subset of the population. It is through using these
samples, drawn from a defined population, that we
can begin and complete a survey even while the
population remains unstable.
You probably have seen statistics used during a
news broadcast on television. The reporter will
state, for instance, that, "according to a survey
just completed, 78% of Americans indicated they
oppose the legalization of marijuana." Now
obviously, whoever did the survey did not talk to
each and every American citizen in the country.
This is where statistical sampling enters the
picture.
A subgroup of the population, called the sample,
is assumed to approximate the larger group (the
population) on whatever is to be studied, such as
attitudes toward the legalization of marijuana.
Once the sample has been drawn and surveyed, it
ceases to be of any interest, since whatever
attitude is found is assumed to reflect that of the
population, the larger group.
There is, of course, some potential error here,
and that is allowed for by indicating a measure of
error of so-many points. For instance, our reporter
above may say, "these results are accurate within
an error of plus or minus 3 points."
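To give a feel for where a figure like "plus or minus 3 points" comes from, here is a minimal sketch of one common way to approximate the margin of error for a percentage. It assumes a simple random sample, a 95% confidence level (the z value of 1.96), and a population much larger than the sample. The sample size of 1,000 is a made-up figure for illustration, not something reported with the survey quoted above, and the function name is hypothetical.

```python
import math


def margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Approximate margin of error for a sample proportion.

    Uses the normal-approximation formula z * sqrt(p * (1 - p) / n).
    Assumes a simple random sample drawn from a large population.
    """
    if n <= 0:
        raise ValueError("sample size n must be positive")
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must be a proportion between 0 and 1")
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


# Hypothetical survey: 78% opposed, 1,000 respondents (n is an assumption).
moe = margin_of_error(0.78, 1000)
print(f"78% plus or minus {100 * moe:.1f} points")
```

Under these assumptions the margin shrinks as the sample grows, but only with the square root of the sample size, so quadrupling the sample roughly halves the error.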
The important point here is that
whatever is found out about the sample is
generalized to the defined population.
Important Considerations
There are some important considerations to be
aware of in statistical sampling. The first task of
the people involved in any study is to define the
population of the study both in terms of numbers
involved and the distribution of the characteristic
that will be involved in the study. The second task
is to obtain a sample that approximates the
targeted population and here two conditions must be
met:
- (1) there must be equal chance; and
- (2) there must be independence.
Equal chance means that every member in
the defined population must be given an equal
chance of becoming a part of the sample. The sample
must be representative of the defined population.
If certain portions of the population are excluded
from being chosen, the sample becomes biased.
Independence means that the selection of one
individual for a sample must not be dependent on
the selection of another individual. This latter
condition is usually not a problem if equal chance
is strictly adhered to in drawing the sample.
Sample Size
First, it needs to be noted that any sample with
fewer members than the total population will produce
some degree of error. That being said, it follows
that precision and accuracy increase as the size of
the sample approaches the size of the total
population.
The purpose of sampling is to get a
mini-population that is representative of the
larger population. The sample must contain a
distribution of the characteristic under study
approximately the same as it actually exists in the
total population.
Therefore, the size of the sample must be large
enough to make it probable that the extremes in the
population are included. There is a mathematical
formula which statisticians can use to help them
generate a table for determining sample size from a
given population.
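The guide does not say which formula is meant, so the following is only a sketch of one widely used approach: Cochran's formula with a finite population correction. The defaults (a 5-point margin of error, a 95% confidence level, and the most conservative assumed proportion of 0.5) are assumptions chosen for illustration, and the function name is hypothetical.

```python
import math


def required_sample_size(population_size: int,
                         margin: float = 0.05,
                         z: float = 1.96,
                         p: float = 0.5) -> int:
    """Estimate the sample size needed for a given population size.

    n0 = z^2 * p * (1 - p) / margin^2       (large-population estimate)
    n  = n0 / (1 + (n0 - 1) / N)            (finite population correction)
    """
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    if margin <= 0:
        raise ValueError("margin must be positive")
    n0 = (z ** 2) * p * (1.0 - p) / (margin ** 2)
    n = n0 / (1.0 + (n0 - 1.0) / population_size)
    # Round up, and never ask for more members than the population has.
    return min(population_size, math.ceil(n))


# Build a small sample-size table for a few population sizes.
for size in (100, 500, 1_000, 10_000, 100_000):
    print(f"population {size:>7,}: sample of {required_sample_size(size):>4}")
```

A table built this way shows that the required sample grows much more slowly than the population, which is why a survey of a few hundred or a few thousand people can stand in for a very large group.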
Methods of Sampling
Several methods are available to improve the
chances of obtaining a representative sample in
a statistical study. We will briefly describe three
of them here.
Random sampling is the most common and
efficient method of obtaining a representative
sample if the characteristics under study are
assumed to be normally distributed throughout the
population.
The important consideration in random sampling
is that each member of the defined population must
be given an equal chance of being selected for the
sample. There are several ways to accomplish
this.
The least time-consuming is to use a table of
random numbers. This is a randomly generated set of
numbers that has no order or structure. First,
each member of the population is assigned a number.
Then, after the appropriate size of the sample has
been determined, you select those members of the
population whose numbers occur first as you read
down a column or across a row of the table of
random numbers.
If the population size were 1,000, the numbering
would range from 000 to 999, so that every member
has a three-digit number. The actual sample
selection would involve beginning with any three
digits and proceeding through the table in an
orderly fashion. The justification for using a
table of random numbers is that the numbers do not
follow any set pattern. Each number has an equal
chance of being selected.
There are also other ways of obtaining a random
sample, including drawing numbered cards from a
container. But using a table of
random numbers has proven to be efficient and easy
to implement.
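Today a computer's pseudo-random number generator usually does the job of the printed table. The sketch below shows one way to draw a simple random sample with Python's standard library. The member labels, the sample size of 50, and the function name are all hypothetical; any labeling that gives each member a unique identifier would work the same way.

```python
import random


def simple_random_sample(population, sample_size, seed=None):
    """Draw sample_size distinct members, each with an equal chance.

    random.Random.sample selects without replacement, so no member
    can be chosen twice.
    """
    if not 0 <= sample_size <= len(population):
        raise ValueError("sample_size must be between 0 and the population size")
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(population, sample_size)


# Hypothetical population of 1,000 members, each with a unique label.
members = [f"member-{i}" for i in range(1000)]
sample = simple_random_sample(members, 50, seed=2024)
print(len(sample), "members selected; first five:", sample[:5])
```

Passing a seed is optional. It is useful when the selection needs to be reproduced or audited later.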
Stratified random sampling is another
method used to draw a sample. This is especially
appropriate where variables under study may not be
normally distributed in the population. The
procedure here is to divide the population into
natural groupings prior to sampling. It is hoped
that in this way a more representative sample is
obtained.
Let's go back to our legalization of marijuana
example. In an attempt to determine the attitude of
adults over 21 in California towards the
legalization of marijuana, the age of a person
might have something to do with the way he or she
feels about legalization. If this is the case, the
population could be subdivided according to age
groups prior to the sampling.
Many surveys attempting to gauge attitudes about
the performance of the sitting U.S. president break
down the population into, for instance, Democrats,
Republicans, and Independents. Then a random sample
(hopefully!) is drawn from each subgroup.
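The procedure can be sketched in a few lines: group the population by stratum, then draw a random sample from within each group. This sketch assumes proportional allocation, meaning the same fraction is taken from every stratum, which is only one of several possible choices. The age groups, the 10% fraction, and the function name are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, fraction, seed=None):
    """Draw the same fraction at random from each stratum.

    stratum_of is a function that returns the stratum label of a member.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be greater than 0 and at most 1")
    rng = random.Random(seed)

    # Step 1: divide the population into its natural groupings.
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)

    # Step 2: draw a simple random sample within each grouping.
    sample = []
    for label in sorted(strata):
        members = strata[label]
        k = min(len(members), max(1, round(fraction * len(members))))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 900 adults spread evenly over three age groups.
AGE_GROUPS = ("21-39", "40-59", "60+")
people = [{"id": i, "age_group": AGE_GROUPS[i % 3]} for i in range(900)]

chosen = stratified_sample(people, lambda person: person["age_group"], 0.10, seed=7)
print(len(chosen), "people selected")
```

Because every stratum contributes in proportion to its size, no age group can be left out of the sample by chance, which can happen with a plain random draw.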
Cluster sampling involves subdividing the
population into clusters or large blocks of
individuals prior to sampling. A cluster might be
defined as neighborhoods within a city, or rural
regions within a geographical area, or fire
districts within a county.
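The text above describes only the subdividing step. As a rough sketch of what often follows, the example below assumes a one-stage design: a few clusters are chosen at random and every member of each chosen cluster is surveyed. Other designs sample again within the chosen clusters. The neighborhood names, the household labels, and the function name are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Pick n_clusters whole clusters at random (one-stage design).

    clusters maps a cluster name to the list of its members. Every
    member of a chosen cluster is included in the sample.
    """
    if not 0 <= n_clusters <= len(clusters):
        raise ValueError("n_clusters must be between 0 and the number of clusters")
    rng = random.Random(seed)
    chosen_names = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen_names}


# Hypothetical city divided into five neighborhoods of 20 households each.
neighborhoods = {
    name: [f"{name}-household-{i}" for i in range(20)]
    for name in ("Eastside", "Harbor", "Midtown", "Northgate", "Riverside")
}

selected = cluster_sample(neighborhoods, 2, seed=11)
for name, households in selected.items():
    print(name, "->", len(households), "households surveyed")
```

Here the unit of random selection is the cluster rather than the individual, which keeps fieldwork concentrated in a few areas.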