POPULATIONS AND SAMPLES
Populations
In statistics the term "population" has a slightly different meaning from the
one given to it in ordinary speech. It need not refer only to people or to
animate creatures, such as the population of Britain or the dog population of
London.
Statisticians also speak of a population of objects, or events, or procedures,
or observations, including such things as the quantity of lead in urine, visits
to the doctor, or surgical operations. A population is thus an aggregate of
creatures, things, cases and so on.
Although a statistician should
clearly define the population he or she is dealing with, they may not be able
to enumerate it exactly. For instance, in ordinary usage the population of
England denotes the number of people within England's boundaries, perhaps as
enumerated at a census. But a physician might embark on a study to try to
answer the question "What is the average systolic blood pressure of
Englishmen aged 40-59?" Who are the "Englishmen" referred to
here? Not all Englishmen live in England, and the social and genetic background
of those that do may vary. A surgeon may study the effects of two alternative
operations for gastric ulcer. But how old are the patients? What sex are they?
How severe is their disease? Where do they live? And so on. The reader needs
precise information on such matters to draw valid inferences from the sample
that was studied to the population being considered. Statistics such as
averages and standard deviations, when taken from populations, are referred to
as population parameters. They are often denoted by Greek letters: the
population mean is denoted by μ (mu) and the population standard deviation by
σ (lowercase sigma).
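As a minimal sketch of what these two parameters mean in practice, the Python snippet below computes them for a tiny, fully enumerated population. The blood pressure values are invented for illustration, and the variable names are assumptions rather than anything taken from the text.

```python
import statistics

# Hypothetical population: systolic blood pressure (mmHg) of every member
# of a very small, fully enumerated group. The values are invented.
population = [118, 126, 131, 140, 122, 135, 129, 144]

# Population mean (mu): the sum of all values divided by the population size N.
mu = statistics.fmean(population)

# Population standard deviation (sigma): pstdev divides by N, because every
# member of the population has been measured.
sigma = statistics.pstdev(population)

print(f"mu = {mu:.3f} mmHg, sigma = {sigma:.3f} mmHg")
```

Note that `statistics.pstdev` is the population form; `statistics.stdev` divides by N - 1 and is meant for a sample drawn from a larger population.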
Samples
A population commonly contains
too many individuals to study conveniently, so an investigation is often
restricted to one or more samples drawn from it. A well chosen sample will
contain most of the information about a particular population parameter, but
the relation between the sample and the population must be such as to allow
valid inferences to be made about the population from that sample.
Consequently, the first important
attribute of a sample is that every individual in the population from which it
is drawn must have a known non-zero chance of being included in it; a natural
suggestion is that these chances should be equal. We would like the choices to
be made independently; in other words, the choice of one subject will not
affect the chance of other subjects being chosen. To ensure this we make the
choice by means of a process in which chance alone operates, such as spinning a
coin or, more usually, using a table of random numbers. A limited table is
given in Table F (Appendix), and more extensive ones have been
published (1-4). A sample so chosen is called a random sample. The word
"random" does not describe the sample as such but the way in which it
is selected.
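Today a pseudo-random number generator usually stands in for the coin or the printed table. The sketch below draws a simple random sample without replacement, so that every individual has the same chance (n/N) of being included. The patient identifiers, the helper name `simple_random_sample` and the seed are all hypothetical, chosen only for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members of population, each equally likely to be chosen."""
    # A dedicated generator keeps the draw reproducible when a seed is given.
    rng = random.Random(seed)
    # random.sample selects without replacement, so no member is picked twice.
    return rng.sample(population, n)

# Hypothetical sampling frame: patient identifiers 1 to 500.
patient_ids = list(range(1, 501))

sample = simple_random_sample(patient_ids, 20, seed=42)
print(sorted(sample))
```

Fixing the seed makes the selection repeatable for audit purposes; leaving it as `None` gives a fresh draw each time.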
To draw a satisfactory sample
sometimes presents greater problems than to analyse statistically the
observations made on it. A full discussion of the topic is beyond the scope of
this book, but guidance is readily available (1, 2). In this book only an
introduction is offered.
WHY A SAMPLE IS USED INSTEAD OF THE ENTIRE POPULATION IN A STUDY
Why not use the entire population to draw our conclusions? This is
a very good question that a smart researcher would ask. But when dollars are
tight, human resources are limited, and time is of the essence, sampling is a
wonderful option. The reason is that for most purposes we can obtain
suitable accuracy quickly and inexpensively from information gained from a
sample.
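A small simulation can illustrate this point. The sketch below assumes a made-up population of 100,000 blood pressure readings (normally distributed, which real data need not be) and compares the mean of a random sample of 400 with the mean of the whole population. All numbers and names are invented for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the illustration is repeatable

# Simulated population of 100,000 systolic blood pressures (mmHg).
# The normal shape, the mean of 130 and the spread of 15 are assumptions.
population = [rng.gauss(130, 15) for _ in range(100_000)]
mu = statistics.fmean(population)

# A simple random sample of 400, i.e. 0.4% of the population.
sample = rng.sample(population, 400)
x_bar = statistics.fmean(sample)

print(f"population mean = {mu:.2f}")
print(f"sample mean     = {x_bar:.2f}")
print(f"difference      = {abs(x_bar - mu):.2f}")
```

In this setting the sample mean usually lands close to the population mean, although measuring the sample takes a small fraction of the effort.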
The bottom line is it would be wasteful and foolish to use the
entire population when a sample, drawn scientifically, provides accuracy in
representing your population of interest. Assessing all individuals may be
impossible, impractical, expensive or even inaccurate.
Here are some reasons why doctoral students should not even try to
use the entire population in their dissertation research.
*We hardly ever know who makes up the entire population.
*It is too costly in terms of human resources and other expenses.
*It is time consuming.
*There is a lot of error to control and monitor.
*Lists are rarely up to date.
References
1. Altman DG. Practical Statistics for Medical Research. London: Chapman & Hall, 1991.
2. Armitage P, Berry G. Statistical Methods in Medical Research. Oxford: Blackwell Scientific Publications, 1994.
3. Campbell MJ, Machin D. Medical Statistics: A Commonsense Approach. 2nd ed. Chichester: John Wiley, 1993.
4. Fisher RA, Yates F. Statistical Tables for Biological, Agricultural and Medical Research. 6th ed. London: Longman, 1974.
5. Strike PW. Measurement and control. Statistical Methods in Laboratory Medicine. Oxford: Butterworth-Heinemann, 1991:255.