Sampling Error Exercise

Introduction. By now you have been introduced to the statistical definitions of the terms “population” and “sample”. It is also likely that you have encountered the concept of “sampling error”. This concept is difficult to understand unless you actually have some experience dealing with it. Consequently, we developed this exercise to help you learn about sampling error; in particular we want you to learn the general relationship between sample size and sampling error.

Procedure.

Work in three groups. Each group will be given two beakers of wooden buttons. Each button is colored on its flat side and has a number written on its rounded side. The colors help keep the two sets of buttons from being mixed up. Assume that each button represents a person, and that the number on the button is that person's height in cm. Each beaker contains a population of people. You will take several samples from these populations.

Put the results of your sampling in the table that your instructor has put on the board. The table on the board should appear as follows:

                    Mean height +/- SEM (cm)

Sample size         Population 1      Population 2

     1

     5

    10

    20

    30

Your first samples should consist of only one individual from each population (beaker). Obviously, you don’t need to calculate a mean here; simply write the height of the one individual in the appropriate mean height column. Before you proceed to the next bout of sampling, look over the samples taken by the entire class. Based on these samples, can you draw any conclusion about the heights of people in population 1 versus those in population 2? Are the people in one population taller?

Next, take a sample of five individuals from each population. Calculate the mean height and the standard error of the mean of each sample, and put these on the board. Once again, look at the class results. What can you conclude?
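
If you want to check your hand calculation, the mean and standard error of the mean can be sketched in a few lines of Python. This is only an illustration: the function name `mean_and_sem` and the five heights are made up for the example and are not class data. It assumes the usual definition SEM = s / sqrt(n), where s is the sample standard deviation (n - 1 in the denominator).

```python
import math
import statistics


def mean_and_sem(heights):
    """Return (mean, SEM) for a list of heights in cm."""
    n = len(heights)
    mean = statistics.mean(heights)
    # Sample standard deviation: divides by n - 1, so it needs n >= 2.
    sd = statistics.stdev(heights)
    # Standard error of the mean: sample SD divided by sqrt(n).
    sem = sd / math.sqrt(n)
    return mean, sem


# Hypothetical sample of five button heights (cm); not real class data.
sample = [160, 165, 170, 175, 180]
mean, sem = mean_and_sem(sample)
print(f"{mean:.1f} +/- {sem:.1f} cm")  # prints "170.0 +/- 3.5 cm"
```

Note that the SEM cannot be computed for a sample of one individual, which is why only the single height goes on the board for the first bout of sampling.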

Repeat this sampling and class comparison procedure for samples of 10 and 20 individuals. This time, enter your data on a spreadsheet using the statistical program called Statview. This graphics and statistics software has been installed on the INTSCI computers. Your instructor will discuss the use of Statview. Using Statview, construct frequency distributions (histograms) of your data for your two samples. Then use Statview to calculate the descriptive statistics (mean and standard deviation) of these samples. Put these on the board.
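
For reference, a frequency distribution is just a count of how many heights fall in each interval (bin). The sketch below shows the idea in Python rather than Statview. The function name `frequency_distribution`, the 5 cm bin width, and the ten heights are all assumptions chosen for illustration, and the bin labels assume whole-number heights.

```python
from collections import Counter


def frequency_distribution(heights, bin_width=5):
    """Count how many heights fall in each bin of width bin_width (cm)."""
    # Each height is assigned to the lower edge of its bin, e.g. 163 -> 160.
    counts = Counter((h // bin_width) * bin_width for h in heights)
    return dict(sorted(counts.items()))


# Hypothetical sample of ten heights (cm); not real class data.
sample = [158, 162, 163, 167, 168, 169, 171, 172, 176, 181]
bin_width = 5
for lower, count in frequency_distribution(sample, bin_width).items():
    # One asterisk per individual gives a simple text histogram.
    print(f"{lower}-{lower + bin_width - 1} cm | {'*' * count}")
```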

Finally, enter the data for all 30 individuals in each population on a spreadsheet. Construct histograms that show the height frequency distributions for both populations. Also calculate the mean and standard deviation for each population. As after each earlier bout of sampling, discuss the class results.
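
Once you have the full population entered, you can repeat the sampling on a computer many more times than is practical by hand. The sketch below is one possible way to do this in Python. It assumes a made-up population of 30 heights (replace it with the numbers from your beaker), draws each sample without replacement as you did with the buttons, and measures how much the sample means vary at each sample size. The names `spread_of_sample_means` and `draws`, the seed, and the height range are all hypothetical choices.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical population of 30 heights (cm); substitute your beaker's numbers.
population = [rng.randint(150, 190) for _ in range(30)]
pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)  # population SD: divides by N, not N - 1


def spread_of_sample_means(population, n, draws=1000):
    """Draw many samples of size n without replacement and return the
    standard deviation of their means (how far sample means stray)."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(draws)]
    return statistics.pstdev(means)


spreads = {n: spread_of_sample_means(population, n) for n in (1, 5, 10, 20, 30)}

print(f"population mean = {pop_mean:.1f} cm, SD = {pop_sd:.1f} cm")
for n, spread in spreads.items():
    print(f"sample size {n:2d}: spread of sample means = {spread:.2f} cm")
```

Compare the printed spreads with the pattern in the class results on the board before you answer the questions below.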

Conclusion. What is sampling error? Does the existence of sampling error suggest that whoever did the sampling did something wrong? Does sample size influence sampling error? If so, what is the general relationship?