# Sampling error

Sampling error is the statistical error to which an analyst exposes a model simply by working with sample data rather than population or census data. Using sample data carries the risk that the results of an analysis do not match the results that would be obtained from data on the entire population from which the sample was drawn.

A sample is often used in place of the entire population for practical or financial reasons. Some difference between sample results and population results is likely, but with a properly drawn sample of adequate size the difference is not expected to be substantial.

## SAMPLING ERROR IN RESEARCH: DEFINITION

Sampling error is the deviation of the selected sample from the true characteristics, traits, behaviours, qualities or figures of the entire population.

**by Joan Joseph Castillo (2009)**

Sampling error arises from estimating a population characteristic by looking at only one portion of the population rather than the entire population. It refers to the difference between the estimate derived from a sample survey and the ‘true’ value that would result if a census of the whole population were taken under the same conditions. There is no sampling error in a census because the calculations are based on the entire population.
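The definition above can be made concrete with a small simulation. The following is only a sketch: the population of 10,000 normally distributed "scores", the sample size of 100, and all variable names are assumptions chosen for illustration, not data from any real survey.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 made-up scores (an assumption for illustration).
population = [rng.gauss(50, 10) for _ in range(10_000)]
population_mean = statistics.fmean(population)  # the value a census would give

# Draw a simple random sample without replacement and estimate the mean from it.
sample = rng.sample(population, 100)
sample_mean = statistics.fmean(sample)

# Sampling error: sample estimate minus the "true" population value.
sampling_error = sample_mean - population_mean
print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
print(f"sampling error:  {sampling_error:+.2f}")

# A "sample" that contains every unit is a census, so its error is zero.
census = rng.sample(population, len(population))
census_error = statistics.fmean(census) - population_mean
print(f"census error:    {census_error:+.2f}")
```

In practice the population mean is unknown, so the sampling error itself cannot be computed this way; the simulation only works because the whole population is available here.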

**In nursing research**, a sampling error is the difference between a sample statistic used to estimate a population parameter and the actual but unknown value of the parameter (**Burns & Grove, 2009**).

**WHY DOES THIS ERROR OCCUR?**

Sampling process error occurs because subjects have individual differences, so different sets of subjects drawn from the same population give different results. Keep in mind that a sample is only a subset of the entire population; therefore, there may be a difference between the sample and the population.

The most frequent cause of this error is a biased sampling procedure. Every researcher must seek to establish a sample that is free from bias and representative of the entire population. Doing so minimizes sampling error, although it cannot eliminate it entirely.

Another possible cause of this error is chance. Randomization and probability sampling are used to minimize sampling process error, but it is still possible that the randomly selected subjects are not representative of the population.

When the sampling procedure is biased, the most common result is systematic error, wherein the results from the sample differ significantly from the results from the entire population. It stands to reason that if the sample is not representative of the entire population, its results will most likely differ from the results of the entire population.

### SAMPLE SIZE AND SAMPLING ERROR

Given two otherwise identical studies, with the same sampling methods and the same population, the study with the larger sample size will have less sampling process error than the study with the smaller sample size. Keep in mind that as the sample size increases, the sample covers more of the population and so reflects its characteristics more closely, thus decreasing sampling process error.
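The effect of sample size can be checked by repeating the same "study" many times at different sample sizes. The sketch below assumes a made-up population of 20,000 uniformly distributed values and simple random sampling; the function name `mean_abs_sampling_error` and the number of repeats are illustrative choices.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population (an assumption for illustration).
population = [rng.uniform(0, 100) for _ in range(20_000)]
true_mean = statistics.fmean(population)

def mean_abs_sampling_error(n, repeats=500):
    """Average |sample mean - population mean| over repeated samples of size n."""
    errors = []
    for _ in range(repeats):
        sample = rng.sample(population, n)  # simple random sample, no replacement
        errors.append(abs(statistics.fmean(sample) - true_mean))
    return statistics.fmean(errors)

results = {n: mean_abs_sampling_error(n) for n in (10, 100, 1000)}
for n, err in results.items():
    print(f"n={n:>5}: typical sampling error of the mean = {err:.2f}")
```

Under these assumptions the typical error shrinks as the sample grows, but more slowly than the sample size itself: a tenfold increase in sample size cuts the error by a factor of roughly three (about the square root of ten).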

### STANDARD DEVIATION AND SAMPLING ERROR

Standard deviation is used to express the variability of the population. More technically, it measures the typical distance of the subjects' actual scores from the mean of all the scores (the square root of the average squared deviation). Therefore, for a given sample size, the higher the standard deviation, the higher the sampling error.

This is easier to understand if you relate standard deviation to sample size. The standard deviation of the individual scores does not shrink as the sample grows, but the standard deviation of the sample mean, known as the standard error, does: it is approximately the standard deviation divided by the square root of the sample size.

Imagine having only 10 subjects. With so small a sample, the mean can vary greatly from one sample to the next, giving a high standard error. Then imagine increasing the sample size to 100: the means of repeated samples cluster closely around the population mean, giving a low standard error.
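One way to see how spread relates to sample size is to simulate it. The sketch below reports two different quantities for samples of 10 and of 100 subjects: the standard deviation of the scores within a sample, and the standard deviation of the sample means across repeated samples (usually called the standard error). The population of 50,000 normally distributed scores and the helper name `spread_summary` are assumptions made for illustration.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population of scores (an assumption for illustration).
population = [rng.gauss(100, 15) for _ in range(50_000)]

def spread_summary(n, repeats=400):
    """Return (average within-sample SD, SD of the sample means) for samples of size n."""
    sample_sds = []
    sample_means = []
    for _ in range(repeats):
        sample = rng.sample(population, n)
        sample_sds.append(statistics.stdev(sample))
        sample_means.append(statistics.fmean(sample))
    return statistics.fmean(sample_sds), statistics.stdev(sample_means)

sd_10, se_10 = spread_summary(10)
sd_100, se_100 = spread_summary(100)
print(f"n=10:  within-sample SD = {sd_10:.1f}, SD of sample means = {se_10:.1f}")
print(f"n=100: within-sample SD = {sd_100:.1f}, SD of sample means = {se_100:.1f}")
```

Under these assumptions the within-sample standard deviation stays close to the population value at both sizes, while the spread of the sample means drops sharply at the larger size. It is the second quantity that tracks sampling error.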

**CHARACTERISTICS**

Sampling error

- generally decreases as the sample size increases (but not proportionally)
- depends on the size of the population under study (mainly when the population is small)
- depends on the variability of the characteristic of interest in the population
- can be accounted for and reduced by an appropriate sampling plan
- can be measured and controlled in probability sample surveys.

**Sample size**

As a general rule, the more people being surveyed (sample size), the smaller the sampling error will be. Many people are surprised by the small size of well-known surveys. For example, polls that try to predict voting patterns are taken from sample sizes ranging from 1,000 to 2,000 people, with samples of about 1,000 people being the most common. Ratings for television programs are estimated from approximately 2,000 viewers. This small sample represents the television preferences of a total population of 12 million Canadian households! Despite a widely held perception that such polls are reliable, some statisticians question their accuracy because of the small sample size.

If one of the survey objectives is to look at sub-populations or measure rare events, then a larger sample will be needed. However, it is important to note that increasing the sample size also means increasing costs.
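To put the poll sizes mentioned above in perspective, the usual textbook approximation for the margin of error of a proportion can be computed directly. This is a sketch that assumes a simple random sample, a roughly 95% confidence level (`z=1.96`), and the worst-case proportion `p=0.5`; the function name `margin_of_error` is illustrative, and real polls with complex designs will generally have somewhat different margins.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample.

    p=0.5 is the worst case; z=1.96 corresponds to roughly 95% confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)

for n in (100, 1000, 2000):
    print(f"n={n:>4}: +/- {margin_of_error(n) * 100:.1f} percentage points")
# n= 100: +/- 9.8 percentage points
# n=1000: +/- 3.1 percentage points
# n=2000: +/- 2.2 percentage points
```

Doubling the sample from 1,000 to 2,000 narrows the margin by less than one percentage point, which is one reason samples of about 1,000 are so common: beyond that size, each extra respondent buys little additional precision.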

**Population size**

Except for very small populations where the relationship is more direct, the size of a sample does not increase in proportion to the size of the population. In fact, the population size plays an almost non-existent role as far as large populations are concerned.
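The limited role of population size can be illustrated with the finite population correction used for simple random sampling without replacement. The sketch below is an illustration under assumptions: the standard deviation of 10, the sample size of 1,000, the population sizes, and the function name `standard_error_with_fpc` are all made up.

```python
import math

def standard_error_with_fpc(sigma, n, N):
    """Standard error of a sample mean for a simple random sample of n units
    drawn without replacement from a population of N units."""
    fpc = math.sqrt((N - n) / (N - 1))  # finite population correction
    return sigma / math.sqrt(n) * fpc

sigma = 10.0  # hypothetical population standard deviation
n = 1000      # fixed sample size
for N in (2_000, 100_000, 12_000_000):
    se = standard_error_with_fpc(sigma, n, N)
    print(f"N={N:>10,}: standard error = {se:.4f}")
```

With these numbers, the correction matters when the sample is half of the population (N = 2,000), but the standard error is almost identical for populations of 100,000 and 12 million.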

**Variability of the characteristic of interest**

In general, the greater the difference between the population units, the larger the sample size required to achieve a specific level of reliability. For example, if you were to conduct a survey on work environments for a population where the income varies from $30,000 to $50,000, you would use a smaller sample size to achieve the same level of reliability than you would use for a population of equal size for which income varies from $5,000 to $1,000,000.
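The link between variability and required sample size can be sketched with the standard formula for estimating a mean to within a chosen margin, n = (z × σ / E)², rounded up. This assumes simple random sampling from a large population. The standard deviations below ($5,000 for the narrow income range and $100,000 for the wide one) and the $1,000 margin are hypothetical values picked for illustration, not figures derived from the example above.

```python
import math

def required_sample_size(sigma, margin, z=1.96):
    """Sample size needed to estimate a mean to within +/- margin
    at roughly 95% confidence (z=1.96), assuming simple random sampling."""
    return math.ceil((z * sigma / margin) ** 2)

# Hypothetical standard deviations for the two populations (assumptions).
narrow = required_sample_size(sigma=5_000, margin=1_000)
wide = required_sample_size(sigma=100_000, margin=1_000)
print(f"low-variability population:  n = {narrow}")
print(f"high-variability population: n = {wide}")
```

Because the standard deviation enters the formula squared, a population that is 20 times more variable needs about 400 times as many respondents for the same margin.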

**Sampling plan**

It is important to develop an efficient sampling plan, which includes a sample design and an estimation procedure. The method of sampling, called “sample design”, can greatly affect the size of the sampling error. Many surveys involve a complex sample design that often leads to more sampling error than a simple random sample design. The estimation procedure also has a major impact on the sampling error.

**Measuring sampling errors**

There are methods that estimate sampling error for probability sample surveys. The sampling variance is the most commonly used measure to quantify sampling error, and like the other methods, it is derived directly from the sampling and estimation methods used in the survey.
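For the simplest case, a mean estimated from a simple random sample drawn without replacement, the sampling variance can be estimated from the sample itself as (1 − n/N) × s²/n, where s² is the sample variance. The sketch below applies that textbook estimator; the ten data values, the population size of 5,000, and the function name `estimated_sampling_variance` are made up for illustration, and surveys with complex designs need estimators matched to their own design.

```python
import math
import statistics

def estimated_sampling_variance(sample, population_size):
    """Estimated sampling variance of the sample mean under simple random
    sampling without replacement: (1 - n/N) * s^2 / n."""
    n = len(sample)
    s2 = statistics.variance(sample)  # sample variance (divides by n - 1)
    return (1 - n / population_size) * s2 / n

sample = [12, 15, 11, 14, 18, 13, 16, 12, 15, 14]  # hypothetical observations
var_hat = estimated_sampling_variance(sample, population_size=5_000)
standard_error = math.sqrt(var_hat)

print(f"sample mean:                 {statistics.fmean(sample):.2f}")
print(f"estimated sampling variance: {var_hat:.4f}")
print(f"standard error:              {standard_error:.4f}")
```

The square root of the sampling variance is the standard error, which is the figure typically reported alongside a survey estimate.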

**WAYS TO ELIMINATE SAMPLING ERROR**

There is only one way to eliminate this error. This solution is to eliminate the concept of sample, and to test the entire population.

In most cases this is not possible; consequently, what a researcher must do is minimize sampling process error.

This can be achieved by proper, unbiased probability sampling, which ensures that the sample adequately represents the entire population, and by using a large sample size.