A sampling technique in which a group of subjects (a sample) is selected for study from a larger group (a population). Each individual is chosen entirely by chance, and each member of the population has a known, but possibly unequal, chance of being included in the sample.
Factors often divide the population into sub-populations (groups or strata), and the measurement of interest may be expected to vary among them. A stratified sampling design addresses this. A stratified sample is obtained by selecting sites within each stratum, or sub-group, of the population using an independent random sample (i.e., a simple random sample) for that stratum. When a population has several strata, allocating the sample in proportion to the relative size of each stratum will generally yield estimates at least as precise as those from a simple random sample of the same total size. Stratified sampling is generally used when the population is heterogeneous (dissimilar) but homogeneous (similar) sub-populations, the strata, can be isolated.
Variable probability sampling is an alternative to stratified sampling when explicit strata are not of interest and the probability of selecting a site can be made proportional to the measurement of interest, or to an auxiliary variable correlated with it. Simple random sampling is most appropriate when the entire population from which the sample is taken is homogeneous.
Some reasons for using stratified sampling over simple random sampling are:
- the cost per observation in the survey may be reduced;
- estimates of the population parameters may be wanted for each sub-population;
- accuracy may be increased at a given cost.
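As a rough illustration of stratified selection with proportional allocation, here is a minimal Python sketch. It is not the spsurvey implementation; the stratum names, site identifiers, and function names are hypothetical and chosen only for the example. It assumes the sampling frame is available as a list of site ids per stratum.

```python
import random

def proportional_allocation(stratum_sizes, n):
    """Split a total sample size n across strata in proportion to stratum size."""
    total = sum(stratum_sizes.values())
    raw = {h: n * size / total for h, size in stratum_sizes.items()}
    # Round down first, then hand out the remaining units to the strata
    # with the largest fractional parts so the allocation sums to exactly n.
    alloc = {h: int(r) for h, r in raw.items()}
    leftover = n - sum(alloc.values())
    by_remainder = sorted(raw, key=lambda h: raw[h] - alloc[h], reverse=True)
    for h in by_remainder[:leftover]:
        alloc[h] += 1
    return alloc

def stratified_sample(frame, n, seed=None):
    """Draw a simple random sample (without replacement) inside each stratum.

    frame maps a stratum name to the list of site ids in that stratum.
    """
    rng = random.Random(seed)
    sizes = {h: len(sites) for h, sites in frame.items()}
    alloc = proportional_allocation(sizes, n)
    return {h: rng.sample(frame[h], alloc[h]) for h in frame}

# Hypothetical sampling frame: 1000 stream sites in three strata.
frame = {
    "headwater": [f"HW{i}" for i in range(600)],
    "mid": [f"MD{i}" for i in range(300)],
    "mainstem": [f"MS{i}" for i in range(100)],
}
sample = stratified_sample(frame, n=50, seed=1)
# Proportional allocation gives 30 headwater, 15 mid and 5 mainstem sites.
```

Sampling equal numbers from each stratum instead would only require replacing the allocation step with a fixed per-stratum count.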
Tools:
Software to implement such a design is available on the Aquatic Resource Monitoring web site or directly as the R package spsurvey.
Pros and Cons:
The following pros and cons of an independent random sample with stratified and/or variable probability designs should help you determine whether it is appropriate for your monitoring needs.
Stratified designs:
Site selection
Pros:
- Can ensure monitoring design includes important subpopulations
- Provides representative sample of the target population within each stratum
- Incorporates some characteristics of, or information known about, the population by using strata
- Can reduce cost to obtain same sampling error compared to non-stratified sample
- Available sample selection procedures are relatively easy to implement
- Allows replacement of sites if sites are dropped (for valid reasons)
Cons:
- Requires additional information about the target population to define the strata
- May not know if homogeneous subgroups exist to define strata or if the homogeneous subgroups are the same for all measurements
- Does not ensure spatial balance of sample
- Can increase field operation costs due to inaccessibility of sites
- Errors in the sampling frame may result in the sampling frame excluding some sites that are in the target population or including some sites that are not in the target population
- Sampling frames based on different GIS scales (e.g., 1:100000 and 1:24000) may result in different estimates for the target population
Statistical Inference
Pros:
- Procedures for estimating characteristics of the target population are well-known
- Results in unbiased estimates and associated unbiased estimates of variance
- Precision/power depends on sample size and metric variability
- May reduce sampling variance when strata are constructed for that purpose
- Sampling equal numbers from strata that vary widely in physical size may be used to improve the statistical power of tests for differences between strata
Cons:
- Requires additional information about the population to define strata for all elements of the population
- May increase sampling error compared to non-stratified sample if stratification does not result in homogeneous subgroups for all variables
- Can be expensive to achieve desired sampling error
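The standard estimator for a stratified design weights each stratum mean by the stratum's share of the population. The following Python sketch shows that calculation together with the usual variance estimate (including the finite population correction). The function name and the measurements are made up for illustration, and at least two sampled sites per stratum are assumed so that a within-stratum variance can be computed.

```python
from statistics import mean, variance

def stratified_mean(samples, stratum_sizes):
    """Estimate the population mean from a stratified random sample.

    samples maps stratum -> list of measurements (n_h >= 2 each);
    stratum_sizes maps stratum -> number of sites N_h in the frame.
    Returns (estimate, estimated variance of the estimate).
    """
    N = sum(stratum_sizes.values())
    estimate = 0.0
    est_variance = 0.0
    for h, y in samples.items():
        N_h, n_h = stratum_sizes[h], len(y)
        W_h = N_h / N                      # stratum weight
        estimate += W_h * mean(y)
        fpc = 1 - n_h / N_h                # finite population correction
        est_variance += W_h**2 * fpc * variance(y) / n_h
    return estimate, est_variance

# Hypothetical measurements from two strata.
samples = {"headwater": [2.0, 4.0, 6.0], "mainstem": [10.0, 12.0]}
stratum_sizes = {"headwater": 30, "mainstem": 10}
estimate, est_variance = stratified_mean(samples, stratum_sizes)
# estimate is 0.75 * 4 + 0.25 * 11 = 5.75
```

The variance term for each stratum shrinks when measurements within the stratum are similar, which is why strata built around homogeneous subgroups can reduce sampling variance.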
Variable probability designs:
Site selection
Pros:
- Can ensure monitoring design includes important subpopulations
- Provide representative samples of the target population
- Incorporate some characteristics of, or information known about, the population by using variable probability of selection based on those characteristics
- Can reduce cost to obtain same sampling error compared to independent random sample with no stratification or variable probability of selection
Cons:
- Require additional information about population to define auxiliary variable on all elements of the population
- Can be difficult to select auxiliary variable that is positively correlated to all response variables
- Errors in the sampling frame may result in the sampling frame excluding some sites that are in the target population or including some sites that are not in the target population
- Sampling frames based on different GIS scales (e.g., 1:100000 and 1:24000) may result in different estimates for the target population
Statistical Inference
Pros:
- Result in unbiased estimates and associated unbiased estimates of variance
- Precision/power depends on sample size and metric variability
- May reduce sampling variance when auxiliary variable is positively correlated with response variables
Cons:
- Procedures for estimating characteristics of the target population are not well-known
- May increase sampling error compared to equal probability sample if the auxiliary variable is not positively correlated to response variables
- Can be expensive to achieve desired sampling error
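To make the role of the auxiliary variable concrete, the sketch below draws sites with probability proportional to an auxiliary variable and estimates a population total. For simplicity it assumes sampling with replacement and uses the Hansen-Hurwitz estimator; this is one textbook approach, not necessarily the procedure used by spsurvey, and the site names and values are hypothetical.

```python
import random

def pps_sample(aux, n, seed=None):
    """Draw n sites with replacement, with probability proportional to aux.

    aux maps site id -> auxiliary value (must be positive).
    Returns the list of drawn sites and the per-draw selection probabilities.
    """
    rng = random.Random(seed)
    sites = list(aux)
    total = sum(aux.values())
    probs = {s: aux[s] / total for s in sites}
    draws = rng.choices(sites, weights=[probs[s] for s in sites], k=n)
    return draws, probs

def hansen_hurwitz_total(draws, probs, y):
    """Estimate the population total of y from a with-replacement PPS sample."""
    return sum(y[s] / probs[s] for s in draws) / len(draws)

# Hypothetical frame: auxiliary variable (e.g. mapped stream length) per site.
aux = {"A": 1.0, "B": 2.0, "C": 7.0}
# Response exactly proportional to the auxiliary variable (true total = 30).
y = {s: 3.0 * a for s, a in aux.items()}

draws, probs = pps_sample(aux, n=5, seed=42)
total_estimate = hansen_hurwitz_total(draws, probs, y)
```

When the response is exactly proportional to the auxiliary variable, as in this contrived example, every draw yields the same value and the estimate equals the true total. The weaker that relationship is, the more the estimate varies from sample to sample, which is the risk noted in the cons above.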