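Definition: Cluster sampling is a probability sampling technique in which the population is divided into naturally occurring groups called clusters, each ideally as internally heterogeneous as the population as a whole, and a random sample of these clusters is selected. In the simplest (single-stage) form, all individuals or units within the chosen clusters are included in the study; in multi-stage designs, a further random sample is drawn from within each chosen cluster.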
This method is particularly useful in public health research when a complete list of individual population members is either unavailable or impractical to obtain, or when the population is geographically widespread. The process typically involves defining clusters based on existing geographical, administrative, or institutional units (e.g., villages, census tracts, schools, households). A random sample of these clusters is then drawn. In a single-stage cluster sample, every individual within the selected clusters is included in the sample. A more common approach, especially for large-scale studies, is two-stage cluster sampling, where a random sample of individuals is subsequently drawn from within each selected cluster.
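The selection steps described above can be sketched in a few lines of Python. This is a minimal illustration, not a survey-grade implementation: the function name `two_stage_cluster_sample`, the dictionary-based frame, and the village data are all hypothetical, and both stages use simple random sampling without replacement (real surveys often select clusters with probability proportional to size instead).

```python
import random


def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster=None, seed=None):
    """Draw a cluster sample from a frame of clusters.

    clusters      -- dict mapping a cluster id to the list of units it contains
                     (hypothetical structure, chosen for illustration)
    n_clusters    -- number of clusters to select in stage 1
    n_per_cluster -- units to select per cluster in stage 2;
                     None means take every unit (single-stage design)
    """
    rng = random.Random(seed)

    # Stage 1: simple random sample of cluster ids. Only the list of
    # clusters is needed here, not a list of every individual.
    chosen = rng.sample(sorted(clusters), n_clusters)

    sample = {}
    for cluster_id in chosen:
        units = clusters[cluster_id]
        if n_per_cluster is None:
            # Single-stage: include every unit in the selected cluster.
            sample[cluster_id] = list(units)
        else:
            # Stage 2: simple random sample of units within the cluster,
            # capped at the cluster size for small clusters.
            k = min(n_per_cluster, len(units))
            sample[cluster_id] = rng.sample(units, k)
    return sample


# Made-up frame: 20 villages with 30 to 49 residents each.
frame = {
    f"village_{i:02d}": [f"v{i:02d}_person_{j}" for j in range(30 + i)]
    for i in range(20)
}

two_stage = two_stage_cluster_sample(frame, n_clusters=5, n_per_cluster=10, seed=42)
single_stage = two_stage_cluster_sample(frame, n_clusters=5, seed=42)
```

Note that residents only need to be listed for the clusters that are actually selected, which is the practical advantage when no population-wide list exists.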
Cluster sampling offers significant advantages in terms of cost-effectiveness and logistical feasibility, especially for large-scale public health surveys or interventions spanning wide geographical areas. For instance, it is a standard method for estimating vaccination coverage (e.g., WHO EPI surveys) or assessing health needs in disaster-affected regions, where compiling a list of every individual is impossible. While efficient, a key consideration is that individuals within the same cluster often share similar characteristics, so a cluster sample usually has a higher sampling error than a simple random sample of the same size. The size of this variance inflation is quantified by the design effect. Researchers must account for it by increasing the overall sample size and by using statistical methods during analysis that reflect the clustered design, so that the reported precision of their estimates is not overstated.
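The sample-size adjustment mentioned above can be illustrated with a short calculation. The sketch below relies on assumptions that go beyond this entry: it uses the common approximation DEFF = 1 + (m - 1) * ICC for clusters of roughly equal size m, where ICC is the intra-cluster correlation, together with the standard sample-size formula for estimating a proportion under simple random sampling. The function names and the numeric inputs are hypothetical.

```python
import math


def design_effect(avg_cluster_size, icc):
    """Approximate design effect for roughly equal-sized clusters.

    Assumes DEFF = 1 + (m - 1) * ICC, where m is the number of units
    sampled per cluster and ICC is the intra-cluster correlation.
    """
    return 1.0 + (avg_cluster_size - 1.0) * icc


def srs_sample_size(p, margin, z=1.96):
    """Sample size to estimate a proportion p within +/- margin under
    simple random sampling (z = 1.96 for roughly 95% confidence)."""
    return z ** 2 * p * (1.0 - p) / margin ** 2


def cluster_sample_size(p, margin, avg_cluster_size, icc, z=1.96):
    """Inflate the simple-random-sampling size by the design effect."""
    n_srs = srs_sample_size(p, margin, z)
    return math.ceil(n_srs * design_effect(avg_cluster_size, icc))


# Illustrative inputs (not taken from any real survey):
# expected coverage 50%, +/-10 points, 5 people per cluster, ICC of 0.25.
deff = design_effect(avg_cluster_size=5, icc=0.25)
n_needed = cluster_sample_size(p=0.5, margin=0.10, avg_cluster_size=5, icc=0.25)
```

With these illustrative inputs the design effect is 2, so the cluster design needs about twice as many respondents as a simple random sample would (193 rather than 97). When the ICC is 0, or when only one unit is taken per cluster, the design effect is 1 and no inflation is needed.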
Key Context:
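- Design Effect: The ratio of the variance of an estimate under a complex sample design, such as cluster sampling, to the variance the same estimate would have under simple random sampling with the same sample size. A design effect of 2, for example, means the sample must be roughly twice as large to achieve the same precision.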
- Multi-stage Sampling: An extension of cluster sampling where, after selecting primary clusters, secondary clusters (e.g., households within villages) or individuals are then randomly sampled from within the chosen primary clusters.
- Sampling Frame: In cluster sampling, the sampling frame is typically a list of the clusters themselves, rather than a list of individual population units.