Linkage and haplotype
Variants that lie close together on a chromosome are usually inherited as a block, because recombination rarely separates them. This co-inheritance is called linkage, and the combination of alleles carried together on such a block is called a haplotype. That is why a single tag SNP can stand in for many variants, which is the basis of microarrays and genome-wide association studies (GWAS).
Linkage and recombination
When egg and sperm cells form, paired chromosomes exchange segments; this is recombination. The exchange is not arbitrarily fine-grained, though: two sites that lie close together are rarely separated and are almost always passed on jointly. Across a population, the resulting non-random association between alleles at nearby sites is called linkage disequilibrium.
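How strongly two sites are associated can be put into numbers. The sketch below computes two common measures, D and r², from a list of two-site haplotypes. The function name `ld_stats`, the "0"/"1" allele coding, and the toy sample are assumptions made for illustration, not data from this article.

```python
def ld_stats(haplotypes):
    """Return (D, r_squared) for two biallelic sites.

    haplotypes: one 2-character string per chromosome, e.g. "10".
    The first character is the allele at site 1, the second the allele
    at site 2, each coded "0" or "1" (an illustrative convention).
    """
    n = len(haplotypes)
    p_a = sum(h[0] == "1" for h in haplotypes) / n   # frequency of allele 1 at site 1
    p_b = sum(h[1] == "1" for h in haplotypes) / n   # frequency of allele 1 at site 2
    p_ab = sum(h == "11" for h in haplotypes) / n    # frequency of the "11" haplotype

    # D: how far the observed haplotype frequency departs from independence.
    d = p_ab - p_a * p_b

    # r^2: D squared, scaled by the allele frequencies at both sites.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, r2


# Toy sample of 100 chromosomes: the two sites usually travel together.
sample = ["11"] * 45 + ["00"] * 45 + ["10"] * 5 + ["01"] * 5
d, r2 = ld_stats(sample)
print(round(d, 3), round(r2, 3))  # 0.2 0.64
```

An r² near 1 means one site almost fully predicts the other; an r² near 0 means the two sites are inherited more or less independently.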
Haplotypes and tag SNPs
Because neighbouring variants stay together, the genome can be divided into blocks of strongly associated variants, the haplotype blocks. Within a block, one variant is often highly predictable from another. It is therefore enough to measure one representative marker per block, the tag SNP, and to infer the remaining variants of the block from it.
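One simple way to choose such representatives is a greedy selection: repeatedly pick the variant that predicts the largest number of variants not yet covered. The sketch below is an illustration under assumptions, not the procedure used by any particular chip design. The function names, the r² threshold of 0.8, and the SNP labels `snp1` to `snp5` are hypothetical.

```python
def r_squared(x, y):
    """r^2 between two equally long 0/1 allele vectors."""
    n = len(x)
    p_x = sum(x) / n
    p_y = sum(y) / n
    p_xy = sum(a * b for a, b in zip(x, y)) / n
    denom = p_x * (1 - p_x) * p_y * (1 - p_y)
    if denom == 0:
        return 0.0  # a site without variation predicts nothing
    return (p_xy - p_x * p_y) ** 2 / denom


def pick_tag_snps(sites, threshold=0.8):
    """Greedy tag selection. sites maps a SNP label to its 0/1 allele vector."""
    names = list(sites)
    # For each SNP: the SNPs it predicts well (always including itself).
    covers = {
        a: {b for b in names
            if a == b or r_squared(sites[a], sites[b]) >= threshold}
        for a in names
    }
    untagged = set(names)
    tags = []
    while untagged:
        # Take the SNP that covers the most still-untagged SNPs.
        best = max(names, key=lambda s: len(covers[s] & untagged))
        tags.append(best)
        untagged -= covers[best]
    return tags


# Toy data: 8 chromosomes, 5 SNPs forming two blocks (hypothetical labels).
sites = {
    "snp1": [1, 1, 1, 1, 0, 0, 0, 0],
    "snp2": [1, 1, 1, 1, 0, 0, 0, 0],
    "snp3": [1, 1, 1, 1, 0, 0, 0, 0],
    "snp4": [1, 0, 1, 0, 1, 0, 1, 0],
    "snp5": [1, 0, 1, 0, 1, 0, 1, 0],
}
print(pick_tag_snps(sites))  # ['snp1', 'snp4']
```

In this toy data, two measured markers are enough to account for all five variants, one marker per block.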
Why Genome uses this
A microarray does not read every base; it genotypes a selected set of tag SNPs. Through linkage, further variants that were not measured directly can be filled in computationally, a step known as imputation. The same principle underlies genome-wide association studies and polygenic scores: a few markers stand for many.
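The filling-in step can be illustrated with a deliberately simplified sketch: look up reference haplotypes that agree with the sample at the measured sites, and copy the most common allele from them into the gaps. This is an assumption-laden toy and not how Genome or any specific imputation tool works; real imputation software relies on probabilistic models and large reference panels. The function `impute`, the 0/1 coding, and the five-haplotype panel are hypothetical.

```python
from collections import Counter


def impute(observed, reference):
    """Fill in the untyped sites (None) of a single haplotype.

    observed:  list with 0/1 at measured sites and None at unmeasured sites.
    reference: list of fully typed haplotypes (lists of 0/1), same length.
    """
    typed = [i for i, allele in enumerate(observed) if allele is not None]
    # Reference haplotypes that agree with the sample at every measured site.
    matches = [h for h in reference
               if all(h[i] == observed[i] for i in typed)]
    if not matches:
        return list(observed)  # no support in the panel: leave the gaps open

    filled = list(observed)
    for i, allele in enumerate(observed):
        if allele is None:
            # Most common allele at this site among the matching haplotypes.
            filled[i] = Counter(h[i] for h in matches).most_common(1)[0][0]
    return filled


# Hypothetical reference panel: 5 haplotypes over 5 sites.
panel = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
]
# Only sites 0 and 3 were measured on the chip.
sample = [1, None, None, 0, None]
print(impute(sample, panel))  # [1, 1, 1, 0, 0]
```

The two measured sites are enough to single out the matching reference haplotypes, and the three unmeasured sites are read off from them.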
What Genome measures. A microarray reads selected tag SNPs. Through linkage, these also represent neighbouring variants that were not measured. That is how a chip with a few hundred thousand markers can cover large parts of the genome.
Sources
- 1. International HapMap Consortium, 2005. A haplotype map of the human genome. Nature 437:1299–1320. doi.org/10.1038/nature04226
- 2. Gabriel et al., 2002. The structure of haplotype blocks in the human genome. Science 296:2225–2229. doi.org/10.1126/science.1069424