Selection Signatures in Four German Warmblood Horse Breeds: Tracing Breeding History in the Modern Sport Horse

The study of selection signatures is crucial for identifying genomic regions that have been influenced by selective pressures, potentially harboring genes that modulate significant phenotypes. This knowledge deepens our understanding of how breeding programs have shaped livestock genomes. This research focuses on 942 stallions from four distinct German warmblood breeds: Trakehner (N=44), Holsteiner (N=358), Hanoverian (N=319), and Oldenburger (N=221). These breeds, while currently bred for athletic performance in disciplines like show-jumping, dressage, and eventing, possess divergent historical and recent selection focuses, alongside varied crossbreeding policies. The Holsteiner breed, in particular, emphasizes show-jumping. Blood samples were collected during routine health examinations prior to stallion licensing and genotyped using the Illumina EquineSNP50 BeadChip. Autosomal markers were employed in a multi-method approach—integrated Haplotype Score (iHS), cross-population Extended Haplotype Homozygosity (xpEHH), and Runs of Homozygosity (ROH)—to detect signals of positive selection. Analyses were conducted both within and across breeds. The Oldenburger and Hanoverian breeds exhibited highly similar iHS signatures, while xpEHH revealed breed-specific differences on multiple chromosomes. The Trakehner breed formed a distinct cluster in principal component analysis and showed the highest number of ROHs, reflecting its historical population bottleneck. Beyond breed-specific variations, a shared selection signal on chromosomes 1, 4, and 7 was identified through an across-breed iHS analysis. 
Investigating these iHS signals and shared ROHs for potential functional candidate genes and affected pathways, including enrichment analyses, suggests that genes influencing muscle functionality (TPM1, TMOD2-3, MYO5A, MYO5C), energy metabolism and growth (AEBP1, RALGAPA2, IGFBP1, IGFBP3-4), embryonic development (HOXB-complex), and fertility (THEGL, ZPBP1-2, TEX14, ZP1, SUN3, and CFAP61) have been targeted by selection across all investigated breeds. Furthermore, the findings indicate selection pressure on KITLG, a gene well-documented for its role in pigmentation.

Introduction to Warmblood Horse Breeding and Selection

Since the dawn of domestication, humans have selectively shaped animal species to meet their evolving needs. The establishment of studbooks and the definition of explicit breeding goals and programs have significantly intensified selection pressure over time. Horses, initially crucial for warfare, transportation, and farming, have increasingly been bred for competitive sports disciplines such as show-jumping, dressage, and eventing, particularly within the 20th century. German warmblood breeds like the Holsteiner, Hanoverian, Oldenburger, and Trakehner consistently rank among the top international studbooks for these disciplines. The Hanoverian and Oldenburger are the largest breeding associations in Germany, while the Holsteiner and Trakehner rank fourth and sixth, respectively, collectively accounting for two-thirds of the warmblood horse population in Germany.

While these breeds currently share common selection goals related to conformation, locomotion, and aptitude for sport disciplines, each has experienced unique selection pressures throughout its history. The Trakehner breed was initially developed to produce riding horses, particularly for cavalry, and has maintained a relatively pure lineage for 250 years, with foreign sires seldom incorporated. [cite:3, cite:4] After World War II, the Trakehner population underwent a severe bottleneck that sharply reduced its numbers. The Hanoverian breed, originally intended for agriculture and military use, shifted towards lighter riding horses after World War II, incorporating Thoroughbreds and Trakehners into its breeding scheme. The Oldenburger breed, historically favored for carriage driving with heavier warmblood types, remained a closed studbook for a considerable period before focusing on lighter riding horses from the 1950s onwards. [cite:6, cite:7, cite:8] Both Hanoverian and Oldenburger breeds now have specialized breeding programs for show-jumping, though their structures differ. [cite:5, cite:11] The Holsteiner breed, historically bred as draught horses, transitioned to a focus on sports, particularly show-jumping, in the mid-20th century, incorporating Thoroughbreds, Arabians, and other warmblood breeds for refinement. Intensive use of certain sires in the 20th century, especially in the Holsteiner and Hanoverian breeds, may have led to popular sire effects. [cite:13, cite:14]

Given the modern sport-oriented breeding programs, this study hypothesizes that selection pressure on genes relevant to athleticism and suitability for specific disciplines is reflected at the molecular genetic level. In cattle, breeds with comparable phenotypes and goals show divergent selection signatures due to historical differences, and we anticipated the same pattern in sport horse breeds. The study of selective sweeps, where advantageous alleles and linked neutral alleles spread through a population due to selection, can provide insights into a population’s historical development and the genetic basis of phenotypic variation. [cite:16, cite:17] Various methods, including Runs of Homozygosity (ROH), integrated Haplotype Score (iHS), and cross-population Extended Haplotype Homozygosity (xpEHH), have been successfully applied to detect selection signatures in both humans and domesticated animals. [cite:18, cite:19, cite:20] ROH analysis has identified genomic regions and candidate genes under selection in horses, [cite:21, cite:22, cite:23] while iHS and xpEHH have been used to detect selection related to growth, feed efficiency, fat deposition, racing performance, and locomotion in various horse breeds. [cite:27, cite:28, cite:29, cite:30, cite:31] This study aims to identify genomic regions under positive selection within and across these warmblood horse breeds, explore whether breed histories are reflected in selection signatures, and identify candidate physiological processes and genes relevant to breeders’ interests.

Material and Methods

Sample Data Collection and Processing

The study included 942 stallions (Equus caballus) from four German warmblood breeds: Trakehner (N=44), Holsteiner (N=358), Oldenburger (N=221), and Hanoverian (N=319), born between 2002 and 2006. [cite:Table 1] Samples were collected during mandatory health examinations prior to stallion licensing, so the study relied on existing material and no animals were sampled specifically for it. Veterinarians collected the blood samples as part of routine health and parentage checks. As these checks are legally mandated for stallion licensing, no specific ethical approval was required. The stallions had passed an initial inspection for conformation and movement and met studbook pedigree requirements. They were typically 2.5 to 3 years old, representing potential future sires and reflecting the respective breeding associations’ current goals. Stratification due to breeding lines for show-jumping or dressage was considered negligible. DNA was extracted from EDTA-stabilized blood samples and genotyped using the Illumina EquineSNP50 BeadChip. Quality control was performed in Illumina Genome Studio: SNPs were excluded if their Minor Allele Frequency (MAF) was <0.01, their call frequency was <0.9, or their p-value for Hardy-Weinberg Equilibrium (HWE) was <0.00001. Following filtering, 48,410 SNPs on the 31 autosomes were retained for analysis. Allosomes were excluded due to the absence of Y chromosome data and their unsuitability for homozygosity-based analyses in males.
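The three quality-control filters can be sketched in a few lines of Python. This is only an illustration of the thresholds named above, not the Genome Studio procedure: the genotype coding (0, 1, 2 copies of one allele, `None` for a missing call), the function names, and the use of a one-degree-of-freedom chi-square test for HWE are all assumptions made here.

```python
import math

def snp_summary(genotypes):
    """Return (call_rate, maf, hwe_p) for one SNP.

    genotypes: list with 0, 1 or 2 copies of allele B per animal, None if missing.
    The HWE p-value uses a chi-square test with 1 degree of freedom (an assumption).
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    call_rate = n / len(genotypes)
    if n == 0:
        return call_rate, 0.0, 1.0
    q = sum(called) / (2 * n)          # frequency of allele B
    p = 1.0 - q
    maf = min(p, q)
    if maf == 0.0:
        return call_rate, maf, 1.0     # monomorphic: HWE test is not informative
    observed = [called.count(0), called.count(1), called.count(2)]
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    hwe_p = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function, 1 df
    return call_rate, maf, hwe_p

def passes_qc(genotypes, min_maf=0.01, min_call_rate=0.9, min_hwe_p=1e-5):
    """Keep a SNP unless it falls below any of the three thresholds."""
    call_rate, maf, hwe_p = snp_summary(genotypes)
    return maf >= min_maf and call_rate >= min_call_rate and hwe_p >= min_hwe_p
```

For example, a SNP with genotype counts 25/50/25 passes all three filters, whereas a SNP with 50 animals of each homozygous genotype and no heterozygotes is rejected by the HWE filter.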

Data Processing and Statistical Analysis

Three methods were employed to detect selection signatures: ROH, iHS, and xpEHH. Prior to statistical analysis, haplotypes were phased and missing genotype calls were imputed chromosome-wise using Beagle 4.0, without pedigree information. Due to the high average call rate (>99.9%), the proportion of imputed genotypes was minimal (0.121%). Imputation was performed across all breeds jointly rather than per breed because of the small sample size of the Trakehner breed.

Population structure was assessed using a principal component analysis (PCA) conducted with the Genome-wide Complex Trait Analysis (GCTA) software. [cite:34, cite:35] A genomic relationship matrix derived from genotype data was used to calculate the first 20 eigenvectors and all eigenvalues.
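For orientation, the genomic relationship matrix used by GCTA is commonly written as below, and the PCA then amounts to an eigendecomposition of that matrix. The notation is introduced here and does not come from the study: x_{ij} is the allele count (0, 1, 2) of individual j at SNP i, p_i is the allele frequency at SNP i, N is the number of SNPs, and (λ_m, v_m) are the eigenvalue and eigenvector pairs.

```latex
A_{jk} = \frac{1}{N} \sum_{i=1}^{N} \frac{(x_{ij} - 2p_i)(x_{ik} - 2p_i)}{2p_i(1 - p_i)},
\qquad
A\,v_m = \lambda_m v_m, \quad m = 1, \dots, 20
```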

Runs of Homozygosity (ROH) Analysis

ROH and shared homozygous segments (ROH clusters) were analyzed chromosome-wise using the SNP & Variation Suite v.8.8.1. ROH clusters were defined as segments shared by at least one-third of the individuals within and across breeds. Criteria for ROH clusters were a minimum length of 500 kb and at least 15 SNPs, with no missing or heterozygous SNPs. The minimum SNP density was set at 1 SNP per 100 kb, with a maximum gap of 1,000 kb between adjacent SNPs.
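A simplified scan with these criteria might look as follows. It is a sketch under stated assumptions, not the SNP & Variation Suite algorithm: the function names, the genotype coding (0 or 2 for homozygous, 1 for heterozygous), and the reading of "density" as SNPs per 100 kb over the whole run are choices made here for illustration.

```python
def find_roh(positions, genotypes, min_len_bp=500_000, min_snps=15,
             max_gap_bp=1_000_000, min_snps_per_100kb=1.0):
    """Return (start_bp, end_bp, n_snps) for each qualifying homozygous run.

    positions: sorted base-pair positions on one chromosome.
    genotypes: allele counts per SNP (0 or 2 = homozygous, 1 = heterozygous).
    """
    runs = []

    def close_run(first, last):
        n_snps = last - first + 1
        length = positions[last] - positions[first]
        dense_enough = n_snps * 100_000 >= min_snps_per_100kb * length
        if n_snps >= min_snps and length >= min_len_bp and dense_enough:
            runs.append((positions[first], positions[last], n_snps))

    first = None  # index of the first SNP in the currently open run
    for i, g in enumerate(genotypes):
        homozygous = g in (0, 2)
        if first is not None:
            gap_too_wide = positions[i] - positions[i - 1] > max_gap_bp
            if not homozygous or gap_too_wide:
                close_run(first, i - 1)
                first = None
        if homozygous and first is None:
            first = i
    if first is not None:
        close_run(first, len(genotypes) - 1)
    return runs


def shared_fraction(runs_per_individual, position):
    """Fraction of individuals whose ROH cover the given base-pair position."""
    covered = sum(
        any(start <= position <= end for start, end, _ in runs)
        for runs in runs_per_individual
    )
    return covered / len(runs_per_individual)
```

A position would count as part of an ROH cluster in the sense used here when `shared_fraction` reaches at least one-third.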

Haplotype-Based Analyses (iHS and xpEHH)

The integrated Haplotype Score (iHS) method, developed by Voight et al. and building upon Extended Haplotype Homozygosity (EHH) by Sabeti et al., was used. iHS quantifies the difference in EHH between ancestral and derived alleles, with positive values indicating selection for the ancestral allele and negative values indicating selection for the derived allele. iHS calculations were performed per chromosome for individuals within and across breeds. Across-breed iHS analysis aimed to identify selection signals affecting the entire sport horse population.
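The idea behind the score can be illustrated with a minimal Python sketch: EHH is the probability that two randomly chosen haplotypes carrying the same core allele are identical from the core SNP out to a given marker, and the unstandardized iHS is the log ratio of the integrated EHH (iHH) of the ancestral and derived alleles. The function names and the 0/1 allele coding (0 = ancestral, 1 = derived) are assumptions made here, and the sketch omits the EHH cut-off and the frequency-binned standardization that full implementations such as REHH apply.

```python
import math
from collections import Counter

def ehh(haplotypes, core, j, allele):
    """EHH between the core SNP and marker j among carriers of `allele` at the core."""
    carriers = [h for h in haplotypes if h[core] == allele]
    n = len(carriers)
    if n < 2:
        return 0.0
    lo, hi = min(core, j), max(core, j)
    groups = Counter(tuple(h[lo:hi + 1]) for h in carriers)
    return sum(math.comb(k, 2) for k in groups.values()) / math.comb(n, 2)

def ihh(haplotypes, positions, core, allele):
    """Integrate EHH over physical distance with the trapezoid rule."""
    values = [ehh(haplotypes, core, j, allele) for j in range(len(positions))]
    return sum(
        (positions[j + 1] - positions[j]) * (values[j] + values[j + 1]) / 2
        for j in range(len(positions) - 1)
    )

def unstandardized_ihs(haplotypes, positions, core):
    """ln(iHH_ancestral / iHH_derived); positive values favor the ancestral allele."""
    return math.log(ihh(haplotypes, positions, core, 0) /
                    ihh(haplotypes, positions, core, 1))
```

With this sign convention, long shared haplotypes around the ancestral allele give a positive score and long shared haplotypes around the derived allele give a negative one, matching the interpretation above.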

The cross-population Extended Haplotype Homozygosity (xpEHH) method, described by Sabeti et al., was employed to detect selection signatures that have reached fixation in one population but remain polymorphic in another; it offers high statistical power for detecting complete selective sweeps. xpEHH always contrasts two populations. Here, each breed was compared against the pooled sample of the other three breeds, and Hanoverian and Oldenburger were additionally compared directly with each other.
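In the usual formulation, the statistic at a given SNP is the log ratio of the integrated site-specific EHH in the two populations, standardized across all SNPs. The symbols below are notation introduced here for illustration: iHH_A and iHH_B are the integrated EHH values in population A (the breed of interest) and population B (the comparison group), and the mean and standard deviation are taken over all SNPs in the scan.

```latex
\mathrm{unXP\text{-}EHH} = \ln\!\left(\frac{\mathrm{iHH}_{A}}{\mathrm{iHH}_{B}}\right),
\qquad
\mathrm{XP\text{-}EHH} =
\frac{\mathrm{unXP\text{-}EHH} - \operatorname{mean}(\mathrm{unXP\text{-}EHH})}
     {\operatorname{sd}(\mathrm{unXP\text{-}EHH})}
```

Under this convention, strongly positive values point to extended haplotypes in population A and strongly negative values to extended haplotypes in population B.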

Allele status (ancestral vs. derived) was determined using SNP data from domestic asses (Equus asinus) as an outgroup, aligned to the equine reference genome EquCab2.0. The donkey’s homozygous alleles were considered ancestral, and the alternative alleles were designated as derived. For SNPs where the donkey was heterozygous, the horse’s reference allele was assumed to be ancestral. SNPs with no clear ancestral/derived status were randomly assigned, as the study focused on the presence and location of selection events rather than their direction.
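These assignment rules can be written as a small decision function. The sketch below is illustrative only: the function name and arguments are hypothetical, and treating a missing donkey call, or a donkey allele that matches neither horse allele, as "no clear status" is an assumption made here.

```python
import random

def assign_ancestral(donkey_genotype, horse_ref, horse_alt, rng=random):
    """Return the allele treated as ancestral for one SNP.

    donkey_genotype: tuple of two alleles, or None if the donkey has no call.
    horse_ref / horse_alt: the two alleles observed in the horse data.
    """
    horse_alleles = (horse_ref, horse_alt)
    if donkey_genotype is not None:
        first, second = donkey_genotype
        if first != second:
            return horse_ref          # donkey heterozygous: assume the reference allele
        if first in horse_alleles:
            return first              # donkey homozygous: its allele is ancestral
    # No clear status (assumed: missing call or allele not seen in horses)
    return rng.choice(horse_alleles)
```

For instance, `assign_ancestral(("A", "A"), "G", "A")` returns `"A"`, and `assign_ancestral(("A", "G"), "G", "A")` returns `"G"`.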

A total of 48,410 SNPs were included in the iHS and xpEHH analyses. Calculations were performed using the REHH 2.0.0 package in R Statistical Software. Linkage disequilibrium (LD) was evaluated using Haploview 4.2 with an r² threshold of ≥0.8 on phased and imputed data, resulting in 7,739 tag SNPs across all autosomes. A conservative significance threshold of p = 0.0001 (-log10(p-value) = 4.0), accounting for approximately 10,000 independent tests, was used to address multiple testing.
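To make the threshold concrete, the sketch below converts a standardized score into -log10(p) and computes how many false positives the threshold would allow among a given number of independent tests. Treating the standardized scores as approximately standard normal and using a two-sided p-value is a common convention and an assumption here, as are the function names.

```python
import math
import sys

def neg_log10_p(z):
    """Two-sided -log10(p) for a standardized score under a normal approximation."""
    p = math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))
    p = max(p, sys.float_info.min)         # avoid log10(0) for extreme scores
    return -math.log10(p)

def is_significant(z, threshold=4.0):
    """Apply the -log10(p) >= 4.0 cut-off used in the text."""
    return neg_log10_p(z) >= threshold

def expected_false_positives(n_independent_tests, p_threshold=1e-4):
    """Expected number of tests passing the threshold when no selection is present."""
    return n_independent_tests * p_threshold
```

With the 7,739 tag SNPs reported above, `expected_false_positives(7739)` is about 0.77, so fewer than one false positive is expected per scan at this threshold. A standardized score needs an absolute value of roughly 3.9 to pass.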

Candidate Gene Screening

Regions exhibiting selection signatures were scanned for annotated genes in the equine reference assembly EquCab2.0 using Ensembl’s Biomart tool. [cite:Ensembl] For across-breed iHS signatures, a 1Mb window upstream and downstream of significant SNPs was considered. For ROH clusters, gene scanning was conservatively performed within the identified ROH stretches due to the lower positional resolution of the SNP chip.
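The windowing step for iHS signals is a plain interval-overlap query. The following sketch assumes a gene table that has already been exported (for example from BioMart) as tuples of name, chromosome, start, and end; the function name, the tuple layout, and the example genes are hypothetical.

```python
def genes_near_snps(snps, genes, flank_bp=1_000_000):
    """Return sorted names of genes overlapping a +/- flank_bp window around any SNP.

    snps:  iterable of (chromosome, position_bp).
    genes: iterable of (name, chromosome, start_bp, end_bp).
    """
    hits = set()
    for chrom, pos in snps:
        window_start = max(0, pos - flank_bp)
        window_end = pos + flank_bp
        for name, gene_chrom, gene_start, gene_end in genes:
            overlaps = gene_start <= window_end and gene_end >= window_start
            if gene_chrom == chrom and overlaps:
                hits.add(name)
    return sorted(hits)


# Hypothetical example: one significant SNP and three made-up genes.
example_snps = [("1", 5_000_000)]
example_genes = [
    ("GENE_A", "1", 5_900_000, 5_950_000),   # inside the 1 Mb window
    ("GENE_B", "1", 6_100_000, 6_200_000),   # just outside the window
    ("GENE_C", "4", 5_000_000, 5_100_000),   # other chromosome
]
print(genes_near_snps(example_snps, example_genes))
```

For ROH clusters, the same function with `flank_bp=0` and the cluster boundaries in place of single positions would restrict the scan to the homozygous stretch itself.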

Candidate genes were identified based on the following criteria: (A) overlap with known Quantitative Trait Loci (QTL), (B) functional relevance to the breeds’ selection goals, (C) identification through pathway enrichment analysis, and (D) existing literature reports.

Results

Principal Component Analysis (PCA)

The PCA based on genotype data revealed a tentative separation of the four warmblood breeds. The Trakehner cohort formed a distinct subgroup, adjacent to the Oldenburger and Hanoverian breeds, which largely overlapped. The Holsteiner breed clustered more separately from the other three. [cite:Fig 1] This clustering suggests genetic differentiation among the breeds, potentially reflecting their distinct histories and breeding policies.

Selection Signatures and QTL Overlap

Across-breed iHS and xpEHH selection signatures (within a 1Mb window) and ROH shared by at least one-third of all samples overlapped with 44 known equine QTL. [cite:Table 2] These 44 QTL pertain to 12 different traits. Notably, traits with over 10% of their listed QTL falling within selection signatures included cannon bone circumference, coat texture, hair density, and sperm count. This overlap suggests that selection has targeted genes influencing these specific traits in the studied warmblood breeds.

Runs of Homozygosity (ROH)

The analysis for ROH clusters identified 37 selection signals on 16 chromosomes across all breeds, with segments reaching up to 2,294,884 bp in length and shared by up to 43% of the stallions. [cite:Table 3] Breed-specific analyses revealed a markedly higher number of ROHs in Trakehner horses (149) than in Holsteiner (58), Hanoverian (39), and Oldenburger (38). [cite:S1 Table] The high number of ROHs in Trakehners likely reflects their historical population bottleneck and potentially more intensive selection pressures.

Determination of Allele Status

Using domestic ass SNP data as an outgroup, ancestral and derived alleles were assigned for 46,747 of the 48,410 SNPs. [cite:Table 4] Where the donkey was homozygous, its allele was classified as ancestral. The remaining SNPs with ambiguous status were assigned randomly, as the study focused on detecting selection events rather than the direction of selection.

Integrated Haplotype Score (iHS)

Across-breed iHS analysis identified significant selection signatures on chromosomes 1, 4, and 7. [cite:Fig 2, Table 4] Breed-specific analyses revealed distinct patterns: Hanoverian and Oldenburger shared similar signatures on chromosomes 1 and 4, while Trakehner showed signals on chromosomes 1, 4, 12, and 18, and Holsteiner on chromosome 17. [cite:Fig 3, S2 Table] These breed-specific signatures highlight divergent selection histories and focuses among the breeds.

Cross-population Extended Haplotype Homozygosity (xpEHH)

xpEHH analysis, comparing each breed against the others, identified significant selection signatures in Trakehner, Holsteiner, and Hanoverian breeds across multiple chromosomes. [cite:Fig 4, Table 5] The Oldenburger breed exhibited numerous significant signals on 12 chromosomes. Despite similarities in iHS patterns, direct comparisons between Hanoverian and Oldenburger using xpEHH revealed notable differences, underscoring unique selection pressures in each breed. [cite:Fig 5]

Enrichment Analysis

Enrichment analysis of genes within across-breed iHS signatures identified enriched Gene Ontology (GO) terms related to the nucleus, muscle function (myosins, motor activity), insulin-like growth factor (IGF) binding, and ATP binding. [cite:Table 6] Analysis of genes within ROH stretches highlighted pathways related to IGF binding, embryonic development, and skeletal system morphogenesis. [cite:Table 7] Combining genes from both iHS and ROH signatures revealed enriched annotation clusters related to embryonic development, cell growth, cell proliferation, differentiation, metabolism, and glycolytic processes. [cite:Table 8]

Discussion

This study investigated selection signatures in German warmblood horse breeds, which, despite sharing a common focus on athletic performance, possess divergent historical breeding policies and selection objectives. The analysis of 942 stallions, preselected for breeding potential, provides insights into the genetic underpinnings of traits relevant to modern sport horse development. [cite:Table 1]

The PCA results, showing Trakehner as a distinct cluster and Holsteiner separating from Hanoverian and Oldenburger, align with documented breeding histories. [cite:Fig 1] Trakehner and Holsteiner have historically maintained purer breeding lines, while Hanoverian and Oldenburger, despite shared breeding goals and occasional sire usage, exhibit distinct features detectable by xpEHH, suggesting unique historical influences. The high number of ROHs in Trakehners further supports their distinct genetic makeup, likely influenced by their severe population bottleneck. [cite:S1 Table] While ROH length can indicate the age of selection signatures, the thresholds used in this study limited detailed analysis of recent versus historical inbreeding. However, within-breed iHS analysis confirmed divergent haplotype patterns and distinct selection signatures in Trakehners compared to the other breeds, consistent with their unique history.

Candidate genes were identified through overlapping selection signatures with QTL, functional candidacy, enrichment analyses, and literature review. The shared selection signatures across breeds, particularly on chromosomes 1, 4, and 7, suggest common targets of selection related to core traits for sport horse performance. [cite:Table 4]

Genes related to coat color and texture were identified through ROH overlaps with QTL for hair density and coat texture on ECA 11. [cite:Table 2, Table 3] The enrichment of the GO term “intermediate filament” was driven by the keratin complex, crucial for hair and hoof quality. [cite:49, cite:50, cite:51] Variants in KRT25 have been linked to curly hair phenotypes. Additionally, KITLG, implicated in skin pigmentation, was found within a ROH stretch, consistent with its role in coat color variation, a trait historically subject to selection in horses. [cite:53, cite:54, cite:26] Genes like KIT, associated with coat color phenotypes, were also found near ROH signatures. [cite:55, cite:56, cite:57, cite:58, cite:59]

Growth and body size are critical traits shaped by artificial selection. QTL for height and body weight overlapped with selection signatures across multiple chromosomes. [cite:Table 2, cite:47, cite:62, cite:63] Enrichment analysis highlighted pathways related to IGF binding and embryonic skeletal system morphogenesis, suggesting selection on genes influencing growth regulation. Candidate genes IGFBP1, 3, and 4 (involved in IGF binding) and the HOXB gene cluster (crucial for embryonic patterning) are proposed to be under selection pressure for growth and development. [cite:64, cite:65, cite:66] Genes like BMP2, involved in bone development, were also identified within ROH stretches. [cite:67, cite:68]

Muscle functionality and metabolism are paramount for athleticism. Enrichment analyses pointed to pathways involving motor activity, myosin complex, and tropomyosin binding. [cite:Table 6] Candidate genes like TPM1, TMOD2 & 3, MYO5A, and MYO5C are key components of muscle contractile units and likely targets of selection for improved performance. [cite:79, cite:80, cite:81] Genes such as IGFBP4 and AEBP1 may play dual roles in metabolism and muscle function, with AEBP1 potentially influencing energy homeostasis and cardiac function, and IGFBP4 involved in WNT signaling crucial for cardiogenesis. [cite:72, cite:73, cite:74] Furthermore, the RALGAPA2 gene, associated with racing performance, was located within a ROH stretch, suggesting its role in overall athleticism.

Fertility, influenced by natural selection, was also explored. While few direct overlaps with fertility-related QTL were found, candidate genes involved in male fertility, such as ZPBP1 & 2, SUN3, THEGL, TEX14, and CFAP61, were identified within ROH and iHS selection signatures. [cite:86, cite:87, cite:88, cite:89, cite:90, cite:26] These genes play crucial roles in sperm formation and development, supporting the hypothesis that fertility has remained a selection target, albeit potentially under different pressures than performance traits.

Conclusion

This study identified selection signatures in German warmblood horses, revealing both shared genomic regions across breeds and distinct patterns reflecting their divergent historical breeding. Candidate genes implicated in development, growth, metabolism, muscle function, and fertility are proposed as targets of past and present selection. Further research integrating comprehensive phenotyping with genomic data is recommended to validate these findings and their causal relationship with key traits in sport horse breeding.

Supporting Information

S1 Table. Breed specific Runs of Homozygosity (ROH) in Holsteiner, Hanoverian, Oldenburger and Trakehner.

ROH were shared by at least 33 percent of all individuals (N = 942) in the sample set. [cite:S1 Table]

S2 Table. Breed specific significant integrated Haplotype Score (iHS) signals (-log10(p-value) ≥ 4.0) in Trakehner, Holsteiner, Hanoverian and Oldenburger. [cite:S2 Table]

Acknowledgments

The authors express gratitude to the breeding associations of the Trakehner, Holsteiner, Hanoverian, and Oldenburger horse for providing blood samples and to Philip Widmann for his technical assistance.

References

  1. Selection and breeding practices have shaped livestock genomes since domestication.
  2. The four German warmblood breeds are significant in international sport horse rankings.
  3. Trakehner breeding goals and history.
  4. Trakehner breeding program and utilization.
  5. Hanoverian breeding history and refinement.
  6. Oldenburger breed’s historical focus on carriage driving.
  7. Oldenburger studbook’s shift towards lighter riding horses.
  8. Historical breeding practices of the Oldenburger studbook.
  9. Oldenburger breeding goals for sport horses.
  10. Hanoverian studbook selection criteria.
  11. Oldenburg International studbook focus on show-jumping.
  12. Holsteiner breeding history and refinement for show-jumping.
  13. Popular sire effects in Holsteiner and Hanoverian breeds.
  14. Influence of popular sires in 20th-century horse breeding.
  15. Divergent selection signatures in cattle breeds with similar phenotypes.
  16. Detecting selective sweeps using haplotype-based methods.
  17. Utilizing selective sweeps to understand phenotypic variation.
  18. xpEHH for detecting selection signatures between populations.
  19. iHS for detecting selection within populations.
  20. Application of selection signature detection in domesticated animals.
  21. ROH for identifying selected genomic regions in domestic animals.
  22. ROH analysis in livestock genetics.
  23. Candidate genes under selection identified via ROH.
  24. ROH for assessing breed history in Haflinger horses.
  25. Breed development and ROH analysis in horses.
  26. Selection signatures in horses, including candidate genes for fertility and coat color.
  27. Selection signatures in Asian ponies.
  28. Genomic selection signatures in Shetland ponies.
  29. Selection for performance traits in gaited breeds and Quarter Horses.
  30. Selection signatures in Thoroughbred horses related to racing.
  31. Selection for athletic performance in Quarter Horses.
  32. Genomic prediction of breed assignment in warmblood horses.
  33. Beagle 4.0 for genotype imputation.
  34. GCTA for Genome-wide Complex Trait Analysis.
  35. Using GCTA for PCA and genomic relationship matrices.
  36. SNP & Variation Suite for ROH analysis.
  37. Criteria for defining Runs of Homozygosity.
  38. EHH for detecting selection.
  39. Equine reference genome EquCab2.0.
  40. Outgroup usage in selection signature studies.
  41. REHH 2.0.0 package for R.
  42. Haploview for linkage disequilibrium analysis.
  43. Candidate genes for muscle functionality.
  44. Candidate genes for energy metabolism and growth.
  45. Candidate genes for embryonic development.
  46. Candidate genes for fertility.
  47. QTL for height in horses.
  48. Common sire usage in Hanoverian and Oldenburger breeds.
  49. Keratin’s role in skin quality.
  50. Keratin’s role in hair quality.
  51. Keratin as a major component of equine hoof.
  52. KRT25 variant associated with curly hair phenotype in horses.
  53. KITLG’s effect on pigmentation in cattle and pigs.
  54. KITLG’s role in coat color variation.
  55. QTL for white markings in horses.
  56. KIT gene and dominant white syndrome in horses.
  57. KIT gene and coat color phenotypes.
  58. Historical selection for coat colors in horses.
  59. Coat color selection continues to be relevant in horse breeding.
  60. Artificial selection for size in domestic animals.
  61. Heritability of wither height in horses.
  62. QTL related to height on ECA 3 and 8.
  63. QTL for body weight in horses.
  64. IGFBP4 associated with height in humans.
  65. IGFs and their role in early childhood growth.
  66. HOXB genes in embryonic development and patterning.
  67. BMP2 and body size in sheep.
  68. BMP2 and body size in goats.
  69. Metabolism and muscle functionality in athletic performance.
  70. Physiological factors influencing athleticism.
  71. IGF binding proteins in metabolism and diabetes.
  72. AEBP1’s role in diet-induced obesity and energy homeostasis.
  73. AEBP1 expression in smooth muscle cells.
  74. IGFBP4’s role in WNT signaling and cardiogenesis.
  75. QTL for racing ability on ECA 17.
  76. QTL for racing ability on ECA 18.
  77. QTL for racing ability on ECA 28.
  78. RALGAPA2 associated with racing performance.
  79. Actin and myosin in sarcomeres.
  80. Tropomyosin and tropomodulin in actin filament stabilization.
  81. TPM1 isoforms and muscle performance.
  82. Sperm quality and pregnancy rates in mares.
  83. Heritability of fertility in horses.
  84. Low heritability of fertility traits.
  85. QTL for sperm count in horses.
  86. SUN3’s role in sperm head formation.
  87. ZPBP1 & 2’s role in acrosome formation and sperm development.
  88. ZPBP1 mutations in infertile men.
  89. THEGL expression in mouse testis.
  90. TEX14’s role in spermatogenesis.
