Overlapping Haplotype Effects at One Locus

By: Thiago Sanches and Oksana Polesskaya

September 2025

In one dataset, we observed a “double banding” pattern with an unusual r² layout in LocusZoom. Oddly, SNPs with similar r² to the top SNP had different -log10(p) values (Fig 1).

Figure 1. The association plot for the Progressive Ratio trait.

We first suspected that this pattern was related to a change in data pre-processing (see below). The Progressive Ratio trait is not normally distributed, and quantile normalization can introduce edge effects. For this trait, females had a higher mean than males (Fig 4). However, there was no association signal for sex in this region, which ruled out this concern.

Closer inspection revealed that two different haplotypes are associated with this trait (Fig 2).

Figure 2. With r² on a continuous scale, the two “parallel” bands show different correlations with the lead SNP, indicating two haplotypes.
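
To make the r² comparison concrete, here is a minimal sketch. It assumes genotypes are coded as allele dosages (0, 1, 2) and that r² is the squared Pearson correlation between the dosages of two SNPs. The function name `ld_r2` and the toy genotypes are hypothetical and are not taken from our data. The example only shows that two SNPs can have the same r² with a lead SNP while being weakly correlated with each other.

```python
from math import fsum


def ld_r2(dosages_a, dosages_b):
    """Squared Pearson correlation between two SNPs' allele dosages."""
    n = len(dosages_a)
    mean_a = fsum(dosages_a) / n
    mean_b = fsum(dosages_b) / n
    cov = fsum((a - mean_a) * (b - mean_b) for a, b in zip(dosages_a, dosages_b))
    var_a = fsum((a - mean_a) ** 2 for a in dosages_a)
    var_b = fsum((b - mean_b) ** 2 for b in dosages_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("r2 is undefined for a monomorphic SNP")
    return cov * cov / (var_a * var_b)


# Toy dosages for eight animals (hypothetical, for illustration only).
lead_snp = [1, 1, 1, 1, 0, 0, 0, 0]
snp_a = [1, 1, 1, 0, 0, 0, 0, 0]
snp_b = [0, 1, 1, 1, 0, 0, 0, 0]

r2_lead_a = ld_r2(lead_snp, snp_a)  # r2 of SNP A with the lead SNP
r2_lead_b = ld_r2(lead_snp, snp_b)  # r2 of SNP B with the lead SNP
r2_a_b = ld_r2(snp_a, snp_b)        # r2 between SNP A and SNP B
print(round(r2_lead_a, 3), round(r2_lead_b, 3), round(r2_a_b, 3))
```

In this toy case both SNPs have r² of about 0.6 with the lead SNP but only about 0.22 with each other, so a plot colored by r² with the lead SNP alone would not separate them.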

Subsequent analysis traced these haplotypes to specific founders (Fig 3).

Figure 3. The two haplotypes that are associated with this trait originate from the founders BUF|M520 (purple) and WKY (brown). Haplotype calls were estimated as the most common haplotype within a defined LD block. LD blocks were called using HDBSCAN clustering of the precomputed r² correlations among all SNPs between 61 Mb and 62 Mb. Unassigned SNPs appear in black.

As we develop haplotype-wide association (HWAS) methods, founder-specific haplotype effects should become even clearer.

Data pre-processing notes

Until 2025: We quantile-normalized within sex, combined the data, regressed covariates explaining >2% of variance, then quantile-normalized again (PMID: 32860487). Ties were broken randomly but reproducibly.
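
The first step of the old pipeline can be sketched as follows. This is a minimal illustration, not our production code: it assumes that "quantile normalization" means a rank-based inverse normal transform, uses a `(rank - 0.5) / n` offset as one common convention, and breaks ties with a seeded random number generator. The function names, the seed, and the toy lever-press counts are all hypothetical.

```python
import random
from statistics import NormalDist


def quantile_normalize(values, seed=0):
    """Map values to standard-normal quantiles by rank.

    Ties are broken randomly, but a fixed seed makes the result reproducible.
    """
    rng = random.Random(seed)
    tie_keys = [rng.random() for _ in values]  # random secondary sort key
    order = sorted(range(len(values)), key=lambda i: (values[i], tie_keys[i]))
    n = len(values)
    normal = NormalDist()
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # Assumed offset: (rank - 0.5) / n keeps probabilities inside (0, 1).
        scores[i] = normal.inv_cdf((rank - 0.5) / n)
    return scores


def normalize_within_sex(values, sexes, seed=0):
    """Quantile-normalize each sex separately, then return the combined scores."""
    combined = [0.0] * len(values)
    for sex in sorted(set(sexes)):
        idx = [i for i, s in enumerate(sexes) if s == sex]
        scores = quantile_normalize([values[i] for i in idx], seed)
        for i, z in zip(idx, scores):
            combined[i] = z
    return combined


# Toy lever-press counts (hypothetical); note the ties within each sex.
presses = [3, 5, 5, 9, 12, 12, 20, 31]
sexes = ["M", "M", "M", "M", "F", "F", "F", "F"]
z = normalize_within_sex(presses, sexes, seed=2025)
```

After this step both sexes carry the same set of normal scores, so the raw difference in means between the sexes is no longer present in the combined data. Under the same assumptions, the final step of the old pipeline would reuse `quantile_normalize` on the covariate-adjusted values.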

Current practice: We brought pre-processing in line with common practice in human and model-organism GWAS. We now use sex as a covariate, along with other covariates explaining more than 2% of variance.
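
As a rough sketch of the covariate step, the example below regresses a trait on a single covariate (sex coded as 0/1) and applies the 2% variance threshold. It assumes that "variance explained" is measured as the R² of a one-covariate least-squares fit and that the covariate is removed by taking residuals; the function name `regress_out` and the toy data are hypothetical.

```python
from math import fsum


def regress_out(trait, covariate):
    """Fit trait ~ covariate by least squares.

    Returns the residuals and the fraction of trait variance explained.
    """
    n = len(trait)
    mean_y = fsum(trait) / n
    mean_x = fsum(covariate) / n
    sxx = fsum((x - mean_x) ** 2 for x in covariate)
    sxy = fsum((x - mean_x) * (y - mean_y) for x, y in zip(covariate, trait))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(covariate, trait)]
    ss_total = fsum((y - mean_y) ** 2 for y in trait)
    ss_residual = fsum(r * r for r in residuals)
    return residuals, 1 - ss_residual / ss_total


# Toy lever-press counts (hypothetical); sex is coded 0 = male, 1 = female.
presses = [3, 5, 5, 9, 12, 12, 20, 31]
is_female = [0, 0, 0, 0, 1, 1, 1, 1]

residuals, explained = regress_out(presses, is_female)
# Keep the adjustment only if the covariate explains more than 2% of variance.
adjusted = residuals if explained > 0.02 else list(presses)
```

With a binary covariate, the residuals are simply each animal's value minus the mean of its own sex, so the within-sex ordering and spread of the raw data are preserved.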

Fig 4 shows the distribution of the Pr2 data discussed in this blog entry, both raw and as preprocessed using the “old” and the current methods.

Figure 4. Preprocessing of Progressive Ratio data and illustration of the edge effect.
A. Distribution of the raw data. Note that females press more than males.
B. Matrix comparison of the raw data distribution and processed data distribution. Note that in the processed data, unlike in the raw data, the lower-performing individuals are always females, and the higher-performing individuals are always males.