Analysis of recently identified dyslipidemia alleles reveals two loci that contribute to risk for carotid artery disease

Background Genome-wide association studies have identified numerous single nucleotide polymorphisms (SNPs) affecting high density lipoprotein (HDL) or low density lipoprotein (LDL) cholesterol levels; these SNPs may contribute to the genetic basis of vascular diseases. Results We assessed the impact of 34 SNPs at 23 loci on dyslipidemia, key lipid sub-phenotypes, and severe carotid artery disease (CAAD) in a case-control cohort. The effects of these SNPs on HDL and LDL were consistent with those previously reported, and we provide unbiased estimates of the percent variance in HDL (3.9%) and LDL (3.3%) explained by genetic risk scores. We assessed the effects of these SNPs on HDL subfractions, apolipoprotein A-1, LDL buoyancy, apolipoprotein B, and lipoprotein (a) and found that rs646776 predicts apolipoprotein B level while rs2075650 predicts LDL buoyancy. Finally, we tested the role of these SNPs in conferring risk for ultrasonographically documented CAAD stenosis status. We found that two loci, chromosome 1p13.3 near CELSR2 and PSRC1 which contains rs646776, and 19q13.2 near TOMM40 and APOE which contains rs2075650, harbor risk alleles for CAAD. Conclusion Our analysis of 34 SNPs contributing to dyslipidemia at 23 loci suggests that genetic variation in the 1p13.3 region may increase risk of CAAD by increasing LDL particle number, whereas variation in the 19q13.2 region may increase CAAD risk by promoting formation of smaller, denser LDL particles.


Background
Carotid artery disease (CAAD) is an important risk factor for stroke, the third leading cause of death in the U.S. Given the high mortality, morbidity, and economic costs due to stroke, primary prevention, particularly targeted toward high risk individuals, is the most promising approach to combat stroke [1,2]. Although medical interventions and carotid endarterectomy can potentially prevent strokes in individuals with CAAD, routine screening is not currently recommended [1]. However, it has been suggested that if high risk groups with CAAD prevalences of approximately 20% can be identified, screening may provide significant and cost effective extension to quality adjusted life years [3,4]. Studies of siblings [5,6], twins [7,8], and families [9] suggest a heritable genetic contribution to carotid artery intima-media thickening and stenosis from plaque, with the heritability of ultrasonographically measured phenotypes typically ranging from 20% to 40% in population based samples [10]. Thus, identification of genetic risk factors for carotid artery stenosis, progression, and plaque instability may ultimately be useful in targeting primary prevention against stroke in patients for whom management strategies are not yet well defined.
Recently a number of large genome-wide association studies have revealed loci affecting total cholesterol, high-density lipoprotein cholesterol (HDL), low-density lipoprotein cholesterol (LDL), and triglycerides [11][12][13][14][15]. Because of their role in promoting dyslipidemia, these single nucleotide polymorphisms (SNPs) are strong candidates for contributing to genetic risk for atherosclerosis, and several studies have found significant impacts of these loci on coronary artery disease [11,12,16]. Although many clinical risk factors such as age, smoking, hypertension, and diabetes are shared between CAAD and coronary artery disease, the relative importance of these risk factors differs between these two vascular disease processes [17]. Similarly, the relative importance of risk factors varies for disease at different locations within the carotid arteries themselves [9,10]. These discrepancies suggest that additional factors, including genetic ones, may modulate the atherosclerotic disease process differently in different anatomic locations. Thus, the impact of recently discovered dyslipidemia risk alleles on CAAD is as yet unknown.
Based on previous success in applying genetic risk scores for decreased HDL and increased LDL to the prediction of coronary artery disease [12] and the central role of these lipid fractions in evidence-based guidelines for coronary artery disease risk reduction [18], we investigated the role of SNPs affecting HDL and LDL in predicting risk for CAAD. We also sought to determine whether these SNPs alter key lipid sub-phenotypes with differential atherogenic potential. Specifically, the more efficient cholesterol efflux activity of apolipoprotein A-I (apo A-I) [19] has led to the hypothesis that the HDL2 subfraction or apo A-I may be a better predictor of protection against atherosclerosis than HDL3 or total HDL. We also tested SNPs for their effects on apolipoprotein B (apo B), which measures LDL particle number and may be a better estimator of cardiovascular disease risk than LDL level [20,21]; LDL buoyancy, which predicts the smaller, denser LDL pattern B phenotype [22] associated with increased risk of coronary artery disease [23]; and lipoprotein(a) (Lp(a)), which appears to independently predict risk of coronary artery disease [24] and stroke [25]. These analyses may suggest mechanisms through which specific SNPs modulate CAAD risk beyond their effects on HDL and LDL levels.

Effects of SNPs on HDL and LDL
As shown in Figure 1, our data confirm the previously reported effects on HDL and LDL for the majority of SNPs tested in CLEAR study participants (see Table 1). Out of 34 SNPs tested (see Table 2), we identified 14 SNPs that showed nominally significant associations with HDL or LDL levels at a p-value of 0.05, corresponding to an FDR of 0.11 when corrected for multiple testing. These 14 SNPs correspond to those for which the 95% confidence intervals do not cross zero in Figure 1. For only three SNPs do the 95% confidence intervals fail to contain the previously reported effect from the literature, and for the top 25 SNPs, those indicated by closed circles, our data are more likely assuming the effect as reported in the literature (i.e. the β's given by the x's in Figure 1) than under the null (β = 0). Furthermore, for 28 out of 34 SNPs the effects on HDL or LDL in our data were in the same direction as reported previously in the literature, corresponding to significant concordance under a binomial test (p = 2.0 × 10^-4).
[Table 2 footnotes, partially recovered: […] [13], and (e) Aulchenko et al [11]. (f) Only SNPs with estimated imputation accuracies of >90% were included in downstream analyses. (g) Excluded from downstream analyses due to r^2 = 1.00 with rs646776. (h) Excluded due to r^2 = 0.99 with rs12654264. (i) Excluded due to r^2 = 0.93 and 0.95 with rs328. (j) Excluded due to r^2 = 0.96 with rs3890182.]
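The direction-of-effect concordance reported above (28 of 34 SNPs) can be checked with an exact binomial test. The sketch below assumes a two-sided test against a null concordance probability of 0.5; the text does not state the sidedness, so that choice and the function name are assumptions made for illustration.

```python
from math import comb


def binomial_two_sided_p(successes: int, trials: int) -> float:
    """Exact two-sided binomial p-value under H0: P(success) = 0.5.

    With p = 0.5 the null distribution is symmetric, so the two-sided
    p-value is twice the smaller tail probability, capped at 1.
    """
    upper = sum(comb(trials, k) for k in range(successes, trials + 1))
    lower = sum(comb(trials, k) for k in range(0, successes + 1))
    tail = min(upper, lower) / 2 ** trials
    return min(1.0, 2 * tail)


# 28 of 34 SNPs with the same direction of effect as in the literature
p = binomial_two_sided_p(28, 34)
print(f"{p:.1e}")  # 2.0e-04
```

Under these assumptions the result agrees with the reported p = 2.0 × 10^-4.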
Given the small marginal effects of the recently reported SNPs on HDL or LDL levels, we used a genetic risk score combining alleles additively across SNPs to better predict lipid levels. Such genetic risk scores have previously been utilized, but they are subject to upward bias when developed in the same sample used to initially detect associations. Using a risk score with SNPs weighted by the reported effect sizes from the literature (see Table 2) and after accounting for covariates, the 16 HDL SNPs explain 3.9% of the variation in HDL levels (p = 4.3 × 10^-9). Using the same approach, the 18 LDL SNPs explain 3.3% of the variation in LDL levels (p = 4.4 × 10^-8).
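A weighted additive risk score of the kind described above can be sketched as follows. This is a minimal illustration, not the study's code: the function names are hypothetical, and the variance-explained helper is a plain squared correlation that ignores the covariate adjustment used in the actual analysis.

```python
def genetic_risk_score(dosages, weights):
    """Weighted additive score: sum over SNPs of allele dose x effect size.

    dosages: allele dose per SNP for one individual (0, 1, 2, or an
             imputed mean genotype between 0 and 2).
    weights: per-SNP effect sizes taken from the literature (hypothetical
             values in any example call).
    """
    if len(dosages) != len(weights):
        raise ValueError("need exactly one weight per SNP")
    return sum(d * w for d, w in zip(dosages, weights))


def variance_explained(scores, phenotype):
    """Squared Pearson correlation between score and phenotype.

    Equals R^2 of a one-predictor linear regression. Assumption: no
    covariates, unlike the covariate-adjusted estimates in the text.
    """
    n = len(scores)
    mean_s = sum(scores) / n
    mean_p = sum(phenotype) / n
    sxy = sum((s - mean_s) * (p - mean_p) for s, p in zip(scores, phenotype))
    sxx = sum((s - mean_s) ** 2 for s in scores)
    syy = sum((p - mean_p) ** 2 for p in phenotype)
    return sxy * sxy / (sxx * syy)
```

Taking the weights from the literature rather than estimating them in the analysis sample is what avoids the upward bias mentioned above.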

Effects of SNPs on lipid sub-phenotypes
With few exceptions we found that the same SNPs significantly associated with total HDL in our data were also associated with HDL2, HDL3, and apo A-I. At a p-value of 0.05, corresponding to an FDR of 0.11, we identified 19 significant associations between the 16 HDL SNPs and HDL2, HDL3, or apo A-I. In a principal component analysis of the t-statistics derived from tests of association between the 16 HDL SNPs and 4 phenotypes (total HDL, HDL2, HDL3, and apo A-I), we found that the first principal component was positively correlated with all 4 phenotypes and explained 94% of the variance in the t-statistics, indicating that the effects of each SNP were highly concordant across all HDL related phenotypes. Similarly, the hierarchical clustering analysis performed by Kathiresan et al [14] also shows that the current set of genetic loci affecting HDL does little to discriminate among HDL2, HDL3, and apo A-I levels.
At a p-value of 0.05, we found that rs646776, rs693, rs2228671, and rs6511720 were associated with apo B and that rs2075650 and rs2650000 were associated with LDL buoyancy. The associated FDR for these tests was quite high at 0.35, due in part to the absence of any effect of these SNPs on Lp(a). Among SNPs associated with apo B, rs646776 was by far the most statistically significant (p = 0.00035, compared with 0.017 ≤ p < 0.05 for the remaining three), with its minor allele decreasing apo B (β = -3.3). Kathiresan et al [14] found a similarly strong effect on apo B of the nearby SNP rs12740374 (β = -3.3; p = 1.2 × 10^-8), which is nearly perfectly correlated with rs646776 (see Table 2). Among SNPs associated with LDL buoyancy, rs2075650 was the most statistically significant (p = 0.014), with the minor allele decreasing the relative flotation rate (β = -0.0047), leading to a more atherogenic phenotype.

Effects of SNPs on CAAD risk
After a Bonferroni correction for 34 tests, corresponding to a threshold of p = 0.0015, Figure 2 shows that two SNPs, rs646776 and rs2075650, met criteria for significant association with CAAD. For convenience, SNPs have been recoded from minor allele dose to unfavorable allele dose based on their effect on HDL or LDL so that the expected effect on CAAD is >0. Due to the prior expectation that alleles that decrease HDL or increase LDL would confer increased risk for CAAD, we performed one-sided tests. The major allele of rs646776 is associated with increased LDL, increased apo B as described above, and […]
The associations between rs646776 and rs2075650 and CAAD could not be explained by their effects on HDL and LDL levels alone. We found that including the observed HDL and LDL levels, use of lipid lowering therapy, or "pre-therapy" HDL and LDL levels (see Methods) as covariates did not dramatically change the significance or magnitude of the associations between CAAD and rs646776 or rs2075650 (data not shown). Combined with their effects on LDL particle number and buoyancy, these results raise the possibility that rs646776 and rs2075650 contribute additional information to current HDL and LDL levels alone for CAAD risk prediction.
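The recoding from minor allele dose to unfavorable allele dose and the Bonferroni threshold used in this section can be illustrated with a short sketch. The function names are hypothetical and are not taken from the study's analysis code.

```python
def unfavorable_allele_dose(minor_allele_dose, minor_is_unfavorable):
    """Recode a genotype so that it counts copies of the unfavorable allele.

    minor_allele_dose: 0, 1, or 2 (an imputed mean genotype also works).
    minor_is_unfavorable: True if the minor allele lowers HDL or raises LDL.
    """
    if minor_is_unfavorable:
        return minor_allele_dose
    # Otherwise the major allele is the unfavorable one: flip the coding.
    return 2 - minor_allele_dose


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold controlling the family-wise error rate."""
    return alpha / n_tests


print(round(bonferroni_threshold(0.05, 34), 4))  # 0.0015
```

With every SNP coded in the unfavorable direction, the expected effect on CAAD is positive for all 34 tests, which is what makes a one-sided test meaningful.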

Genetic risk score for CAAD
In contrast to our results for HDL and LDL, we did not find that CAAD risk prediction was improved using an additive risk score based on these SNPs. We tested both a risk score in which all SNPs were weighted equally as in Kathiresan et al [12] (p = 0.32), and one in which SNPs were weighted by the magnitude of their effects on HDL and LDL as in Aulchenko et al [11] (p = 0.63). Results were similar with or without HDL, LDL, and use of lipid lowering therapy as covariates. To clarify this finding, we sought to determine our power to detect association between the genetic risk score and CAAD, given that the risk score based on these SNPs explains only 3.9% and 3.3% of the variance in HDL and LDL levels, respectively, and given that dyslipidemia is only one of several risk factors for CAAD. Using power simulations (see Methods), we found that the genetic risk score, acting through its effects on HDL and LDL levels alone, had an expected odds ratio for association with CAAD of 1.03 to 1.04 per unfavorable allele. The power based on these effect sizes was only 20% to 39%, suggesting that we cannot reject a model in which the overall genetic score confers risk of CAAD, but mainly through its effect on lipid levels. However, if the odds ratio per unfavorable allele averaged as little as 1.07 to 1.10, our power to detect association between the risk score and CAAD was 81% to 97%. This suggests that if a significant fraction of these 34 SNPs were predictive of CAAD risk beyond their effects on measured HDL and LDL levels, we would have had high power to detect such an association.

CAAD risk locus on chromosome 1p13.3
To further explore the CAAD association in the vicinity of rs646776 we analyzed 82 additional SNPs that were genotyped or imputed with at least 80% accuracy, and Figure 3 shows that rs646776 and neighboring SNPs in CELSR2 and PSRC1 exhibit the greatest statistical significance. To correct for multiple testing in the setting of strong linkage disequilibrium we performed permutation testing to estimate a significance threshold of p = 0.0040. Within SORT1 the intronic SNP rs4970843 shows significant association (p = 0.0030). Interestingly, rs4970843 is in weak but significant long range linkage disequilibrium with rs646776 (r^2 = 0.083; p = 2.5 × 10^-19), and is in relatively weaker linkage disequilibrium with nearby SNPs in the highly correlated block encompassing SORT1.

CAAD risk locus on chromosome 19q13.2
Figure 4 shows that rs2075650, an intronic SNP in TOMM40, displays stronger association with CAAD than rs429358, which defines the ε3/ε4 dichotomy, or rs7412, which defines the ε2/ε3 dichotomy, in the APOE ε2/ε3/ε4 polymorphism. When rs2075650 and rs429358 were analyzed conditional on one another, we found significant association with rs2075650 (p = 0.032) but not with rs429358 (p = 0.61). This did not appear to be due to the additive model being a poor fit for the effect of APOE, because when we utilized a two degree of freedom model for the ε3/ε4 effect, rs2075650 retained significance (p = 0.031) whereas neither the recessive nor dominance term for rs429358 was significant (p = 0.78 and p = 0.72, respectively). When rs2075650 and rs7412 were analyzed conditional on one another we found that both showed significant association with CAAD (p = 0.0047 for rs2075650 and p = 0.012 for rs7412). With a two degree of freedom model, rs2075650 retained significance (p = 0.0046) while the dominance term (p = 0.021) but not the recessive term (p = 0.40) was significant for rs7412.
When all three SNPs were analyzed simultaneously including an interaction term for ε2/ε3 with ε3/ε4, rs2075650 was suggestively significant (p = 0.055), neither rs429358 nor its interaction with rs7412 was significant (p = 0.63 and p = 1.00, respectively), and rs7412 was significant (p = 0.036).

Discussion
Our data confirm the effects of recently identified dyslipidemia SNPs, and we estimated that, after accounting for other covariates, genetic risk scores explain 3.9% and 3.3% of the variance in HDL and LDL, respectively. Although slightly different sets of SNPs and weighting factors were used, these values are in reasonable agreement with those of 4.8% and 3.4% as reported by Aulchenko et al [11]. Our estimates may be biased downward because of the extensive use of lipid lowering therapy in CLEAR participants, with various pharmacological agents, dosing, and medication compliance that we have not attempted to fully model. In addition, while Aulchenko et al. sought to reduce upward bias by estimating weights for each SNP in cohorts independent from that in which the risk score was applied, the fact that the set of SNPs used in the risk score was apparently identified using all available cohorts suggests that a slight upward bias would still remain in their estimates. Based on the estimated effect sizes of this panel of SNPs on HDL and LDL, and the estimated effect size of dyslipidemia on CAAD, our data cannot exclude a model in which these SNPs as a group increase risk of CAAD mostly or entirely through their effects on lipids. However, our failure to identify a significant association between an overall risk score based on these SNPs and CAAD suggests that the majority of these SNPs are unlikely to contribute strongly to CAAD beyond their role in promoting dyslipidemia.
[…] an intermediate phenotype is developed using tens or hundreds of SNPs, it is likely that only a subset of the contributing SNPs will have these favorable predictive properties. Within the context of our study, we found no significant predictive power of the overall dyslipidemia risk score beyond what would be predicted by the role of dyslipidemia in CAAD alone, yet we identify two SNPs, rs646776 and rs2075650, which appear to mediate risk for CAAD beyond their effects on HDL and LDL.
Based on its association with apo B, our analyses indicate that rs646776 affects LDL particle number in addition to LDL level, which may account for the additional explanatory power of this SNP. In contrast, rs2075650 appears to affect LDL buoyancy, with the minor allele contributing to the smaller, denser LDL particles that make up the more atherogenic LDL pattern B phenotype.

Effects of SNPs on CAAD risk
Based on our analyses of the 1p13.3 region containing rs646776, variation associated with CAAD is most likely located near CELSR2, a non-classical cadherin that does not interact with catenins, or PSRC1, a p53-regulated growth receptor. Although SORT1, a multi-ligand receptor present in the Golgi and on the cell surface, represents a good candidate gene because it binds and mediates degradation of lipoprotein lipase [26], increases its localization to the plasma membrane of adipocytes in response to insulin, and forms GLUT4 storage vesicles which enhance insulin sensitivity [27], our association signal was weaker in this gene. Given the strong linkage disequilibrium according to HapMap data, it is unlikely that there exists common variation in SORT1 that was not captured by our study. Although the 1p13.3 region appears to show robust association with coronary artery disease [28][29][30], a recent study that included 33,282 participants with a total of 503 strokes at baseline and 571 incident strokes did not identify a significant association between stroke and either rs599839 or rs4970834 in this region [31]. However, stroke is the sequela of a diverse set of underlying pathophysiologic causes which do not appear to have been distinguished in this study [32], so it is unclear whether this sample size would have sufficient power to detect an effect of the 1p13.3 region on the subset of strokes due to CAAD.
[Figure caption: Linkage disequilibrium structure and association results for the chromosome 19q13.2 region containing rs2075650]
In the 19q13.2 region APOE is a stronger candidate gene than TOMM40, a channel forming subunit that is essential for protein import into the mitochondria [33]. The ε2/ε3/ε4 polymorphism is reported to be an equivocal risk factor for carotid atherosclerosis [10], although a more recent meta-analysis does support a modest association between these SNPs and carotid intima media thickness [34]. Consistent with this, our data show nominally significant protective effects of the ε2 allele and deleterious effects of the ε4 allele. However, conditioning on these SNPs did not account for the CAAD association signal in the region, and moreover the ε4 allele failed to demonstrate significant association with CAAD when rs2075650 was jointly considered. This result argues against a singular causal role for the APOE ε system in producing the CAAD association signal in this region, because one would expect the causal polymorphism to achieve greater statistical significance than, and in fact eliminate signal from, surrounding neutral variation. Instead, our data are most consistent with a causal polymorphism or collection of polymorphisms in linkage disequilibrium with both rs2075650 and the APOE ε system.
In summary, our data replicate the majority of associations reported between SNPs and HDL and LDL. Unbiased or slightly negatively biased estimates of the proportion of variance in HDL and LDL levels explained by these SNPs are 3.9% and 3.3% respectively, consistent with previous estimates [11]. The combined set of SNPs currently available does not improve CAAD risk prediction beyond what would be expected from their effects on HDL and LDL levels, but the specific SNPs rs646776 and rs2075650 are associated with CAAD risk, possibly due to their effects on LDL particle number and buoyancy, respectively.

CLEAR study participants
The Carotid Lesion Epidemiology And Risk (CLEAR) Study is a Seattle-based study involving the University of Washington (UW), Virginia Mason Medical Center (VM) and the Veterans Affairs Puget Sound Health Care System (VAPSHCS), focused on identifying predictors of CAAD, CAAD progression, and atherosclerotic plaque instability. The study was approved by the UW, VM, and VAPSHCS IRBs. All participants gave written informed consent. Participant characteristics are shown in Table 1. Only Caucasian males were analyzed due to under-representation of women and minorities in the cohort. Self reported ancestry was confirmed by STRUCTURE [35]. Individuals with total serum cholesterol >400 mg/dL or coagulopathy were excluded. Controls include 479 individuals with ≤15% carotid stenosis bilaterally as measured by duplex ultrasound. Individuals with vascular disease at other sites were excluded from the set of controls. Cases include 353 individuals status post carotid endarterectomy for symptomatic disease or asymptomatic individuals with ≥80% internal carotid stenosis either unilaterally or bilaterally. Individuals with intermediate stenosis have 50% to 79% luminal narrowing either unilaterally or bilaterally. Cases and controls were matched on age distribution, with censoring occurring at the time of diagnosis of vascular disease for cases or at the time of the last blood draw for controls. Hypertension was defined by treatment with antihypertensive medications. Diabetes was defined as a hemoglobin A1C ≥6.5 or use of oral hypoglycemics or insulin.

Lipid phenotypes
Standard methods were used to determine total cholesterol, triglycerides, and HDL in fasting whole plasma using an Abbott Spectrum analyzer. LDL was calculated unless triglycerides were ≥400 mg/dL, in which case it was measured directly. HDL fractions 2 and 3 were determined by precipitating HDL2 from total HDL, measuring HDL3 in the supernatant, and subtracting this from total HDL to obtain HDL2. Apolipoprotein A-I, apolipoprotein B, and lipoprotein(a) were measured as described by Marcovina et al [36], Zambon et al [37], and Marcovina et al [38], respectively. LDL buoyancy was measured by the relative flotation rate Rf as described by Capell et al [22]. We utilized lipid measurements prior to initiation of lipid lowering therapy whenever possible. For 90 individuals with two to three repeated lipid measurements we used the mean of these measurements. Based on inspection of the raw phenotype and residuals distributions, we excluded 6 outlying individuals with HDL >100 mg/dL, 10 individuals with HDL2 >25 mg/dL, 4 individuals with HDL3 >80, and 4 individuals with apo A-I >225 mg/dL. We also excluded 4 outlying individuals with LDL >200 mg/dL and 4 individuals with LDL fraction apolipoprotein B >120 mg/dL. The positively skewed lipoprotein(a) distribution was log transformed.

Genotyping and SNPs
Genotypes were measured using the Illumina HumanCVD Genotyping BeadChip on an Illumina BeadStation Laboratory System platform [39,40]. Duplicate genotyping for 34 individuals showed 99.7% consistency in calls. The APOE ε2/ε3/ε4 polymorphism was genotyped as previously described [41]. Additional SNPs in the chromosome 1p13.3 and 19q13.2 regions were genotyped using TaqMan Assays by Design on an Applied Biosystems 7900HT System [42]. Using unphased reference genotypes from release 27 of the HapMap project [43], we performed imputation for untyped SNPs using BIMBAM [44,45]. Although inaccurate genotype imputation is expected to cause false negatives rather than false positives, to avoid spurious conclusions we sought to determine which HapMap SNPs could be accurately imputed relying on the SNPs genotyped in the CLEAR study. We selected a random set of 10 individuals from the HapMap CEU sample and set to missing those SNPs not genotyped in the CLEAR study. We then imputed these missing SNPs using the remaining SNPs that had been genotyped in the CLEAR study. For each imputed SNP we computed the correlation between the imputed mean genotypes and the true genotypes for those HapMap individuals in whom genotypes had been masked. We repeated this procedure 20 times, selecting a different set of 10 individuals for genotype masking each time, and we report the imputation accuracy as the mean correlation over these 20 iterations. Only SNPs with >90% imputation accuracy were included in downstream analyses.
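The accuracy measure described above, the mean over masking rounds of the correlation between imputed mean genotypes and true genotypes, can be sketched as follows. The imputation step itself (BIMBAM in the study) is not reproduced here; the function names are hypothetical, Pearson correlation is assumed, and skipping rounds in which the correlation is undefined is an assumption of this sketch rather than something the text specifies.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation, or None when either vector has no variation."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


def mean_imputation_accuracy(rounds):
    """Average per-round correlation for one SNP.

    rounds: list of (true_genotypes, imputed_mean_genotypes) pairs, one
            pair per masking round (20 rounds of 10 individuals in the text).
    Assumption: rounds with an undefined correlation are skipped.
    """
    correlations = []
    for true_genotypes, imputed in rounds:
        r = pearson_r(true_genotypes, imputed)
        if r is not None:
            correlations.append(r)
    if not correlations:
        raise ValueError("correlation undefined in every round")
    return sum(correlations) / len(correlations)
```

A SNP would then be retained when this mean exceeds the chosen cutoff, 0.90 for the main analyses.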

Statistical analyses
All analyses were performed in R [46]. Unless otherwise specified, tests for genetic association were performed assuming an additive model with the homozygous genotypes coded as 0 or 2 and the heterozygous genotype coded as 1. Analyses of lipid phenotypes were performed in cases, individuals with intermediate stenosis, and controls using linear regression with censored age, body mass index (BMI), hypertension, diabetes, and use of lipid lowering therapy as covariates. Current cigarette usage was included as a covariate for analyses of HDL. Unless otherwise specified analyses of CAAD were performed using logistic regression with case status (≥80% stenosis) coded as 1 and control status (≤15% stenosis) coded as 0 and censored age, current cigarette usage, pack-years smoked, BMI, hypertension, and diabetes as covariates. To account for multiple testing we estimated false discovery rates (FDR) using the Benjamini-Hochberg procedure [47].
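The Benjamini-Hochberg adjustment mentioned above can be written in a few lines. The study's analyses were run in R; this Python version is only an illustrative sketch with a hypothetical function name, and the p-values in the usage line are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Each adjusted value is the smallest FDR at which that test would be
    declared significant.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values, for illustration only
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

The FDR quoted for a nominal cutoff such as p = 0.05 corresponds to the largest adjusted value among the tests that pass that cutoff.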

Analysis of CAAD risk in the setting of lipid lowering therapy
Consistent with guidelines [18], cases with CAAD in the CLEAR study are generally treated with lipid lowering therapy to a target of LDL <100 mg/dL, whereas controls, who are without CAAD, coronary artery disease, or risk equivalents, are managed with a target LDL of <130 mg/dL or <160 mg/dL. Thus, including HDL, LDL, or lipid lowering therapy as covariates is problematic because it leads to a model in which the dependent variable, CAAD case control status, is causal for these independent variables. However, in order to study the effects of SNPs on CAAD in the context of lipid risk factors, we attempted to estimate "pre-therapy" HDL and LDL values for those individuals on lipid lowering therapy. We based these values on the lipid altering effects of statins, the drug class for 92% of all lipid draws when therapy was in use. For HDL, we estimated that the post-therapy values were 2.1% to 9.6% higher than pre-therapy [48], and for LDL 30% to 63% lower than pre-therapy [48,49]. We also estimated the percent change in HDL (13% increase) and LDL (28% decrease) from 41 individuals in the CLEAR study who had measurements both prior to and following initiation of lipid lowering therapy.
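One way to back-calculate a "pre-therapy" value from an on-therapy measurement is to invert the percent change. The sketch below assumes a simple multiplicative model, post = pre × (1 + change); the text does not spell out the exact formula used, so this model and the function name are assumptions.

```python
def pre_therapy_value(on_therapy_value, fractional_change):
    """Invert post = pre * (1 + fractional_change).

    fractional_change: +0.13 for a 13% increase on therapy,
                       -0.28 for a 28% decrease on therapy.
    """
    if fractional_change <= -1.0:
        raise ValueError("a decrease of 100% or more cannot be inverted")
    return on_therapy_value / (1.0 + fractional_change)


# Illustrative inputs only: an on-therapy LDL of 70 mg/dL with a 30%
# reduction implies a pre-therapy LDL of about 100 mg/dL.
print(round(pre_therapy_value(70.0, -0.30), 1))  # 100.0
```

Repeating the calculation with the low, high, and CLEAR-derived percent changes gives the three alternative sets of "pre-therapy" estimates.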

Power to detect association between genetic risk scores and CAAD
We performed simulations to determine our power to detect association between the genetic risk score and CAAD, given the percent variation in HDL and LDL levels explained by the genetic risk score and given the effect size of dyslipidemia as a risk factor for CAAD. Using the estimated "pre-therapy" HDL and LDL values described above, we first permuted the observed genotypes so that an additive score for the 16 HDL SNPs explained on average 3.9% of the variance in HDL levels and so that an additive score for the 18 LDL SNPs explained on average 3.3% of the variance in LDL levels. We then combined the HDL and LDL scores to form the overall genetic risk score. Next we simulated CAAD status by sampling binomial random variables with underlying probabilities given by the fitted effects of a logistic regression model that included "pre-therapy" HDL and LDL levels as well as all other covariates. For the three different "pre-therapy" HDL and LDL estimates described above, the odds ratios for association with CAAD in this model ranged from 0.95 to 0.96 per mg/dL change in HDL and 1.010 to 1.011 per mg/dL change in LDL. Finally, we tested for association between the simulated genetic risk score and simulated CAAD status in the setting of the usual covariates without HDL or LDL levels in the model. To estimate the expected odds ratio for the genetic risk score and the power, we performed 1000 such simulations for each of the three different estimates of "pre-therapy" HDL and LDL levels.
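The general shape of such a power simulation is sketched below under strong simplifying assumptions, so it should not be read as a reproduction of the analysis above. The risk score is drawn from a normal distribution with a hypothetical standard deviation (`score_sd`) instead of being built from permuted genotypes, no covariates are included, and a one-sided score test stands in for the logistic regression fit. The sample size (832) and case fraction (353/832) come from the case and control counts given in the Methods; the function name and all other defaults are illustrative.

```python
import math
import random
from statistics import NormalDist


def simulate_power(odds_ratio_per_allele, n=832, case_fraction=353 / 832,
                   score_sd=2.5, n_sims=1000, alpha=0.05, seed=1):
    """Fraction of simulated datasets in which the risk score is detected.

    Assumptions (not from the source): normally distributed centered score
    with SD `score_sd`, no covariates, one-sided score test for the slope
    of a logistic model.
    """
    rng = random.Random(seed)
    beta = math.log(odds_ratio_per_allele)
    intercept = math.log(case_fraction / (1.0 - case_fraction))
    z_crit = NormalDist().inv_cdf(1.0 - alpha)  # one-sided critical value
    hits = 0
    for _ in range(n_sims):
        x = [rng.gauss(0.0, score_sd) for _ in range(n)]
        # Case status is Bernoulli with probability from the logistic model.
        y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-(intercept + beta * xi)))
             else 0 for xi in x]
        x_bar = sum(x) / n
        y_bar = sum(y) / n
        # Score statistic and its null variance (intercept-only null model).
        u = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
        var_u = y_bar * (1.0 - y_bar) * sum((xi - x_bar) ** 2 for xi in x)
        if var_u > 0 and u / math.sqrt(var_u) > z_crit:
            hits += 1
    return hits / n_sims


print(simulate_power(1.03, n_sims=200), simulate_power(1.10, n_sims=200))
```

Power rises steeply with the per-allele odds ratio, which is the qualitative pattern behind the contrast between odds ratios near 1.03 and those near 1.10.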