1 Participants

  • Okezie Uche-Ikonne, Department of Mathematics and Statistics, Lancaster University, Lancaster, UK.
  • Michael Holmes, Medical Research Council Population Health Research Unit at the University of Oxford, Oxford, UK.
  • Frank Dondelinger, Faculty of Health and Medicine, Lancaster University, Lancaster, UK.
  • Tom Palmer, Department of Mathematics and Statistics, Lancaster University, Lancaster, UK.

2 Motivation

There has been considerable research on the role of blood lipids and their associations with various cardiovascular traits (Holmes and Davey Smith 2018). While observational analyses have led to naïve classifications of “good” (high-density lipoprotein, HDL) and “bad” (low-density lipoprotein, LDL) blood lipids, the underlying causal relationships suggest that, although LDL and triglycerides may have atherogenic characteristics, HDL-cholesterol is unlikely to play an important role in atherogenesis.

The MRDataChallenge provides a summary-level dataset which contains the associations of genotypes (comprising 148 SNPs) with lipid traits and the associations of the same genotypes with 7 outcomes (W. Spiller, Bowden, and Zuber 2019). Of the seven outcomes, we selected ischemic stroke to investigate the causal relationship of LDL and HDL lipid traits using the Mendelian randomization (MR) approach (Davey Smith and Ebrahim 2003). Figures 2.1 and 2.2 represent the DAGs for the proposed analysis.

Figure 2.1: Directed acyclic graph (DAG) of the MR analysis to investigate the effect of LDL on ischemic stroke.

Figure 2.2: DAG representing the MR analysis for the effect of HDL on ischemic stroke.

One cause of ischemic stroke is the development of atherosclerosis, in which a build-up of fatty deposits in the arterial wall leads to the development of a plaque, which can disrupt the supply of oxygenated blood to the brain. There is strong evidence that LDL-related lipid phenotypes are causally implicated in the aetiology of atherosclerosis and coronary heart disease.

In a recent comment piece, Holmes and Ala-Korpela (2019) discussed a size-dependent threshold whereby lipid species larger than small VLDLs are too large to enter the arterial intima, as shown in Figure 2.3. Therefore, we focused our investigation on the lipid species smaller than small VLDL, i.e. IDL and LDL. In addition, we wished to assess the causal relevance of HDL species.


Figure 2.3: Figure 1 of Holmes and Ala-Korpela (2019) showing which sizes of lipid trait enter the arterial intima.

3 Data

The MR Challenge data consists of summary data on the association of 148 genotypes with 118 lipid traits and 7 outcomes. We restricted the exposures analysed: our low-density lipoprotein (LDL) analysis uses 11 exposures and our high-density lipoprotein (HDL) analysis uses 14 lipid traits.

The atherogenic lipid traits we investigated are:

  • Concentration of IDL particles
  • Free cholesterol in IDL
  • Cholesterol esters in large LDL
  • Free cholesterol in large LDL
  • Phospholipids in IDL
  • Concentration of large LDL particles
  • Phospholipids in large LDL
  • Cholesterol esters in medium LDL
  • Concentration of medium LDL particles
  • Phospholipids in medium LDL
  • Concentration of small LDL particles.

The lipid traits related to HDL that we investigated are:

  • Cholesterol esters in large HDL
  • Concentration of large HDL particles
  • Phospholipids in large HDL
  • Cholesterol esters in medium HDL
  • Free cholesterol in medium HDL
  • Concentration of medium HDL particles
  • Phospholipids in medium HDL
  • Concentration of small HDL particles
  • Triglycerides in small HDL
  • Cholesterol esters in very large HDL
  • Free cholesterol in very large HDL
  • Concentration of very large HDL particles
  • Phospholipids in very large HDL
  • Triglycerides in very large HDL.

4 Analysis Methods

We used the inverse variance weighted (IVW) method to estimate the causal effect from summary-level data (Burgess, Butterworth, and Thompson 2013). The IVW model is given in equation (4.1), where for genotype \(j\), \(\hat{\Gamma}_j\) is the estimated genotype-outcome association, \(\hat{\gamma}_j\) is the estimated genotype-phenotype association, \(\sigma_{y_j}\) is the estimated standard error of the genotype-outcome association, and \(\beta\) is the causal effect of interest.

\[ \frac{\hat{\Gamma}_{j}}{\sigma_{y_j}} = \frac{\beta\hat{\gamma}_j}{\sigma_{y_j}} + \varepsilon_j, \quad \varepsilon_j \sim N(0,1) \tag{4.1} \]
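
Model (4.1) is a weighted regression through the origin, so its estimate has a closed form. The following is a minimal Python sketch of that closed form, not the authors' R code; the function name `ivw_estimate` and the three made-up genotypes are illustrative assumptions and are not taken from the challenge data.

```python
import math


def ivw_estimate(gamma_hat, Gamma_hat, se_outcome):
    """Fixed-effect IVW estimate of beta in model (4.1).

    Regressing Gamma_hat/se on gamma_hat/se through the origin gives
    beta = sum(w * gamma * Gamma) / sum(w * gamma^2), with w = 1 / se^2.
    """
    weights = [1.0 / s ** 2 for s in se_outcome]
    numerator = sum(w * g * G for w, g, G in zip(weights, gamma_hat, Gamma_hat))
    denominator = sum(w * g * g for w, g in zip(weights, gamma_hat))
    beta = numerator / denominator
    # With the residual variance fixed at 1, as in (4.1), Var(beta) = 1 / denominator.
    se_beta = math.sqrt(1.0 / denominator)
    return beta, se_beta


# Made-up associations for three genotypes (hypothetical values).
gamma_hat = [0.10, 0.20, 0.15]   # genotype-exposure associations
Gamma_hat = [0.02, 0.04, 0.03]   # genotype-outcome associations
se_outcome = [0.01, 0.02, 0.01]  # standard errors of Gamma_hat

beta, se_beta = ivw_estimate(gamma_hat, Gamma_hat, se_outcome)
print(f"beta = {beta:.4f}, SE = {se_beta:.4f}")
```

In this toy example every genotype-outcome association is exactly 0.2 times the genotype-exposure association, so the estimate recovers a slope of 0.2.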

We performed a sensitivity analysis for our IVW estimates using the MR-Egger model (Bowden, Davey Smith, and Burgess 2015). The MR-Egger model extends the IVW model (4.1) by including the average pleiotropic effect as an intercept, \(\beta_0\). By convention the residual variance \(\sigma^2\) is constrained to be at least 1, which usually means that the MR-Egger model gives a larger standard error for its estimated causal effect than the IVW model. Equation (4.2) gives the MR-Egger model.

\[ \frac{\hat{\Gamma}_{j}}{\sigma_{y_j}} = \frac{\beta_0}{\sigma_{y_j}} + \frac{\beta\hat{\gamma}_j}{\sigma_{y_j}} + \varepsilon_j, \quad \varepsilon_j \sim N(0,\sigma^2) \tag{4.2} \]

We also investigated causal effects of lipid traits adjusted for other traits using the multivariable MR (MVMR) method (Burgess, Dudbridge, and Thompson 2015) as shown in equation (4.3).

\[ \hat{\Gamma}_j = \beta_1\hat{\gamma}_{1,j} + \beta_2\hat{\gamma}_{2,j} + \varepsilon_j, \quad \varepsilon_j \sim N(0,\sigma^2). \tag{4.3} \]
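
Both (4.2) and (4.3) are two-parameter linear models with no additional intercept, so the point estimates can be sketched with one small least-squares helper. This is an illustrative Python sketch under stated assumptions: the helper names are hypothetical, (4.3) is fitted unweighted exactly as written (in practice MVMR is often weighted by the inverse variance of the outcome associations), and standard errors are omitted.

```python
def two_column_least_squares(x1, x2, y):
    """Solve y = b1*x1 + b2*x2 by least squares using the 2x2 normal equations."""
    s11 = sum(a * a for a in x1)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s22 = sum(b * b for b in x2)
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    if det == 0:
        # Guard against exactly collinear columns.
        raise ValueError("design columns are collinear")
    b1 = (s22 * t1 - s12 * t2) / det
    b2 = (s11 * t2 - s12 * t1) / det
    return b1, b2


def mr_egger(gamma_hat, Gamma_hat, se_outcome):
    """Point estimates (beta0, beta) of model (4.2)."""
    x1 = [1.0 / s for s in se_outcome]                    # intercept column, 1/sigma
    x2 = [g / s for g, s in zip(gamma_hat, se_outcome)]   # gamma/sigma
    y = [G / s for G, s in zip(Gamma_hat, se_outcome)]    # Gamma/sigma
    return two_column_least_squares(x1, x2, y)


def mvmr_two_exposures(gamma1_hat, gamma2_hat, Gamma_hat):
    """Point estimates (beta1, beta2) of model (4.3), unweighted as written."""
    return two_column_least_squares(gamma1_hat, gamma2_hat, Gamma_hat)


# Made-up data constructed so that the true parameters are known.
gamma = [0.1, 0.2, 0.3, 0.4]
se = [0.02, 0.01, 0.02, 0.01]
Gamma_egger = [0.01 + 0.5 * g for g in gamma]             # beta0 = 0.01, beta = 0.5
egger_intercept, egger_slope = mr_egger(gamma, Gamma_egger, se)

gamma2 = [0.3, 0.1, 0.4, 0.2]
Gamma_mv = [0.5 * a - 0.2 * b for a, b in zip(gamma, gamma2)]  # beta1 = 0.5, beta2 = -0.2
beta1, beta2 = mvmr_two_exposures(gamma, gamma2, Gamma_mv)
```

Dividing every term by \(\sigma_{y_j}\) is what turns the MR-Egger intercept into the column \(1/\sigma_{y_j}\) rather than a constant column of ones.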

To select which of the 148 genotypes to include in our analysis we took two approaches:

  1. We selected genotypes with genome-wide significant p-values (\(p < 5 \times 10^{-8}\)) for their association with the specific lipid trait of interest. These results are in Section 5.1;
  2. We also selected genotypes based upon their individual contribution towards the Q-statistic, for either the IVW or MR-Egger model (Bowden et al. 2018). In this case we selected genotypes with Q-statistic p-values \(\geq 0.05\). These results are in Section 5.2.
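
The two selection rules above can be sketched as simple filters. This Python sketch rests on assumptions that the text does not confirm: each genotype's Q contribution is taken as its squared standardised residual from the IVW fit, using first-order weights \(1/\sigma_{y_j}^2\), and is compared with a chi-square distribution on 1 degree of freedom. The weights used in the radial approach of Bowden et al. (2018) may differ, and all function names are hypothetical.

```python
import math

GWAS_P_THRESHOLD = 5e-8  # approach 1: genome-wide significance
Q_P_THRESHOLD = 0.05     # approach 2: keep genotypes with Q p-value >= 0.05


def select_by_gwas_p(exposure_pvalues, threshold=GWAS_P_THRESHOLD):
    """Indices of genotypes whose exposure association p-value is below threshold."""
    return [j for j, p in enumerate(exposure_pvalues) if p < threshold]


def chi2_1df_pvalue(q):
    """Upper-tail p-value of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(q / 2.0))


def select_by_q(gamma_hat, Gamma_hat, se_outcome, beta, threshold=Q_P_THRESHOLD):
    """Indices of genotypes whose individual Q contribution is not significant.

    beta is a previously fitted causal estimate (for example the IVW estimate).
    """
    kept = []
    for j, (g, G, s) in enumerate(zip(gamma_hat, Gamma_hat, se_outcome)):
        q_j = ((G - beta * g) / s) ** 2  # squared standardised residual
        if chi2_1df_pvalue(q_j) >= threshold:
            kept.append(j)
    return kept


# Made-up example: the third genotype lies far from the line Gamma = 0.5 * gamma.
kept_by_q = select_by_q([0.1, 0.2, 0.3], [0.05, 0.10, 0.30], [0.01, 0.01, 0.01], beta=0.5)
kept_by_p = select_by_gwas_p([1e-9, 0.01, 4e-8])
```

The second rule removes outlying genotypes rather than weak ones, which is why it can retain far more genotypes than the genome-wide significance rule.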

We estimated the causal effect of each selected lipid trait and then meta-analysed the causal effect estimates by molecular size and by lipid category.

5 Results

5.1 GWAS-significant genotype-phenotype associations

5.1.1 IVW estimates

IVW estimates of the causal effect of the LDL-related lipid fractions are shown in Table 5.1 and Figure 5.1. The SNPs column shows that the number of selected SNPs is small for every trait. The MR point estimates are positive, which means that on average a higher level of an LDL-related lipid is associated with a higher risk of ischemic stroke (the estimates are risk estimates on the log scale). However, all of the confidence intervals span the null. The positive point estimates concur with our scientific expectation that atherogenic lipids increase the risk of ischemic stroke, but more instruments would be needed to increase the statistical power of these estimates.

Table 5.1: IVW estimates for exposures related to LDLs (LI and UI are the lower and upper limits of the 95% confidence interval)
Exposures Estimate SE LI UI Pval SNPs
Free cholesterol in IDL 0.0168 0.0723 -0.1249 0.1584 0.8165 4
Concentration of IDL particles 0.0137 0.058 -0.1 0.1274 0.8131 5
Phospholipids in IDL 0.0178 0.0756 -0.1303 0.1659 0.8138 4
Cholesterol esters in large LDL 0.0446 0.0659 -0.0846 0.1738 0.4988 3
Free cholesterol in large LDL 0.0926 0.0863 -0.0765 0.2616 0.2831 4
Concentration of large LDL particles 0.0483 0.0651 -0.0792 0.1759 0.4578 3
Phospholipids in large LDL 0.0411 0.0742 -0.1043 0.1864 0.5797 6
Cholesterol esters in medium LDL 0.0423 0.0674 -0.0898 0.1744 0.5306 3
Concentration of medium LDL particles 0.0445 0.0675 -0.0879 0.1768 0.5101 3
Phospholipids in medium LDL 0.091 0.0923 -0.09 0.272 0.3247 4
Concentration of small LDL particles 0.0344 0.0738 -0.1102 0.179 0.6408 4

Figure 5.1: IVW Estimates for LDLs

Table 5.2 and Figure 5.2 show causal estimates for the lipid fractions related to HDLs. These point estimates are generally negative, with confidence intervals spanning the null. The traits with only one or two genotypes in the SNPs column are of interest because the MR-Egger model cannot be fitted for them owing to the low number of instruments.

Table 5.2: IVW estimates for exposures related to HDLs
Exposures Estimate SE LI UI Pval SNPs
Cholesterol esters in large HDL -0.0134 0.048 -0.1075 0.0807 0.7807 3
Concentration of large HDL particles -0.0179 0.0471 -0.1102 0.0745 0.7045 3
Phospholipids in large HDL -0.0222 0.0505 -0.1211 0.0767 0.66 3
Cholesterol esters in medium HDL -0.0399 0.1071 -0.2497 0.17 0.7096 1
Free cholesterol in medium HDL -0.0486 0.1306 -0.3045 0.2073 0.7096 1
Concentration of medium HDL particles -0.1014 0.1221 -0.3407 0.1379 0.4064 1
Phospholipids in medium HDL -0.1212 0.146 -0.4074 0.1649 0.4064 1
Concentration of small HDL particles -0.0225 0.0524 -0.1253 0.0802 0.6673 2
Triglycerides in small HDL -0.0268 0.0901 -0.2035 0.1498 0.7659 2
Cholesterol esters in very large HDL -0.0048 0.056 -0.1145 0.1049 0.932 2
Free cholesterol in very large HDL 0.0008 0.0482 -0.0937 0.0952 0.9875 3
Concentration of very large HDL particles -0.0042 0.0479 -0.098 0.0896 0.9297 2
Phospholipids in very large HDL -0.0032 0.0393 -0.0802 0.0738 0.9357 4
Triglycerides in very large HDL -0.0121 0.0385 -0.0875 0.0633 0.7524 2

Figure 5.2: IVW Estimates for HDLs

5.1.2 MR-Egger estimates

In this section, we use the MR-Egger model to perform sensitivity analysis for the IVW estimates. Table 5.3 shows estimates from the MR-Egger model for lipid fractions related to LDLs. In general the point estimates are positive and larger in magnitude than the IVW estimates. The one exception is the point estimate for phospholipids in medium LDL, which shows a negative causal estimate with a confidence interval spanning the null. We find no strong evidence against the null hypothesis of no pleiotropy, since the estimates of the intercepts (AvgPleio) are all close to the null with large p-values.

Table 5.3: MR-Egger estimates for LDL related phenotypes
Estimate SE LI UI Pval
Free cholesterol in IDL
AvgPleio -0.0156 0.0277 -0.0699 0.0387 0.5738
Causal 0.1787 0.2994 -0.4081 0.7654 0.5506
Concentration of IDL particles
AvgPleio -0.0151 0.0257 -0.0656 0.0353 0.5563
Causal 0.1674 0.2688 -0.3594 0.6942 0.5335
Phospholipids in IDL
AvgPleio -0.0196 0.0299 -0.0781 0.0390 0.5128
Causal 0.2321 0.3380 -0.4304 0.8945 0.4923
Cholesterol esters in large LDL
AvgPleio -0.0462 0.0627 -0.1691 0.0766 0.4608
Causal 0.5433 0.6840 -0.7973 1.8840 0.4270
Free cholesterol in large LDL
AvgPleio 0.0020 0.0339 -0.0645 0.0685 0.9531
Causal 0.0688 0.4174 -0.7493 0.8870 0.8690
Concentration of large LDL particles
AvgPleio -0.0451 0.0470 -0.1372 0.0471 0.3377
Causal 0.5262 0.5064 -0.4664 1.5188 0.2988
Phospholipids in large LDL
AvgPleio -0.0092 0.0268 -0.0618 0.0433 0.7307
Causal 0.1594 0.3535 -0.5334 0.8523 0.6519
Cholesterol esters in medium LDL
AvgPleio -0.0531 0.0828 -0.2152 0.1091 0.5214
Causal 0.6299 0.9232 -1.1795 2.4394 0.4950
Concentration of medium LDL particles
AvgPleio -0.0536 0.0701 -0.1911 0.0839 0.4447
Causal 0.6382 0.7838 -0.8979 2.1743 0.4155
Phospholipids in medium LDL
AvgPleio 0.0210 0.0606 -0.0977 0.1397 0.7287
Causal -0.1769 0.7799 -1.7056 1.3518 0.8206
Concentration of small LDL particles
AvgPleio -0.0046 0.0452 -0.0932 0.0840 0.9193
Causal 0.0853 0.5099 -0.9140 1.0846 0.8672

The sensitivity analysis of the HDL-related lipid traits in Table 5.4 shows negative point estimates with confidence intervals spanning the null. This would fit a narrative of HDL either being protective or having no effect on the risk of ischemic stroke; however, the evidence against the null hypothesis is very weak.

Table 5.4: MR-Egger estimates for HDL related phenotypes
Estimate SE LI UI Pval
Cholesterol esters in large HDL
AvgPleio 0.0407 0.0442 -0.0459 0.1274 0.3570
Causal -0.2675 0.2801 -0.8165 0.2814 0.3395
Concentration of large HDL particles
AvgPleio 0.0359 0.0328 -0.0284 0.1002 0.2741
Causal -0.2350 0.2040 -0.6348 0.1649 0.2495
Phospholipids in large HDL
AvgPleio 0.0282 0.0265 -0.0237 0.0800 0.2869
Causal -0.2019 0.1761 -0.5471 0.1433 0.2517
Free cholesterol in very large HDL
AvgPleio 0.0249 0.0285 -0.0310 0.0808 0.3819
Causal -0.1686 0.1996 -0.5599 0.2226 0.3983
Phospholipids in very large HDL
AvgPleio 0.0182 0.0253 -0.0314 0.0678 0.4714
Causal -0.1080 0.1508 -0.4037 0.1876 0.4737

5.1.3 Meta-analysis of exposure traits

In an attempt to increase statistical power, we performed a meta-analysis of the results by lipid category and molecular size.
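
The pooling step can be sketched as a fixed-effect inverse-variance meta-analysis. That model is an assumption here, since the text does not name the one used, although it does reproduce the free cholesterol row of Table 5.5 to rounding when applied to the two free cholesterol rows of Table 5.1. The function name is hypothetical, and the sketch treats the trait-specific estimates as independent, which is questionable when they share genotypes.

```python
import math

Z_975 = 1.959963984540054  # 97.5th percentile of the standard normal distribution


def fixed_effect_meta(estimates, std_errors):
    """Inverse-variance weighted pooled estimate with a 95% interval and p-value."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    lower = pooled - Z_975 * pooled_se
    upper = pooled + Z_975 * pooled_se
    # Two-sided p-value from the normal approximation.
    p_value = math.erfc(abs(pooled / pooled_se) / math.sqrt(2.0))
    return pooled, pooled_se, lower, upper, p_value


# Free cholesterol in IDL and in large LDL, as reported in Table 5.1.
pooled, pooled_se, lower, upper, p_value = fixed_effect_meta(
    [0.0168, 0.0926], [0.0723, 0.0863]
)
print(f"{pooled:.4f} {pooled_se:.4f} {lower:.4f} {upper:.4f} {p_value:.4f}")
```

Because the weights are the inverse variances, the pooled standard error is always smaller than the smallest input standard error, which is the sense in which pooling increases power.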

5.1.3.1 Risk factor categories

Table 5.5 shows results from the meta-analysis of the IVW estimates for the LDL-related risk factor categories. The pooled point estimates are null or positive, with confidence intervals spanning the null; given the small number of instruments, the causal estimates remain imprecise even after meta-analysis. The results for the HDL-related categories in Table 5.6 are similar: the pooled point estimates are null or negative, with confidence intervals spanning the null.

Table 5.5: Meta-analysis of IVW estimates from LDL related phenotypes
N Estimate SE LI UI Pval
Concentration 4 0.0337 0.0327 -0.0304 0.0978 0.3025
Phospholipids 3 0.0449 0.0459 -0.0452 0.1349 0.3288
Free Cholesterol 2 0.0481 0.0554 -0.0606 0.1567 0.3858
Cholesterol esters 2 0.0435 0.0471 -0.0489 0.1358 0.3562

Table 5.6: Meta-analysis of IVW estimates for HDL related phenotypes
N Mean SE LI UI Pval
Cholesterol Esters 3 -0.0129 0.0345 -0.0805 0.0547 0.7088
Free Cholesterol 2 -0.0051 0.0452 -0.0937 0.0835 0.9098
Concentration 4 -0.0189 0.0275 -0.0729 0.0351 0.4929
Phospholipids 3 -0.0203 0.0300 -0.0790 0.0385 0.4991
Triglycerides 2 -0.0144 0.0354 -0.0838 0.0550 0.6848

5.1.3.2 Sizes

In this section we present results from meta-analysing over the lipid traits for each size of molecule. LDL results are shown in Table 5.7. As with the risk factor categories in Table 5.5, the pooled estimates by size are null or positive, with no strong evidence against the null. Table 5.7 shows that the medium and large LDLs have point estimates of greater magnitude than the intermediate size, which matches our scientific rationale.

Table 5.7: Meta-analysis of IVW estimates for LDL sizes
N Mean SE LI UI P-val
Intermediate 3 0.0157 0.0388 -0.0604 0.0918 0.6864
Medium 3 0.0534 0.0424 -0.0296 0.1365 0.2073
Large 4 0.0531 0.0358 -0.0169 0.1232 0.1372

For the HDL sizes in Table 5.8 the pooled point estimates are negative and close to the null, and none is statistically significant. The pooled point estimate for the very large HDL molecules is closest to the null.

Table 5.8: Meta-analysis of IVW estimates for HDL Sizes
N Mean SE LI UI P-val
Small 2 -0.0236 0.0453 -0.1124 0.0652 0.6026
Medium 4 -0.0724 0.0620 -0.1940 0.0492 0.2431
Large 3 -0.0177 0.0280 -0.0725 0.0372 0.5273
Very Large 5 -0.0053 0.0200 -0.0444 0.0339 0.7913

5.1.4 Multivariate Meta-analysis of MR-Egger estimates

We also present equivalent meta-analysis estimates for our MR-Egger results, the difference being that we use multivariate meta-analysis because the MR-Egger model returns two parameters (slope and intercept). Too few MR-Egger estimates were available for the HDL-related phenotypes, so we could not report any pooled estimates for them.

The results for the risk factor categories in Table 5.9 show no strong evidence against the null of no pleiotropy. Among the LDL traits, cholesterol esters have the largest pooled slope. Table 5.10, for the LDL sizes, likewise shows no strong evidence against the null hypothesis of no pleiotropy. The summary point estimates are larger in magnitude than the IVW estimates in Table 5.7. The intermediate-sized traits returned a positive estimate with a small p-value, suggesting an increased risk of ischemic stroke.

Table 5.9: Results from Multivariate Meta-analysis of LDL related phenotypes
N Avg Pleio Slope Pval(Pleio) Pval(Est)
Free Cholesterol 4 -0.0086 0.1414 0.6902 0.5612
Concentration 3 -0.0213 0.2471 0.2737 0.2339
Phospholipids 2 -0.0104 0.1639 0.5824 0.4819
Cholesterol esters 2 -0.0487 0.5740 0.3295 0.2963

Table 5.10: Multivariate meta-analysis of MR-Egger estimates for LDL sizes
N Avg Pleio Estimate Pval(Pleio) Pval(Est)
Intermediate 3 -0.0165 0.1789 0.2995 0.0000
Medium 3 -0.0208 0.3346 0.6044 0.4806
Large 4 -0.0146 0.2469 0.4269 0.2721

5.2 Selecting genetic variants using Q-statistics

Because most of the results presented so far lacked statistical power, we sought to increase the number of SNPs used as instruments by selecting them using Q-statistics. This section presents results using genotypes selected on the basis of their contribution to the Q-statistic in the IVW models.

5.2.1 IVW estimates

The estimates from the IVW model are given in Tables 5.11 and 5.12.

The MR estimates in Table 5.11 are all positive, which agrees with our expectations, and are statistically significant (the confidence intervals exclude the null). However, a major limitation is that the SNPs used in the models are the same or largely overlapping, meaning that the MR estimates are unlikely to be valid. This is because the genotype-outcome associations are the same in each model and only the genotype-exposure associations vary between lipid traits, so that each model amounts to a simple rescaling of the same genotype-outcome associations. This makes it impossible to disentangle which trait has a true causal effect and which appears causal only because it shares the same SNPs. This issue has been discussed previously by Holmes, Ala-Korpela, and Davey Smith (2017).

Table 5.11: IVW estimates for exposures related to LDLs
Exposures Estimates SE LI UI Pval SNPs
Free cholesterol in IDL 0.0965 0.0338 0.0302 0.1629 0.0043 133
Concentration of IDL particles 0.0991 0.0326 0.0353 0.163 0.0024 133
Phospholipids in IDL 0.102 0.0341 0.0352 0.1688 0.0028 134
Cholesterol esters in large LDL 0.1068 0.0358 0.0366 0.177 0.0029 133
Free cholesterol in large LDL 0.0984 0.0361 0.0277 0.1692 0.0064 133
Concentration of large LDL particles 0.1039 0.0343 0.0367 0.1712 0.0025 133
Phospholipids in large LDL 0.0991 0.0359 0.0287 0.1695 0.0058 133
Cholesterol esters in medium LDL 0.1105 0.0363 0.0393 0.1818 0.0024 133
Concentration of medium LDL particles 0.107 0.0357 0.037 0.1769 0.0027 133
Phospholipids in medium LDL 0.0946 0.0373 0.0214 0.1678 0.0113 133
Concentration of small LDL particles 0.0993 0.0362 0.0284 0.1703 0.0061 133
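
The rescaling argument made before Table 5.11 can be illustrated with a limiting case. In this Python sketch, two hypothetical exposures are instrumented by the same genotypes and their genotype-exposure associations differ only by a constant factor `c`; the values are made up and exact proportionality is an idealisation of highly correlated traits. The IVW estimate for the second exposure is then the first estimate divided by `c`, with an identical z-statistic, so the data cannot distinguish the two traits.

```python
import math


def ivw(gamma_hat, Gamma_hat, se_outcome):
    """Fixed-effect IVW estimate and standard error (weights 1 / se^2)."""
    weights = [1.0 / s ** 2 for s in se_outcome]
    numerator = sum(w * g * G for w, g, G in zip(weights, gamma_hat, Gamma_hat))
    denominator = sum(w * g * g for w, g in zip(weights, gamma_hat))
    return numerator / denominator, math.sqrt(1.0 / denominator)


# The same genotype-outcome associations are used for both exposures.
Gamma_hat = [0.03, 0.05, 0.02, 0.07]
se_outcome = [0.01, 0.02, 0.01, 0.02]

gamma_a = [0.10, 0.25, 0.15, 0.30]   # exposure A (made-up values)
c = 2.0                              # hypothetical scaling factor
gamma_b = [c * g for g in gamma_a]   # exposure B, perfectly proportional to A

beta_a, se_a = ivw(gamma_a, Gamma_hat, se_outcome)
beta_b, se_b = ivw(gamma_b, Gamma_hat, se_outcome)

z_a = beta_a / se_a
z_b = beta_b / se_b
print(f"beta_a = {beta_a:.4f}, beta_b = {beta_b:.4f}, z_a = {z_a:.3f}, z_b = {z_b:.3f}")
```

With real, imperfectly correlated traits the z-statistics are similar rather than identical, which is consistent with the near-constant estimates and p-values across the rows of Table 5.11.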

Table 5.12 shows causal estimates for the HDL-related lipid traits with the increased number of SNPs. The point estimates are generally null or negative with small p-values. Triglycerides in small HDL show a positive, statistically significant causal effect on ischemic stroke. We show the MR-Egger model for this estimate below. Precisely the same issue as noted above applies in this setting, where the SNPs used as IVs overlap between exposures, meaning that the effect estimates from MR are unlikely to be valid.

Table 5.12: IVW estimates for exposures related to HDLs
Exposures Estimates SE LI UI Pval SNPs
Cholesterol esters in large HDL -0.0967 0.0301 -0.1557 -0.0378 0.0013 134
Concentration of large HDL particles -0.0879 0.0302 -0.1471 -0.0288 0.0036 133
Phospholipids in large HDL -0.0924 0.0312 -0.1535 -0.0313 0.0031 133
Cholesterol esters in medium HDL -0.1651 0.0419 -0.2472 -0.083 0.0001 135
Free cholesterol in medium HDL -0.1531 0.0442 -0.2396 -0.0665 0.0005 135
Concentration of medium HDL particles -0.1391 0.0447 -0.2267 -0.0515 0.0019 135
Phospholipids in medium HDL -0.1567 0.0454 -0.2456 -0.0678 0.0006 135
Concentration of small HDL particles -0.0656 0.0409 -0.1458 0.0146 0.1091 135
Triglycerides in small HDL 0.1583 0.0411 0.0778 0.2387 0.0001 134
Cholesterol esters in very large HDL -0.0832 0.0385 -0.1586 -0.0078 0.0305 131
Free cholesterol in very large HDL -0.0827 0.0348 -0.1509 -0.0145 0.0174 132
Concentration of very large HDL particles -0.0692 0.0348 -0.1374 -0.001 0.0467 131
Phospholipids in very large HDL -0.0655 0.0299 -0.1242 -0.0068 0.0288 132
Triglycerides in very large HDL 0.026 0.0301 -0.033 0.085 0.3874 134

5.2.2 MR-Egger estimates

Table 5.13 shows results from the MR-Egger model for the lipid fractions related to LDLs. The results show no strong evidence against no pleiotropy with values from the AvgPleio being close to the null with large p-values.

Note that precisely the same issue as noted above applies in this setting where SNPs used in the IVs overlap between exposures, meaning that the effect estimates from MR are unlikely to be valid.

Table 5.13: MR-Egger estimates for LDLs
Estimate SE LI UI Pval
Free cholesterol in IDL
AvgPleio 0.0010 0.0014 -0.0016 0.0037 0.4465
Causal 0.0689 0.0497 -0.0285 0.1662 0.1655
Concentration of IDL particles
AvgPleio 0.0019 0.0013 -0.0007 0.0045 0.1482
Causal 0.0501 0.0471 -0.0421 0.1423 0.2872
Phospholipids in IDL
AvgPleio 0.0025 0.0014 -0.0002 0.0051 0.0689
Causal 0.0349 0.0502 -0.0635 0.1333 0.4873
Cholesterol esters in large LDL
AvgPleio 0.0006 0.0014 -0.0021 0.0032 0.6729
Causal 0.0906 0.0525 -0.0123 0.1935 0.0844
Free cholesterol in large LDL
AvgPleio 0.0018 0.0014 -0.0009 0.0044 0.1874
Causal 0.0474 0.0529 -0.0564 0.1512 0.3706
Concentration of large LDL particles
AvgPleio 0.0010 0.0013 -0.0016 0.0037 0.4469
Causal 0.0763 0.0500 -0.0217 0.1742 0.1268
Phospholipids in large LDL
AvgPleio 0.0013 0.0014 -0.0013 0.0040 0.3232
Causal 0.0608 0.0528 -0.0427 0.1643 0.2495
Cholesterol esters in medium LDL
AvgPleio 0.0004 0.0014 -0.0023 0.0031 0.7494
Causal 0.0978 0.0540 -0.0080 0.2036 0.0701
Concentration of medium LDL particles
AvgPleio 0.0000 0.0014 -0.0027 0.0027 0.9774
Causal 0.1058 0.0532 0.0016 0.2101 0.0465
Phospholipids in medium LDL
AvgPleio 0.0015 0.0014 -0.0013 0.0042 0.2917
Causal 0.0506 0.0560 -0.0591 0.1604 0.3657
Concentration of small LDL particles
AvgPleio 0.0003 0.0014 -0.0024 0.0030 0.8324
Causal 0.0907 0.0546 -0.0162 0.1976 0.0965

Table 5.14 shows that some of the average pleiotropy estimates are statistically significant and others are not; however, all are close to the null. The MR-Egger point estimate for triglycerides in small HDL is larger than its IVW estimate, and both are positive. There was no evidence of pleiotropy for this trait. Figure 5.3 shows the association of triglycerides in small HDL and ischemic stroke. Note that precisely the same issue as noted above applies in this setting, where the SNPs used as IVs overlap between exposures, meaning that the effect estimates from MR are unlikely to be valid.

Table 5.14: MR-Egger estimates for HDLs
Estimate SE LI UI Pval
Cholesterol esters in large HDL
AvgPleio -0.0043 0.0013 -0.0067 -0.0018 0.0006
Causal -0.0016 0.0409 -0.0818 0.0787 0.9695
Concentration of large HDL particles
AvgPleio -0.0029 0.0012 -0.0053 -0.0004 0.0213
Causal -0.0258 0.0405 -0.1051 0.0535 0.5236
Phospholipids in large HDL
AvgPleio -0.0040 0.0013 -0.0065 -0.0016 0.0014
Causal -0.0004 0.0424 -0.0836 0.0827 0.9918
Cholesterol esters in medium HDL
AvgPleio -0.0028 0.0013 -0.0053 -0.0002 0.0328
Causal -0.0756 0.0593 -0.1918 0.0406 0.2024
Free cholesterol in medium HDL
AvgPleio -0.0016 0.0013 -0.0041 0.0009 0.2164
Causal -0.0993 0.0620 -0.2208 0.0222 0.1094
Concentration of medium HDL particles
AvgPleio -0.0019 0.0012 -0.0044 0.0005 0.1190
Causal -0.0754 0.0605 -0.1940 0.0433 0.2130
Phospholipids in medium HDL
AvgPleio -0.0035 0.0013 -0.0060 -0.0010 0.0054
Causal -0.0362 0.0627 -0.1591 0.0866 0.5632
Concentration of small HDL particles
AvgPleio -0.0003 0.0012 -0.0026 0.0020 0.7868
Causal -0.0576 0.0505 -0.1567 0.0414 0.2539
Triglycerides in small HDL
AvgPleio -0.0004 0.0014 -0.0032 0.0024 0.7743
Causal 0.1719 0.0629 0.0486 0.2952 0.0063
Cholesterol esters in very large HDL
AvgPleio -0.0009 0.0012 -0.0032 0.0015 0.4675
Causal -0.0605 0.0496 -0.1577 0.0367 0.2226
Free cholesterol in very large HDL
AvgPleio -0.0032 0.0012 -0.0054 -0.0009 0.0067
Causal -0.0117 0.0435 -0.0971 0.0736 0.7875
Concentration of very large HDL particles
AvgPleio -0.0015 0.0012 -0.0038 0.0008 0.2037
Causal -0.0362 0.0434 -0.1213 0.0490 0.4051
Phospholipids in very large HDL
AvgPleio -0.0032 0.0012 -0.0055 -0.0010 0.0053
Causal -0.0021 0.0376 -0.0758 0.0716 0.9561
Triglycerides in very large HDL
AvgPleio -0.0007 0.0012 -0.0029 0.0016 0.5665
Causal 0.0382 0.0369 -0.0341 0.1104 0.3007

Figure 5.3: Association of triglycerides in small HDLs and ischemic stroke. Caution should be applied in interpreting these data as the SNPs used are pleiotropic.

5.2.3 Meta-analysis of exposure traits

Selection of SNPs through their individual contribution to the Q-statistics yielded a large number of instruments and therefore statistical power. As before, we sought to combine categories of traits to increase power further.

The meta-analysis estimates for the LDL risk factor categories in Table 5.15 are statistically significant, with the estimate for cholesterol esters being the largest in magnitude. The results for HDLs in Table 5.16 are similar in terms of statistical significance, apart from the triglycerides category, which shows a positive effect estimate with a large p-value.

Note that precisely the same issue as noted above applies in this setting where SNPs used in the IVs overlap between exposures, meaning that the effect estimates from MR are unlikely to be valid.

Table 5.15: Meta-analysis of risk factors for LDLs using IVW estimates
N Estimate SE LI UI Pval
Free Cholesterol 2 0.0974 0.0247 0.0490 0.1457 0.0001
Concentration 4 0.1022 0.0173 0.0683 0.1361 0.0000
Phospholipids 3 0.0988 0.0206 0.0584 0.1392 0.0000
Cholesterol esters 2 0.1086 0.0255 0.0587 0.1586 0.0000

Table 5.16: Meta-analysis of risk factors for HDLs using IVW estimates
N Estimate SE LI UI Pval
Cholesterol esters 3 -0.1095 0.0209 -0.1505 -0.0686 0.0000
Concentration 4 -0.0869 0.0182 -0.1225 -0.0512 0.0000
Phospholipids 3 -0.1023 0.0207 -0.1429 -0.0618 0.0000
Triglycerides 2 0.0892 0.0661 -0.0403 0.2187 0.1771

5.2.4 Multivariate Meta-analysis of MR-Egger estimates

We explored the results from the MR-Egger model using multivariate meta-analysis, performed on the risk factor categories for LDLs (Table 5.17) and HDLs (Table 5.18). Table 5.17 shows no statistically significant intercept estimates except for the phospholipids category. The slopes for the cholesterol esters and concentration categories are statistically significant.

In Table 5.18, neither the intercept nor the slope for the triglycerides category is statistically significant. The other risk factor categories have statistically significant intercept estimates and non-significant slopes.

We note that the same SNPs have been used multiple times, once for each exposure, and the resulting estimates then fitted in a multivariate meta-analysis model. To our minds the multivariate meta-analysis of these MR-Egger estimates cannot yield valid causal estimates, given that in effect the same SNPs are used to instrument different exposures.

Table 5.17: Results from Multivariate Meta-analysis of LDL risk factor categories
N Avg Pleio Slope Pval(Pleio) Pval(Est)
Free Cholesterol 4 0.0013 0.0616 0.9504 0.8002
Concentration 3 0.0008 0.0785 0.2150 0.0020
Phospholipids 2 0.0018 0.0482 0.0256 0.1140
Cholesterol esters 2 0.0005 0.0941 0.5997 0.0124

Table 5.18: Results from Multivariate Meta-analysis of HDL risk factor categories
N Avg Pleio Slope Pval(Pleio) Pval(Est)
Cholesterol esters 3 -0.0026 -0.0398 0.0127 0.2132
Concentration 4 -0.0016 -0.0433 0.0078 0.0661
Phospholipids 3 -0.0035 -0.0117 0.0000 0.6685
Triglycerides 2 -0.0005 0.0954 0.5480 0.1498

5.3 MVMR of Concentration

We were additionally interested in whether mutually adjusting for lipid traits in our IVW models would affect our causal effects. We therefore fitted MVMR models as shown in Figure 5.4.

Figure 5.4: Directed acyclic graph (DAG) of Multivariable MR analysis of concentration risk factor categories in LDLs.

We fitted MVMR models using the MVMR package (W. Spiller, Bowden, and Sanderson 2019). We investigated invalid instruments using each genotype’s contribution to the adjusted Q-statistic (Sanderson et al. 2018) and only found 2 outlying genotypes, hence for simplicity we performed these analyses using all 148 genotypes.

5.3.1 LDL

The results of the MVMR analysis for the concentration of LDL particles are shown in Table 5.19. The concentrations of IDL and medium LDL particles have negative causal estimates, with the estimate for the concentration of IDL particles having a small p-value.

The column of Q-statistics (Q) shows evidence of heterogeneity. In the MVMR context, this suggests that there is enough information from the instruments to predict the exposures. The test based on the adjusted Q statistic (\(p(Q_A) = 0.99\)) shows evidence of pleiotropy, which can suggest that there are instruments which are related to the outcome through other pathways (more exposures).

Note that precisely the same issue as noted above applies in this setting where SNPs used in the IVs overlap between exposures, meaning that the effect estimates from MVMR are unlikely to be valid.

Table 5.19: IVW Estimates from Multivariable MR for LDLs using 148 SNPs
Estimate Std. Error t value Pr(>|t|) Q
Concentration of IDL particles -1.7362 0.2718 -6.3866 0.0000 10.5582
Concentration of Large LDL particles 2.1343 0.3712 5.7491 0.0000 93.5366
Concentration of Medium LDL particles -0.0861 0.3069 -0.2807 0.7793 105.3149
Concentration of Small LDL particles 0.8309 0.0900 9.2292 0.0000 1187.1912

5.3.2 HDL

The results from the MVMR analysis for concentrations of HDL particles are shown in Table 5.20. These show positive causal estimates for medium and very large HDL; however, the estimates should be interpreted with caution. The Q-statistic column suggests that there is enough information from the instruments to predict the exposure variables. The adjusted Q-statistic for this model gave a p-value of 0.88, which suggests that there are instruments which affect ischemic stroke through other exposures.

Note that precisely the same issue as noted above applies in this setting where SNPs used in the IVs overlap between exposures, meaning that the effect estimates from MVMR are unlikely to be valid.

Table 5.20: IVW Estimates from Multivariable MR for HDLs using 148 SNPs
Estimate Std. Error t value Pr(>|t|) Q
Concentration of Large HDL particles -0.1188 0.1611 -0.7371 0.4622 26.0627
Concentration of Medium HDL particles 0.5846 0.1928 3.0329 0.0029 29.2731
Concentration of Small HDL particles -0.2440 0.1108 -2.2026 0.0292 33.5726
Concentration of Very Large HDL particles 0.5252 0.1097 4.7892 0.0000 28.5134

6 Discussion and Conclusions

We have presented estimates of the effect of different traits related to LDL and HDL on the risk of ischemic stroke and attempted to group traits based on characteristics of lipid or size.

Selecting instruments for each trait from SNPs that surpass the conventional genome-wide significance threshold yielded too little statistical power to draw meaningful conclusions.

In contrast, selecting instruments based on their individual contribution to the Q-statistics led to instruments consisting of many more SNPs and therefore greater statistical power. In this setting, we identified many strong MR associations, which could be construed as evidence of causation. However, this approach has a major flaw: the same SNPs were used as IVs for multiple exposures. In this context, it becomes impossible to disentangle which of the traits are causal, because in essence the genotype-disease association is fixed and all that varies in each MR model is the genotype-exposure association. Furthermore, in the context of MVMR, we cannot see how using the same SNPs for multiple exposures in the same model can lead to valid MR estimates.

Thus, this MR data challenge nicely demonstrates a key difficulty in conducting MR of highly correlated traits. Selecting SNPs on the basis of GWAS significance derived from modestly sized GWAS is likely to lead to poorly performing instruments because of a lack of statistical power. Including more SNPs through more permissive entry criteria, especially when only 148 SNPs are available in total across the lipid traits as in this data example, means that in essence we use the same SNPs multiple times, so we cannot use MR in either the univariable or multivariable setting to disentangle which traits are driving the causal relationship. We are therefore not able to draw any meaningful conclusions about the causal role of these traits in ischemic stroke. Our analyses highlight a potential misuse of genetic instruments, in which the same instruments are used for multiple exposures, which might lead to erroneous interpretations.

7 Software

The R Markdown code is available by clicking on the Code button in the top righthand corner of this document.

References

Bowden, Jack, George Davey Smith, and Stephen Burgess. 2015. “Mendelian Randomization with Invalid Instruments: Effect Estimation and Bias Detection Through Egger Regression.” International Journal of Epidemiology 44 (2): 512–25.

Bowden, Jack, Wesley Spiller, Fabiola Del Greco M, Nuala Sheehan, John Thompson, Cosetta Minelli, and George Davey Smith. 2018. “Improving the Visualization, Interpretation and Analysis of Two-Sample Summary Data Mendelian Randomization via the Radial Plot and Radial Regression.” International Journal of Epidemiology 47 (4): 1264–78.

Burgess, Stephen, Adam Butterworth, and Simon G Thompson. 2013. “Mendelian Randomization Analysis with Multiple Genetic Variants Using Summarized Data.” Genetic Epidemiology 37 (7): 658–65.

Burgess, Stephen, Frank Dudbridge, and Simon G Thompson. 2015. “Re: ‘Multivariable Mendelian Randomization: The Use of Pleiotropic Genetic Variants to Estimate Causal Effects.’” American Journal of Epidemiology 181 (4): 290–91.

Davey Smith, G., and S. Ebrahim. 2003. “‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease?” International Journal of Epidemiology 32 (1): 1–22.

Holmes, Michael V, and Mika Ala-Korpela. 2019. “What is ‘LDL cholesterol’?” Nature Reviews Cardiology 16: 197–98.

Holmes, Michael V., Mika Ala-Korpela, and George Davey Smith. 2017. “Mendelian randomization in cardiometabolic disease: challenges in evaluating causality.” Nature Reviews Cardiology 14: 577–90. https://doi.org/10.1038/nrcardio.2017.78.

Holmes, M. V., and G. Davey Smith. 2018. “Challenges in Interpreting Multivariable Mendelian Randomization: Might ‘Good Cholesterol’ Be Good After All?” 71 (2): 149–53.

Sanderson, Eleanor, George Davey Smith, Frank Windmeijer, and Jack Bowden. 2018. “An examination of multivariable Mendelian randomization in the single-sample and two-sample summary data settings.” International Journal of Epidemiology, December. https://doi.org/10.1093/ije/dyy262.

Spiller, Wes, Jack Bowden, and Eleanor Sanderson. 2019. MVMR: R package to perform multivariable Mendelian randomization analyses. https://github.com/WSpiller/MVMR.

Spiller, Wes, Jack Bowden, and Verena Zuber. 2019. MRChallenge2019: An R package containing data for the MR Conference Data Challenge 2019. https://github.com/WSpiller/MRChallenge2019.

