This document provides a brief tutorial to analyzing twin data using the mets package:

\(\newcommand{\cov}{\mathbb{C}\text{ov}} \newcommand{\cor}{\mathbb{C}\text{or}} \newcommand{\var}{\mathbb{V}\text{ar}} \newcommand{\E}{\mathbb{E}} \newcommand{\unitfrac}[2]{#1/#2} \newcommand{\n}{}\) ’ The development version may be installed from github:

# install.packages("remotes")
remotes::install_github("kkholst/mets", dependencies="Suggests")

Twin analysis, continuous traits

In the following we examine the heritability of Body Mass Indexkorkeila_bmi_1991hjelmborg_bmi_2008, based on data on self-reported BMI-values from a random sample of 11,411 same-sex twins. First, we will load data

data("twinbmi")
head(twinbmi)
#>   tvparnr      bmi      age gender zyg id num
#> 1       1 26.33289 57.51212   male  DZ  1   1
#> 2       1 25.46939 57.51212   male  DZ  1   2
#> 3       2 28.65014 56.62696   male  MZ  2   1
#> 5       3 28.40909 57.73097   male  DZ  3   1
#> 7       4 27.25089 53.68683   male  DZ  4   1
#> 8       4 28.07504 53.68683   male  DZ  4   2

The data is on long format with one subject per row.

  • tvparnr: twin id
  • bmi: Body Mass Index (\(\mathrm{kg}/{\mathrm{m}^2}\))
  • age: Age (years)
  • gender: Gender factor (male,female)
  • zyg: zygosity (MZ, DZ)

We transpose the data allowing us to do pairwise analyses

twinwide <- fast.reshape(twinbmi, id="tvparnr",varying=c("bmi"))
head(twinwide)
#>    tvparnr     bmi1      age gender zyg id num     bmi2
#> 1        1 26.33289 57.51212   male  DZ  1   1 25.46939
#> 3        2 28.65014 56.62696   male  MZ  2   1       NA
#> 5        3 28.40909 57.73097   male  DZ  3   1       NA
#> 7        4 27.25089 53.68683   male  DZ  4   1 28.07504
#> 9        5 27.77778 52.55838   male  DZ  5   1       NA
#> 11       6 28.04282 52.52231   male  DZ  6   1 22.30936

Next we plot the association within each zygosity group

We here show the log-transformed data which is slightly more symmetric and more appropiate for the twin analysis (see Figure @ref(fig:scatter1) and @ref(fig:scatter2))

mz <- log(subset(twinwide, zyg=="MZ")[,c("bmi1","bmi2")])
scatterdens(mz)
Scatter plot of logarithmic BMI measurements in MZ twins

Scatter plot of logarithmic BMI measurements in MZ twins

dz <- log(subset(twinwide, zyg=="DZ")[,c("bmi1","bmi2")])
scatterdens(dz)
Scatter plot of logarithmic BMI measurements in DZ twins

Scatter plot of logarithmic BMI measurements in DZ twins

The plots and raw association measures shows considerable stronger dependence in the MZ twins, thus indicating genetic influence of the trait

cor.test(mz[,1],mz[,2], method="spearman")
#> 
#>  Spearman's rank correlation rho
#> 
#> data:  mz[, 1] and mz[, 2]
#> S = 165457624, p-value < 2.2e-16
#> alternative hypothesis: true rho is not equal to 0
#> sample estimates:
#>       rho 
#> 0.6956209
cor.test(dz[,1],dz[,2], method="spearman")
#> 
#>  Spearman's rank correlation rho
#> 
#> data:  dz[, 1] and dz[, 2]
#> S = 2162514570, p-value < 2.2e-16
#> alternative hypothesis: true rho is not equal to 0
#> sample estimates:
#>       rho 
#> 0.4012686

Ńext we examine the marginal distribution (GEE model with working independence)

l0 <- lm(bmi ~ gender + I(age-40), data=twinbmi)
estimate(l0, id=twinbmi$tvparnr)
#>             Estimate  Std.Err    2.5%   97.5%    P-value
#> (Intercept)  23.3687 0.054534 23.2618 23.4756  0.000e+00
#> gendermale    1.4077 0.073216  1.2642  1.5512  2.230e-82
#> I(age - 40)   0.1177 0.004787  0.1083  0.1271 1.499e-133
library("splines")
l1 <- lm(bmi ~ gender*ns(age,3), data=twinbmi)
marg1 <- estimate(l1, id=twinbmi$tvparnr)
dm <- Expand(twinbmi,
        bmi=0,
        gender=c("male"),
        age=seq(33,61,length.out=50))
df <- Expand(twinbmi,
        bmi=0,
        gender=c("female"),
        age=seq(33,61,length.out=50))

plot(marg1, function(p) model.matrix(l1,data=dm)%*%p,
     data=dm["age"], ylab="BMI", xlab="Age",
     ylim=c(22,26.5))
plot(marg1, function(p) model.matrix(l1,data=df)%*%p,
     data=df["age"], col="red", add=TRUE)
legend("bottomright", c("Male","Female"),
       col=c("black","red"), lty=1, bty="n")
Marginal association between BMI and Age for males and females.

Marginal association between BMI and Age for males and females.

Polygenic model

We can decompose the trait into the following variance components

  • \(A\): Additive genetic effects of alleles
  • \(D\): Dominante genetic effects of alleles
  • \(C\): Shared environmental effects
  • \(E\): Unique environmental genetic effects

Dissimilarity of MZ twins arises from unshared environmental effects only, \(\cor(E_1,E_2)=0\) and

dd <- na.omit(twinbmi)
l0 <- twinlm(bmi ~ age+gender, data=dd,
            DZ="DZ", zyg="zyg", id="tvparnr", type="sat")
l <- twinlm(bmi ~ ns(age,1)+gender, data=twinbmi,
           DZ="DZ", zyg="zyg", id="tvparnr", type="cor", missing=TRUE)
summary(l)
#> ____________________________________________________
#> Group 1
#>                         Estimate Std. Error   Z value Pr(>|z|)
#> Regressions:                                                  
#>    bmi.1~ns(age, 1).1    4.16937    0.16669  25.01334   <1e-12
#>    bmi.1~gendermale.1    1.41160    0.07284  19.37839   <1e-12
#> Intercepts:                                                   
#>    bmi.1                22.53618    0.07296 308.87100   <1e-12
#> Additional Parameters:                                        
#>    log(var)              2.44580    0.01425 171.68256   <1e-12
#>    atanh(rhoMZ)          0.78217    0.02290  34.16186   <1e-12
#> ____________________________________________________
#> Group 2
#>                         Estimate Std. Error   Z value Pr(>|z|)
#> Regressions:                                                  
#>    bmi.1~ns(age, 1).1    4.16937    0.16669  25.01334   <1e-12
#>    bmi.1~gendermale.1    1.41160    0.07284  19.37839   <1e-12
#> Intercepts:                                                   
#>    bmi.1                22.53618    0.07296 308.87100   <1e-12
#> Additional Parameters:                                        
#>    log(var)              2.44580    0.01425 171.68256   <1e-12
#>    atanh(rhoDZ)          0.29924    0.01848  16.19580   <1e-12
#> 
#>                        Estimate 2.5%    97.5%  
#> Correlation within MZ: 0.65395  0.62751 0.67889
#> Correlation within DZ: 0.29061  0.25712 0.32341
#> 
#> 'log Lik.' -29020.12 (df=6)
#> AIC: 58052.24 
#> BIC: 58093.29

A formal test of genetic effects can be obtained by comparing the MZ and DZ correlation:

estimate(l,contr(5:6,6))
#>                           Estimate Std.Err   2.5%  97.5%   P-value
#> [atanh(rhoMZ)@1] - [a....   0.4829 0.04176 0.4011 0.5648 6.325e-31
#> 
#>  Null Hypothesis: 
#>   [atanh(rhoMZ)@1] - [atanh(rhoDZ)@3] = 0
l <- twinlm(bmi ~ ns(age,1)+gender, data=twinbmi,
           DZ="DZ", zyg="zyg", id="tvparnr", type="cor", missing=TRUE)
summary(l)
#> ____________________________________________________
#> Group 1
#>                         Estimate Std. Error   Z value Pr(>|z|)
#> Regressions:                                                  
#>    bmi.1~ns(age, 1).1    4.16937    0.16669  25.01334   <1e-12
#>    bmi.1~gendermale.1    1.41160    0.07284  19.37839   <1e-12
#> Intercepts:                                                   
#>    bmi.1                22.53618    0.07296 308.87100   <1e-12
#> Additional Parameters:                                        
#>    log(var)              2.44580    0.01425 171.68256   <1e-12
#>    atanh(rhoMZ)          0.78217    0.02290  34.16186   <1e-12
#> ____________________________________________________
#> Group 2
#>                         Estimate Std. Error   Z value Pr(>|z|)
#> Regressions:                                                  
#>    bmi.1~ns(age, 1).1    4.16937    0.16669  25.01334   <1e-12
#>    bmi.1~gendermale.1    1.41160    0.07284  19.37839   <1e-12
#> Intercepts:                                                   
#>    bmi.1                22.53618    0.07296 308.87100   <1e-12
#> Additional Parameters:                                        
#>    log(var)              2.44580    0.01425 171.68256   <1e-12
#>    atanh(rhoDZ)          0.29924    0.01848  16.19580   <1e-12
#> 
#>                        Estimate 2.5%    97.5%  
#> Correlation within MZ: 0.65395  0.62751 0.67889
#> Correlation within DZ: 0.29061  0.25712 0.32341
#> 
#> 'log Lik.' -29020.12 (df=6)
#> AIC: 58052.24 
#> BIC: 58093.29

Twin analysis, censored outcomes

Twin analysis, binary traits

Time to event

Bibliography

[korkeila_bmi_1991] Korkeila, Kaprio, Rissanen & Koskenvuo, Effects of gender and age on the heritability of body mass index, Int J Obes, 15(10), 647-654 (1991).

[hjelmborg_bmi_2008] Hjelmborg, Fagnani, Silventoinen, McGue, Korkeila, Christensen, Rissanen & Kaprio, Genetic influences on growth traits of BMI: a longitudinal study of adult twins, Obesity (Silver Spring), 16(4), 847-852 (2008).