9 Model testing

You might be interested in comparing different models, which is a common approach to modelisation in biology. However, there is a slight twist that you need to be aware of with PGLS.

The default method for model fitting with gls is restricted maximum likelihood estimation (REML), obtained by method="REML". This is different than standard maximum likelihood estimation (ML), which can be obtained with method="ML". The difference between these is complex, but suffice to say that they differ in the way the variance parameters are estimated. REML provides less biased parameter estimates and is the preferred method to report the parameter coefficients in a publication. It is also the method of choice if you want to compare models with different correlation (or variance) structures (Zuur et al. 2009). For example, if you want to test whether a PGLS model with an optimized Pagel’s \(\lambda\) fits the data better than a model with no phylogenetic correlation (that is, with Pagel \(\lambda=0\)):

pagel.0 <- gls(Shade ~ Wd, data = seedplantsdata, 
               correlation=corPagel(0,phy=seedplantstree, 
                                    fixed=TRUE, form=~Code), 
               method="REML")
pagel.fit <- gls(Shade ~ Wd, data = seedplantsdata, 
                 correlation=corPagel(0.8,phy=seedplantstree, 
                                      fixed=FALSE, form=~Code),
                 method="REML")
anova(pagel.0,pagel.fit)
##           Model df      AIC     BIC    logLik   Test  L.Ratio p-value
## pagel.0       1  3 180.4720 186.494 -87.23602                        
## pagel.fit     2  4 163.3967 171.426 -77.69833 1 vs 2 19.07537  <.0001

You can use the AIC or BIC to compare the model, or the likelihood ratio test. You can see here that the PGLS model with a fitted Pagel \(\lambda\) has a better fit than the one with a \(\lambda=0\) (smaller AIC). By the way, this is also a test of whether a PGLS model is better than a standard regression model as a corPagel structure with \(\lambda=0\) is a standard model (= no phylogenetic correlation). In this case, the model with the corPagel structure has a clearly better fit than the model without a phylogenetic structure.

Now, if you are interested in testing the fixed parameters in the model, you need to use maximum likelihood fitting (Zuur et al. 2009). For instance, if you want to use a likelihood ratio test to test the model with wood density as independent variable versus a null model with just the intercept, you can do the following.

wd <- gls(Shade ~ Wd, data = seedplantsdata,
          correlation=corBrownian(phy=seedplantstree, form=~Code), 
          method="ML")
null <- gls(Shade ~ 1, data = seedplantsdata,
            correlation=corBrownian(phy=seedplantstree, form=~Code), 
            method="ML")
anova(wd,null)
##      Model df      AIC      BIC    logLik   Test  L.Ratio p-value
## wd       1  3 222.0088 228.1380 -108.0044                        
## null     2  2 226.4988 230.5848 -111.2494 1 vs 2 6.489907  0.0108

You can see the model with the wood density variable is better than the model with only the intercept. However, as mentioned above, because the REML fitting provides better parameter estimates, you would have to refit the model using REML to present the results.

wd.final <- gls(Shade ~ Wd, data = seedplantsdata,
                correlation=corBrownian(phy=seedplantstree, form=~Code), 
                method="REML")
summary(wd.final)
## Generalized least squares fit by REML
##   Model: Shade ~ Wd 
##   Data: seedplantsdata 
##        AIC      BIC    logLik
##   214.3762 220.3982 -104.1881
## 
## Correlation Structure: corBrownian
##  Formula: ~Code 
##  Parameter estimate(s):
## numeric(0)
## 
## Coefficients:
##                Value Std.Error   t-value p-value
## (Intercept) 0.911433  4.409058 0.2067184  0.8370
## Wd          4.361028  1.693349 2.5753865  0.0127
## 
##  Correlation: 
##    (Intr)
## Wd -0.166
## 
## Standardized residuals:
##         Min          Q1         Med          Q3         Max 
## -0.26890642 -0.16431866 -0.02645422  0.09638984  0.34953444 
## 
## Residual standard error: 7.455109 
## Degrees of freedom: 57 total; 55 residual

References

Zuur, Alain F, Elena N Ieno, Neil J Walker, Anatoly A Saveliev, Graham M Smith, et al. 2009. Mixed Effects Models and Extensions in Ecology with r. Vol. 574. Springer.