Sunday 24 October 2010

Linear Models and Testing

Perhaps the best reason for using R for building complex statistical models is the strength of the linear model packages, which make building complex models reasonably straight forward, even if you have multiple groups. This is important following cluster analysis where you are unlikely to only end up with two clusters.

The package for constructing linear models from microarray data is Limma.

The following code constructs a model for three different clusters (the 0 cluster is used to label all the arrays that did not fall into one of the three identified clusters). First a design matrix is constructed which will define that comparisons will be carried out. The basic model is then fitted, then a second model with more complex interactions is designed using the contrast matrix. This model is then fitted using empirical Bayes (this is most effective at small sample size).

library(limma)
f <- factor(f2LCgcrma1$Cluster, levels =c("0", "1", "2", "3"))
design <-model.matrix(~0 + f)
colnames(design) <- c("cl0", "cl1", "cl2", "cl3")
fitgcrma <- lmFit(f2LCgcrma1, design)
contrast.matrix <- makeContrasts(cl3-cl2, cl3-cl1, cl2-cl1, levels=design)
fit2gcrma <- contrasts.fit(fitgcrma, contrast.matrix)
fit2gcrma <- eBayes(fit2gcrma)
tab1gcrma<-topTable(fit2gcrma, coef=1, adjust="BH")
tab2gcrma<-topTable(fit2gcrma, coef=2, adjust="BH")
tab3gcrma<-topTable(fit2gcrma, coef=3, adjust="BH")

No comments:

Post a Comment