Data Deluge: Normalisation - Detailed or Quick and Dirty

Tuesday, 19 October 2010

Normalisation - Detailed or Quick and Dirty

Which normalisation method is best is a perennial question. There is no firm rule, and different reviews have come down in favour of different methods. For Affymetrix arrays, though, MAS 5.0 is definitely not the best choice.

The problem is that there are hundreds of ways of normalising data, and the real proof only comes at the end of the analysis, when you either find interesting biology or you don't. So it is probably best to run a quick and simple normalisation such as rma and then focus on the later stages of the analysis.

In this case I have two sets of raw Affymetrix data, raw and raw1, and I processed each of them with rma, gcrma and farms to get six normalised datasets.

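library("affy")   # rma()
library("gcrma")  # gcrma()
library("farms")  # q.farms()
# raw and raw1 are assumed to be AffyBatch objects, e.g. read in with ReadAffy()
LCrma <- rma(raw)
LCrma1 <- rma(raw1)
LCgcrma <- gcrma(raw)
LCgcrma1 <- gcrma(raw1)
LCfarms <- q.farms(raw)
LCfarms1 <- q.farms(raw1)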
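To get a feel for what the "quick and simple" option is doing, here is a rough sketch of quantile normalisation, which is one of the steps inside rma (alongside background correction and summarisation). This is not the code the affy package runs: the function quantile_normalise and the simulated matrix sim are made up for illustration, and ties are broken by order of appearance rather than handled the way a real implementation would.

```r
# Simulated log2 intensities: 1000 probes (rows) on 4 arrays (columns).
# Hypothetical data, with a deliberate per-array shift to normalise away.
set.seed(42)
sim <- matrix(rnorm(1000 * 4, mean = 7, sd = 2), nrow = 1000,
              dimnames = list(NULL, paste0("array", 1:4)))
sim <- sweep(sim, 2, c(0, 0.5, 1, 2), "+")

# Simplified quantile normalisation (illustrative helper, not from any package)
quantile_normalise <- function(x) {
  # Rank of each value within its own array
  ranks <- apply(x, 2, rank, ties.method = "first")
  # Reference distribution: mean of the k-th smallest value across arrays
  reference <- rowMeans(apply(x, 2, sort))
  # Replace each value by the reference value at its rank
  out <- apply(ranks, 2, function(r) reference[r])
  dimnames(out) <- dimnames(x)
  out
}

normalised <- quantile_normalise(sim)

# Before: array medians differ. After: every array has the same distribution.
print(apply(sim, 2, median))
print(apply(normalised, 2, median))
```

After this step every array has exactly the same set of values, just assigned to different probes, which is why boxplots of rma output tend to line up so neatly.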

You can then look at histograms and boxplots of the normalised data to see how well the normalisation has performed.

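# Extract the expression matrix (probe sets in rows, arrays in columns)
eLCfarms1 <- exprs(LCfarms1)
# Histogram of all expression values, pooled across arrays
hist(eLCfarms1)
# One box per array; after a good normalisation the boxes should look alike
boxplot(eLCfarms1, outline = FALSE, col = "lightblue")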
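With six datasets it can be easier to put a number on what the boxplots show than to compare them by eye. The sketch below summarises each array by its median and interquartile range and reports how far apart the array medians are within each dataset. The matrices here are simulated stand-ins for the output of exprs(), and the names sim_matrix, array_summary and datasets are all invented for the example.

```r
# Simulated stand-ins for exprs() output: 1000 probe sets on 4 arrays,
# with an optional per-array shift. Hypothetical data, not real arrays.
set.seed(1)
sim_matrix <- function(shift) {
  m <- matrix(rnorm(1000 * 4, mean = 7, sd = 2), nrow = 1000,
              dimnames = list(NULL, paste0("array", 1:4)))
  sweep(m, 2, shift, "+")
}

datasets <- list(
  well_normalised   = sim_matrix(c(0, 0, 0, 0)),
  poorly_normalised = sim_matrix(c(0, 0.5, 1, 2))
)

# Per-array median and IQR, i.e. the centre and height of each box
array_summary <- function(m) {
  data.frame(array  = colnames(m),
             median = apply(m, 2, median),
             iqr    = apply(m, 2, IQR),
             row.names = NULL)
}

summaries <- lapply(datasets, array_summary)

# Range of the array medians within each dataset; smaller means more alike
median_spread <- sapply(summaries, function(s) diff(range(s$median)))

print(summaries)
print(median_spread)
```

With the real data you would build the list from the six normalised objects instead, for example list(rma = exprs(LCrma), farms = exprs(LCfarms)) and so on. A small spread only says the arrays are comparable; it says nothing about whether the biology has survived, which is still only settled later in the analysis.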
