This model is important because it has been widely applied to high-throughput techniques: not only microarrays and transcriptomics but also proteomics. The most cited paper is:
Bolstad BM, Irizarry RA, Astrand M, and Speed T. (2003) A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 19(2) 185-193.
This paper describes quantile normalisation and uses spike-in data to evaluate the effects of the normalisation process. If normalisation has not distorted the relationship between concentration and signal, then a plot of log expression level against log spiked-in concentration should have a slope of 1. The linear model that is fitted is:
log2(E)= β0 + β1 log2(c) + ε
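Here E is the observed expression level, c is the spiked-in concentration, β0 is the intercept, β1 is the slope and ε is the error term. Note that the model is linear on the log scale (a log-log model), so the errors in the untransformed data are not additive but multiplicative: back-transforming gives E = 2^β0 × c^β1 × 2^ε.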
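As a rough illustration of how the slope β1 might be estimated, here is a minimal Python sketch that fits the model above by ordinary least squares on the log2 scale. The function name `fit_spike_in_slope`, the concentration series and the simulated expression values are all assumptions made up for this example; they are not data or code from the Bolstad et al. paper.

```python
import numpy as np


def fit_spike_in_slope(conc, expr):
    """Fit log2(E) = b0 + b1 * log2(c) by least squares and return (b0, b1)."""
    x = np.log2(np.asarray(conc, dtype=float))
    y = np.log2(np.asarray(expr, dtype=float))
    # np.polyfit returns coefficients from highest degree down: slope first.
    b1, b0 = np.polyfit(x, y, 1)
    return b0, b1


# Hypothetical spike-in concentrations (arbitrary units), doubling at each step.
conc = 2.0 ** np.arange(-2, 11)

# Simulated expression values with a true slope of 1 and multiplicative noise.
# The noise is additive on the log2 scale, as the model assumes.
rng = np.random.default_rng(0)
true_b0, true_b1 = 5.0, 1.0
eps = rng.normal(0.0, 0.1, size=conc.size)
expr = 2.0 ** (true_b0 + true_b1 * np.log2(conc) + eps)

b0, b1 = fit_spike_in_slope(conc, expr)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
```

With real spike-in data, a fitted slope noticeably below 1 after normalisation would suggest that the signal has been compressed relative to the known concentrations.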