Data Deluge: Final Stages of Using Qiime for Ion Torrent Data

Once you have the rarefaction curves within each sample (alpha diversity), you can calculate diversity between the bar-coded samples (beta diversity). This gives a measure of distance between the samples that can then be used for clustering.

beta_diversity_through_plots.py -i otus/otu_table.biom -m mapping.txt -o wf_bdiv_even4/ -t otus/rep_set.tre -e 4

The last term (-e) is the sampling depth: the number of reads drawn from every sample so that all samples are compared at the same depth. The "even" refers to this equal depth across samples; the number itself does not have to be an even number. Samples with fewer reads than the depth are dropped. You need to try a series of cut-offs to find out which works best, but keep it below the median read count. If you choose the median, samples with fewer reads than the median are dropped (you will lose half of your data, though if you are expecting a large number of blank results this might be OK), so only do this if the read counts are heavily skewed and the mean is much larger than the median.
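To see what a given cut-off costs you before running the full workflow, a minimal sketch like the one below can help. It is not part of Qiime; the read counts and the helper names (samples_retained, depth_report) are made up for illustration, and it assumes a sample is kept when it has at least as many reads as the chosen depth.

```python
from statistics import mean, median

def samples_retained(read_counts, depth):
    """Return the IDs of samples that have at least `depth` reads."""
    return [s for s, n in read_counts.items() if n >= depth]

def depth_report(read_counts, depths):
    """For each candidate depth, count how many samples would be kept."""
    return {d: len(samples_retained(read_counts, d)) for d in depths}

# Hypothetical per-sample read counts (heavily skewed by one deep sample).
read_counts = {"S1": 4, "S2": 12, "S3": 55, "S4": 60, "S5": 900}

counts = list(read_counts.values())
print("mean:", mean(counts), "median:", median(counts))
for depth, kept in depth_report(read_counts, [4, 12, 50, 60]).items():
    print(f"depth {depth}: {kept} of {len(read_counts)} samples kept")
```

In practice the per-sample counts would come from a summary of your own OTU table rather than a hand-typed dictionary.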

This gives a series of 2D and 3D principal coordinates (PCoA) plots that can be used to look for any clustering of the samples. Clusters should be visually identifiable and well defined.
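One rough way to back up the visual impression is to compare distances inside a group with distances between groups. The sketch below is only an illustration: the distance values, the group labels, and the helper within_between are hypothetical, and this check is not something the Qiime workflow does for you.

```python
from itertools import combinations

def within_between(dist, groups):
    """Mean distance for same-group pairs and for different-group pairs."""
    within, between = [], []
    for a, b in combinations(sorted(groups), 2):
        d = dist[frozenset((a, b))]
        (within if groups[a] == groups[b] else between).append(d)
    return sum(within) / len(within), sum(between) / len(between)

# Hypothetical group labels (as they might appear in a mapping file column).
groups = {"A1": "gut", "A2": "gut", "B1": "soil", "B2": "soil"}

# Hypothetical pairwise distances between the four samples.
dist = {
    frozenset(("A1", "A2")): 0.2,
    frozenset(("B1", "B2")): 0.3,
    frozenset(("A1", "B1")): 0.8,
    frozenset(("A1", "B2")): 0.9,
    frozenset(("A2", "B1")): 0.7,
    frozenset(("A2", "B2")): 0.8,
}

w, b = within_between(dist, groups)
print(f"mean within-group distance:  {w:.2f}")
print(f"mean between-group distance: {b:.2f}")
```

If the within-group mean is much smaller than the between-group mean, the clusters are likely to look well separated in the plots too.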

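Qiime can also carry out a jackknife analysis to test how much the clustering depends on which subset of the data is used. It repeatedly resamples a subset of the reads from each sample, at the depth given by -e, and rebuilds the clustering each time. This produces a tree of the samples (a clustering tree rather than a phylogeny of organisms) where like samples are on closely related nodes. Clusters are represented by clades (a group of leaves with a common root).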

jackknifed_beta_diversity.py -i otus/otu_table.biom -t otus/rep_set.tre -m mapping.txt -o wf_jack50 -e 50
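The idea behind the jackknife can be sketched in a few lines. This is a toy version under loud assumptions, not Qiime's implementation: it uses Bray-Curtis dissimilarity as a stand-in for UniFrac (which needs the tree), it checks nearest neighbours instead of building a tree, and the OTU table and function names are invented for the example.

```python
import random

def subsample(counts, depth, rng):
    """Draw `depth` reads without replacement from an OTU count vector."""
    pool = [otu for otu, n in enumerate(counts) for _ in range(n)]
    out = [0] * len(counts)
    for otu in rng.sample(pool, depth):
        out[otu] += 1
    return out

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors."""
    total = sum(a) + sum(b)
    return sum(abs(x - y) for x, y in zip(a, b)) / total if total else 0.0

def nearest(table, name):
    """Return the sample closest to `name` in the given table."""
    others = [s for s in table if s != name]
    return min(others, key=lambda s: bray_curtis(table[name], table[s]))

def jackknife_support(table, depth, replicates=100, seed=0):
    """Fraction of resampled replicates in which each sample keeps the
    same nearest neighbour as in the full data."""
    rng = random.Random(seed)
    reference = {s: nearest(table, s) for s in table}
    hits = dict.fromkeys(table, 0)
    for _ in range(replicates):
        sub = {s: subsample(c, depth, rng) for s, c in table.items()}
        for s in table:
            if nearest(sub, s) == reference[s]:
                hits[s] += 1
    return {s: hits[s] / replicates for s in table}

# Hypothetical OTU counts: two "A" samples and two "B" samples.
otu_table = {
    "A1": [50, 40, 5, 5],
    "A2": [45, 45, 6, 4],
    "B1": [5, 5, 50, 40],
    "B2": [4, 6, 45, 45],
}

support = jackknife_support(otu_table, depth=50)
print(support)
```

A support value near 1 means the grouping survives resampling; values that drop well below that suggest the cluster depends on a handful of reads.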

Finally, you can make biplots, which overlay the most abundant taxa on the same principal coordinate axes as the samples, so you can see which taxa are associated with each cluster.

make_3d_plots.py -i wf_bdiv_even4/unweighted_unifrac_pc.txt -m mapping.txt -t wf_taxa_summary/otu_table_L3.txt --n_taxa_keep 5 -o 3d_biplot

The 5 (--n_taxa_keep) tells the program how many of the most abundant taxa to display for comparison to the clusters.
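As a rough picture of what keeping the top taxa means, the sketch below ranks taxa by their mean relative abundance across samples and keeps the first few. The ranking rule is an assumption made for illustration (Qiime's exact criterion may differ), and the abundance values and the helper top_taxa are hypothetical.

```python
def top_taxa(table, n):
    """Return the n taxa with the highest mean abundance across samples."""
    ranked = sorted(
        table,
        key=lambda taxon: sum(table[taxon]) / len(table[taxon]),
        reverse=True,
    )
    return ranked[:n]

# Hypothetical relative abundances per taxon, one value per sample.
taxa_summary = {
    "Bacilli": [0.40, 0.35, 0.10],
    "Clostridia": [0.30, 0.25, 0.20],
    "Gammaproteobacteria": [0.10, 0.20, 0.50],
    "Actinobacteria": [0.15, 0.15, 0.15],
    "Bacteroidia": [0.05, 0.05, 0.05],
}

print(top_taxa(taxa_summary, 3))
```

Keeping the number small stops the biplot from being swamped by rare taxa that contribute little to the separation of the clusters.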

Friday, 13 September 2013
