Another possibility is to use a pipeline installed on a local machine. The weakness of this approach is that you need to keep the rRNA databases up to date to get the best results. Possible locally installed pipelines are Pangaea and QIIME (pronounced "chime").
An initial challenge can be that the files you get from next-generation sequencing might not be compatible with the pipelines, either because they use a different quality score encoding from the one the pipeline expects, or because the pipelines cannot read fastq files.
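If you are not sure which quality encoding a file uses, a rough check is to look at the lowest character in the quality strings. The sketch below is only a heuristic and not part of any pipeline. The function name guess_phred_offset is made up for illustration. It assumes the file uses either an ASCII offset of 33 or one of the older offset-64 encodings.

```python
def guess_phred_offset(quality_strings):
    """Return 33 or 64 as a best guess, based on the lowest character seen.

    Hypothetical helper for illustration; raises ValueError on empty input.
    """
    lowest = min(ord(ch) for q in quality_strings for ch in q)
    # Characters below ';' (ASCII 59) cannot occur in the offset-64 encodings,
    # so seeing one means the file must be using offset 33.
    return 33 if lowest < 59 else 64


# Example quality strings (invented data, not from a real run)
offset_a = guess_phred_offset(["II5!", "IIII"])
offset_b = guess_phred_offset(["hhhh", "ffff"])
```

An answer of 64 is only a guess: a file of uniformly high-quality reads with offset 33 can look the same, so check against what your sequencer documents.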
Transforming Fastq Files
One tool for transforming fastq files is the FASTX-Toolkit. A simpler and faster method, though with much less functionality, is to use Biopython (you need to install it first, but that is relatively simple on Linux systems like Ubuntu). If you use the FASTX-Toolkit with Ion Torrent data, do not forget to include the -Q 33 flag, which tells the tools that the quality string uses an ASCII offset of 33 rather than the toolkit's default.
The Biopython code for creating a fasta file from a fastq file is:
from Bio import SeqIO
SeqIO.convert("filename.fastq", "fastq", "output.fasta", "fasta")
To create the accompanying quality file the command is:
SeqIO.convert("filename.fastq", "fastq", "output.qual", "qual")
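To see what these two conversions actually produce, here is a minimal sketch that does the same job with only the Python standard library. It is not how Biopython implements it. It assumes simple four-line fastq records and an ASCII offset of 33. The function names and the example read are invented for illustration.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality_string) from a 4-line-per-record fastq."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            break
        seq = handle.readline().rstrip()
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip()
        yield header[1:], seq, qual  # drop the leading '@'


def fastq_to_fasta_and_qual(fastq, fasta_out, qual_out, offset=33):
    """Write fasta and qual text for every record; return the record count."""
    count = 0
    for name, seq, qual in read_fastq(fastq):
        # Each quality character encodes a score as chr(score + offset)
        scores = [ord(ch) - offset for ch in qual]
        fasta_out.write(f">{name}\n{seq}\n")
        qual_out.write(f">{name}\n{' '.join(map(str, scores))}\n")
        count += 1
    return count


# In-memory example with one invented read
fastq = io.StringIO("@read1\nACGT\n+\nII5!\n")
fasta, qual = io.StringIO(), io.StringIO()
n = fastq_to_fasta_and_qual(fastq, fasta, qual)
```

For real data, pass open file handles instead of the StringIO objects. Biopython is still the safer choice because it also handles the other quality encodings and validates the records.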