Trim and Filter Reads
Data received from sequencing facilities may still contain sequencing artefacts, which need to be removed, and some reads may need to be filtered out entirely. For example, adapter sequences left over from library preparation are often present at the 3’ end of reads. These adapter bases need to be removed, and low-quality bases need to be trimmed off. Any of the programs listed below can be used for this. Bokulich et al. 2013 [1] recommend a minimum Phred quality score of 3 for trimming low-quality bases at the ends of reads. Jeraldo et al. 2014 recommend trimming the 3’ end of reads using a moving-average quality score of 15 over a window of 4 bases, and removing any read shorter than 75% of the original read length. It is also recommended that reads containing ambiguous bases (N) be discarded.
Software: Trimmomatic, PRINSEQ, SolexaQA
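To make the trimming rules above concrete, here is a minimal Python sketch of the moving-average approach (window of 4 bases, mean quality of 15, minimum length of 75% of the original read, no ambiguous bases). It is an illustration only, not how any of the listed programs is implemented. The function names are hypothetical, Phred+33 quality encoding is assumed, the window is assumed to scan from the 5’ end and cut at the first failing window, and the check for N is assumed to run on the untrimmed read.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def sliding_window_trim(seq, quals, window=4, min_mean=15):
    """Scan from the 5' end and cut the read at the start of the first
    window whose mean quality falls below min_mean (assumed behaviour)."""
    for start in range(0, len(seq) - window + 1):
        window_quals = quals[start:start + window]
        if sum(window_quals) / window < min_mean:
            return seq[:start], quals[:start]
    return seq, quals


def filter_read(seq, quality_string, window=4, min_mean=15, min_fraction=0.75):
    """Return the trimmed (sequence, scores) pair, or None if the read is discarded."""
    # Discard reads containing ambiguous bases (checked before trimming: an assumption).
    if "N" in seq.upper():
        return None
    quals = phred_scores(quality_string)
    trimmed_seq, trimmed_quals = sliding_window_trim(seq, quals, window, min_mean)
    # Discard reads shorter than min_fraction of the original read length.
    if len(trimmed_seq) < min_fraction * len(seq):
        return None
    return trimmed_seq, trimmed_quals


if __name__ == "__main__":
    # 20-base read: 16 high-quality bases ('I' = Q40) then 4 low-quality bases ('#' = Q2).
    result = filter_read("ACGTACGTACGTACGTACGT", "I" * 16 + "#" * 4)
    print(result[0])  # ACGTACGTACGTACG
```

In this example the read is cut to 15 bases, which is exactly 75% of the original 20, so it is kept. A read with a longer low-quality tail, or one containing an N, would return None. For real data, the dedicated tools listed above are the better choice, since they also handle adapter removal and paired-end reads.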
Bibliography
[1] Bokulich, Nicholas A., et al. “Quality-filtering vastly improves diversity estimates from Illumina amplicon sequencing.” Nature Methods 10.1 (2013): 57.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.