Published January 1, 2020 | Version v1
Journal article
Open
Apollo: a sequencing-technology-independent, scalable and accurate assembly polishing algorithm
Creators
- 1. Swiss Fed Inst Technol, Dept Comp Sci, CH-8092 Zurich, Switzerland
- 2. Carnegie Mellon Univ, Dept Elect & Comp Engn, Pittsburgh, PA 15213 USA
- 3. Bilkent Univ, Dept Comp Engn, TR-06800 Ankara, Turkey
Description
Motivation: Third-generation sequencing technologies can sequence long reads that contain as many as 2 million base pairs. These long reads are used to construct an assembly (i.e. the subject's genome), which is then used in downstream genome analysis. Unfortunately, third-generation sequencing technologies have high sequencing error rates, and a large proportion of the base pairs in these long reads are incorrectly identified. These errors propagate to the assembly and affect the accuracy of genome analysis. Assembly polishing algorithms minimize such error propagation by correcting errors in the assembly using information from alignments between the reads and the assembly (i.e. read-to-assembly alignment information). However, current assembly polishing algorithms can polish an assembly only with reads from a specific sequencing technology, or can polish only a small assembly. This technology dependency and assembly-size dependency require researchers to (i) run multiple polishing algorithms in order to use all available read sets and (ii) split a large genome into small chunks in order to polish it.
Files
| Name | Size | Checksum |
|---|---|---|
| bib-5ed84abb-36e9-45e3-8f83-d2e799270d97.txt | 215 Bytes | md5:5d5189a4da9280808a77672d32751c93 |