Modelling and Removal of "Reflective" Phenomena in Electrophoresis Sequencing Data

C. Domnişoru (USA)


Base Calling, DNA sequencing, Cross-talk filtering, Algorithm, Base spacing model.


Base calling is an important part of data processing for four-dye, fluorescence-based DNA sequencing. Existing techniques either rely on ABI-filtered data or apply a similar succession of preprocessing steps. In this paper we present a new processing step aimed at improving the overall accuracy of the data. We observed that in some data files each peak is preceded by a smaller, similar peak and followed by an even smaller peak formation. We describe this "reflective" effect and the approach we propose to compensate for it. The new processing step has been incorporated into our base calling algorithm, which is oriented toward preserving the information contained in the raw data and avoiding traditional filtering techniques. A comparison with the ABI base calling results and with phred is also presented and shows promising results.
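
The effect described above can be illustrated with a small sketch. The abstract does not specify the model used in the paper, so everything here is an assumption made for illustration: the observed trace is taken to be the true trace plus a scaled copy arriving `d_pre` samples early and a smaller scaled copy arriving `d_post` samples late, and the function names, amplitudes, and offsets are hypothetical. Under that assumption, and with known parameters whose amplitudes sum to less than 1, a simple fixed-point subtraction recovers the true trace.

```python
import math


def gaussian_peak(n, center, height, width):
    """Value at sample n of a Gaussian-shaped peak (a stand-in for a dye peak)."""
    return height * math.exp(-0.5 * ((n - center) / width) ** 2)


def shifted(signal, shift):
    """Delay `signal` by `shift` samples (negative shift = advance), zero-padded."""
    size = len(signal)
    out = [0.0] * size
    for i in range(size):
        j = i - shift
        if 0 <= j < size:
            out[i] = signal[j]
    return out


def add_ghosts(true_signal, a_pre, d_pre, a_post, d_post):
    """Assumed forward model: true trace + early ghost + smaller late ghost."""
    pre = shifted(true_signal, -d_pre)   # ghost appears d_pre samples earlier
    post = shifted(true_signal, d_post)  # ghost appears d_post samples later
    return [t + a_pre * p + a_post * q
            for t, p, q in zip(true_signal, pre, post)]


def remove_ghosts(observed, a_pre, d_pre, a_post, d_post, iterations=30):
    """Fixed-point iteration x <- y - a_pre*x(n+d_pre) - a_post*x(n-d_post).

    Converges when a_pre + a_post < 1; the ghost parameters are assumed known.
    """
    estimate = list(observed)
    for _ in range(iterations):
        pre = shifted(estimate, -d_pre)
        post = shifted(estimate, d_post)
        estimate = [o - a_pre * p - a_post * q
                    for o, p, q in zip(observed, pre, post)]
    return estimate


if __name__ == "__main__":
    # Synthetic single-channel trace with two peaks (illustrative values only).
    true_trace = [gaussian_peak(n, 40, 100.0, 2.0) + gaussian_peak(n, 70, 80.0, 2.0)
                  for n in range(120)]
    observed = add_ghosts(true_trace, a_pre=0.3, d_pre=6, a_post=0.15, d_post=6)
    cleaned = remove_ghosts(observed, a_pre=0.3, d_pre=6, a_post=0.15, d_post=6)
    worst = max(abs(c - t) for c, t in zip(cleaned, true_trace))
    print("largest residual after ghost removal:", worst)
```

In real data the ghost amplitudes and offsets would have to be estimated from the trace itself; the sketch only shows that, once such a model is fixed, the compensation can work directly on the raw samples without a conventional smoothing filter.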

