Programming Research Group
Research Report RR-03-21
Finding transcription factor binding sites in DNA Sequences: A template based approach
Sumedha Gunewardena
Peter Jeavons
Revised October 2003, 11pp.
Abstract
A problem faced by many algorithms for finding transcription factor
binding sites is the high number of false-positive hits that accompanies
any increase in the sensitivity of their predictions. A main contributing
factor is the short and degenerate nature of these sites, which results in
a low signal-to-noise ratio. To counter this problem, one needs to
look beyond the base-independence assumption. We propose a model based on
templates designed to capture not only the vertical consensus but also the
correlation of individual bases with the other bases of the site.
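
To make the base-independence assumption concrete, the following is a minimal sketch in Python. It is not the template model proposed in the report. It contrasts a standard per-position (weight-matrix style) log-odds score with mutual information between two columns, one common way to measure inter-base correlation. The aligned sites in SITES, the function names, the pseudocount, and the uniform background are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical aligned binding sites (made-up data, for illustration only).
SITES = ["ACGT", "ACGT", "TGGT", "TGGT"]


def column_frequencies(sites, pseudocount=1.0):
    """Per-position base frequencies, i.e. the 'vertical consensus'."""
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    freqs = []
    for i in range(length):
        counts = Counter(site[i] for site in sites)
        freqs.append({b: (counts[b] + pseudocount) / total for b in "ACGT"})
    return freqs


def independent_log_odds(candidate, freqs, background=0.25):
    """Score a candidate assuming every position is independent of the others."""
    return sum(math.log2(freqs[i][b] / background)
               for i, b in enumerate(candidate))


def mutual_information(sites, i, j):
    """Mutual information (in bits) between columns i and j of the alignment."""
    n = len(sites)
    p_i = Counter(s[i] for s in sites)
    p_j = Counter(s[j] for s in sites)
    p_ij = Counter((s[i], s[j]) for s in sites)
    return sum((c / n) * math.log2((c / n) / ((p_i[a] / n) * (p_j[b] / n)))
               for (a, b), c in p_ij.items())


freqs = column_frequencies(SITES)
# "AGGT" never occurs in SITES, yet the independent score cannot tell it
# apart from the observed site "ACGT", because A and G are each common
# in their own columns.
observed = independent_log_odds("ACGT", freqs)
unobserved = independent_log_odds("AGGT", freqs)
# Columns 0 and 1 are perfectly correlated in this toy alignment.
dependence = mutual_information(SITES, 0, 1)
```

In this toy alignment the two candidates receive identical independent scores while the first two columns are strongly dependent, which is the kind of false positive that a model capturing inter-base correlation is meant to suppress.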
This paper is available as a 326,650-byte PostScript file.