New preprint: Computational gene prediction with genomic tiling microarray data

The genome of a higher organism is a complex entity. It comprises not only the genes it encodes but also many other contributing elements. Elucidating the function of these elements is a non-trivial task, and one that lends itself well to computational methods. Here we combine two methods of identifying these functional elements: computational gene prediction and transcription mapping with tiling microarrays. To do so, we develop a generalised hidden Markov model (GHMM) ab initio gene predictor and show that it performs comparably to other ab initio GHMM predictors. We then incorporate a correlation-based transcription mapping statistic into a GHMM gene model. This model can predict both protein-coding genes and non-protein-coding gene fragments from tiling array expression data and genomic sequence data, thus accommodating a broader and more realistic view of molecular biology.
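
The abstract does not spell out how the correlation-based statistic is computed, so the following is only an illustrative sketch under stated assumptions, not the method from the preprint. It assumes each tiling probe has an expression profile across several samples, and it scores a probe by its mean Pearson correlation with its immediate genomic neighbours, on the idea that probes within the same transcript should vary together. The names `pearson` and `neighbour_correlation` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def neighbour_correlation(probes):
    """Score each probe by its mean correlation with adjacent probes.

    `probes` is a list of expression profiles (one value per sample),
    ordered by genomic position. This scoring rule is an assumption made
    for illustration; the preprint's statistic may differ.
    """
    scores = []
    for i, profile in enumerate(probes):
        # Use the probe to the left and to the right, where they exist.
        neighbours = [probes[j] for j in (i - 1, i + 1) if 0 <= j < len(probes)]
        if not neighbours:
            scores.append(0.0)
            continue
        total = sum(pearson(profile, nb) for nb in neighbours)
        scores.append(total / len(neighbours))
    return scores
```

In a GHMM, a per-probe score of this kind could plausibly feed into the emission terms of transcribed states alongside the sequence-based scores, although how the preprint actually combines the two is not stated here.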

Presented at the Genome Informatics Workshops in Brisbane, QLD, 2008