GenomeThreader Gene Prediction Software
GenomeThreader is a software tool to compute gene structure predictions.
The gene structure predictions are calculated using a similarity-based approach
where additional cDNA/EST and/or protein sequences are used to predict gene
structures via spliced alignments.
The development of GenomeThreader was motivated by severe limitations in
GeneSeqer, a popular gene prediction program that is widely used
for plant genome annotation.
Features
-
Intron Cutout Technique:
The intron cutout technique overcomes the time and space
limitations of the dynamic programming (DP) algorithms used in
GeneSeqer,
in particular when applied to organisms containing long introns.
-
Bayesian Splice Site Models (BSSMs):
With BSSMs it is possible to assign probabilities to GT donor, GC donor,
and AG acceptor sites. This information is used in the DP to determine the
exact exon/intron boundaries.
-
Combination of cDNA/EST Based Spliced Alignments with Protein Based Spliced
Alignments:
After computing spliced alignments of the supplied cDNAs/ESTs and protein
sequences against the genomic template, GenomeThreader computes consensus spliced
alignments. Consensus spliced alignments combine several spliced alignments
to resolve the complete gene structure and to uncover alternative splicing.
-
Incremental Updates:
When the cDNA/EST or protein database in use is updated, the common approach
has been to redo the complete mapping. With GenomeThreader, you can combine
newly computed spliced alignments with precomputed spliced alignments to
quickly recompute consensus spliced alignments.
-
XML:
The additional GenomeThreader XML output conforms to our gthXML
standard (GenomeThreader.rng.txt). With
the included script XML2GFF.py, it is possible to convert gthXML output to the
GFF format.
A variety of gthXML-specific tools can be found
here.
-
gthDB:
We also provide
a schema and load script for gthDB, which permits storage
and query of GenomeThreader output in a relational format.
References have been omitted for brevity; you can find them and more details on
the implementation in the GenomeThreader
paper.
How to take advantage of these features and many more is described in depth in
the GenomeThreader manual.
Please consult the FAQ page for frequently asked
questions.
All mentioned files and scripts are also part of the GenomeThreader
distribution (see below).
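To make the idea behind the intron cutout technique listed above more concrete, the following is a minimal sketch in Python. It is not GenomeThreader's implementation. It assumes that seed matches between a cDNA/EST and the genomic template are already known, removes long genomic stretches between consecutive seeds so that a DP would only have to process a much shorter template, and maps positions on the shortened template back to genomic coordinates. The function names (cut_out_introns, to_genomic) and the parameters (min_cut, flank) are hypothetical and chosen for illustration only.

```python
def cut_out_introns(genome, seeds, min_cut=50, flank=5):
    """Shorten a genomic template by removing long gaps between seed matches.

    seeds: sorted, non-overlapping half-open (start, end) intervals on
           `genome` that are covered by seed matches (assumed to be given).
    min_cut: gaps between consecutive seeds longer than this are cut out.
    flank: number of bases kept on each side of a cut-out gap.
    Returns (shortened_sequence, kept), where kept lists the genomic
    (start, end) segments that make up the shortened sequence.
    """
    if not seeds:
        raise ValueError("at least one seed match is required")
    if min_cut < 2 * flank:
        raise ValueError("min_cut must be at least 2 * flank")
    kept = []
    seg_start = max(0, seeds[0][0] - flank)
    for (_, end0), (start1, _) in zip(seeds, seeds[1:]):
        if start1 - end0 > min_cut:
            # Close the current segment and start a new one after the gap.
            kept.append((seg_start, end0 + flank))
            seg_start = start1 - flank
    kept.append((seg_start, min(len(genome), seeds[-1][1] + flank)))
    shortened = "".join(genome[a:b] for a, b in kept)
    return shortened, kept


def to_genomic(pos, kept):
    """Map a position on the shortened sequence back to the genome."""
    offset = 0
    for start, end in kept:
        length = end - start
        if pos < offset + length:
            return start + (pos - offset)
        offset += length
    raise ValueError("position lies outside the shortened sequence")


if __name__ == "__main__":
    # Toy template: two short "exons" separated by a 100-base gap.
    genome = "A" * 10 + "CCCCC" + "G" * 100 + "TTTTT" + "A" * 10
    shortened, kept = cut_out_introns(
        genome, [(10, 15), (115, 120)], min_cut=20, flank=2
    )
    print(len(genome), len(shortened), kept)
```

In this toy setting the DP cost would depend on the length of the shortened template rather than on the full genomic span, which is why such an approach helps most when introns are long.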
Availability
GenomeThreader is available free of charge for non-commercial research
institutions. To obtain a copy, please fill out the
license agreement and send it to us by fax or
email (as described in the document). As soon as we receive your license
agreement, we will send you a username and password by email, which you can use to
download GenomeThreader.
Examples
-
Evaluation cases described in Gremme et
al. 2005 (see below)
-
A 16.6Kb rice gene structure tractable with GenomeThreader (both
with the intron cutout technique
and without), but beyond
GeneSeqer's limits.
-
A 125Kb intron-containing human
gene structure.
-
Small samples of gzip'ed
plain text and
XML
GenomeThreader output.
Users
The following sites use GenomeThreader. This list is not intended to be
comprehensive.
If you want to appear on this list, please drop me a
note.
Citations
GenomeThreader has been cited in the following publications:
-
J.M. Cock et al.
The Ectocarpus genome and the independent evolution of multicellularity in brown algae. Nature,
465:617-621, 2010 (in Supplementary Information)
-
R. Wang et al.
PEP1 regulates
perennial flowering in Arabis alpina. Nature,
459:423-427, 2009
-
M. Calviño, R. Bruggmann and J. Messing.
Screen of
genes linked to high-sugar content in stems by comparative genomics.
Rice, 1(2):166-176, 2008
-
M.E. Sparks and V. Brendel.
MetWAMer: eukaryotic translation initiation site prediction.
BMC Bioinformatics, 9:381, 2008
-
P. Abad et al.
Genome
sequence of the metazoan plant-parasitic nematode Meloidogyne
incognita. Nature Biotechnology, 26:909-915, 2008
-
Q. Dong, M.D. Wilkerson and V. Brendel.
Tracembler - software for in-silico chromosome walking in
unassembled genomes. BMC Bioinformatics, 8:151, 2007
-
A. Ballvora et al.
Comparative sequence analysis of Solanum and Arabidopsis in a hot spot for
pathogen resistance on potato chromosome V reveals a patchwork of conserved
and rapidly evolving genome segments.
BMC Genomics, 8:112, 2007
-
R. Bruggmann et al.
Uneven chromosome contraction and expansion in the maize genome.
Genome Research, 16:1241-1251, 2006
If I missed a publication which cites GenomeThreader, please drop me a
note.
Developers
GenomeThreader is being actively developed by the following individuals:
Publications
Please cite the following article in publications about research using
GenomeThreader: