Contributed Papers: Oral Presentations
Genomics and Post-genomics
HAPPY mapping – a tool for genome finishing
Alan T. Bankier, Helen F. Spriggs, Bernard A. Konfortov,
Justin A. Pachebat and Paul H. Dear
The Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH
ABSTRACT
The majority of small genome
sequences, including those of most parasites, are
assembled using a shotgun strategy. However, shotgun
methods alone are rarely capable of producing
a complete and finished genome sequence: cloning biases,
sequencing problems and repetitive regions leave many
gaps and potential mis-assemblies. For these reasons,
a genome that has only been shotgunned is left
as a large number of unconnected contigs.
A collection of contigs is sufficient for finding
many of the genes in the organism, but is inadequate
for other purposes. No reliable estimate of the degree
of completeness of the sequence can be made, and many
genes lying at (or beyond) contig ends will have been
missed or mis-predicted. It becomes impossible to
comprehensively catalogue the organism's genes, or
to infer the absence of a specific orthologue simply
from its absence in the contig set. Nor can any conclusions
about genome organisation or long-range synteny be
drawn. In short, a shotgun project tends to produce
a pile of unbound pages rather than a genomic atlas.
Genome finishing (ordering and joining contigs to
produce a complete sequence) is therefore one of the
most challenging yet most important aspects of sequencing.
Finishing is often frustrating because the resources used
for it (additional sequencing templates and larger-insert
clones) are essentially similar to those used in the
shotgun, and tend to have the same limitations.
HAPPY mapping is a rapid in vitro technique that can
be used to make extremely accurate maps, which allow
shotgun contigs to be located precisely in the genome.
Once this has been done, it becomes far simpler to
close the remaining sequence gaps. Even where gaps
cannot be closed (for example, because a short region
is unsequenceable), the map provides a framework within
which the sizes and locations of the gaps are known.
HAPPY mapping works by the direct analysis of genomic
DNA, and does not involve cloning or the creation
of any prior resources. Its freedom from 'biological'
steps means that it is applicable to any genome, and
is not adversely influenced by 'difficult' sequences,
repetitive regions or other peculiarities of the genome.
The cost and time required to make an accurate HAPPY
map of a genome are normally a fraction of those expended
on shotgun sequencing, and the map enables rapid completion
of the genome sequence.
The presentation will give details of HAPPY mapping,
and of how it has been applied successfully to a range
of genomes including Dictyostelium, Cryptosporidium,
Eimeria and others.