Whole genome alignment

Pairwise whole-genome alignments can be used to identify conserved and divergent sequence between a pair of species, or to match up corresponding regions so that the same genomic region can be studied in multiple species.

LastZ [1] is used to align the genome sequences at the DNA level.

Genomes are aligned both to one another, for comparison between species, and to themselves, to identify paralogous regions. In self-alignments, the trivial alignments (i.e. each region aligned to itself) are removed, so that only the paralogous regions remain.
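
The removal of trivial self-alignments can be illustrated with a minimal Python sketch. It is not the Ensembl pipeline code: the Block record, its field names and the half-open coordinate convention are assumptions made for the example, and a block is treated as trivial only when it maps a region onto exactly the same coordinates of the same sequence on the forward strand.

```python
from typing import List, NamedTuple


class Block(NamedTuple):
    """Hypothetical alignment block with half-open [start, end) coordinates."""
    target: str
    t_start: int
    t_end: int
    query: str
    q_start: int
    q_end: int
    strand: str  # "+" or "-"


def is_trivial(block: Block) -> bool:
    """Return True when a block aligns a region to itself."""
    return (
        block.target == block.query
        and block.strand == "+"
        and block.t_start == block.q_start
        and block.t_end == block.q_end
    )


def remove_trivial(blocks: List[Block]) -> List[Block]:
    """Keep only the blocks that are not trivial self-matches."""
    return [b for b in blocks if not is_trivial(b)]


blocks = [
    Block("chr1", 100, 200, "chr1", 100, 200, "+"),    # region against itself
    Block("chr1", 100, 200, "chr1", 5000, 5100, "+"),  # candidate paralogous region
    Block("chr1", 300, 400, "chr1", 300, 400, "-"),    # reverse-strand match, kept
]
paralogous = remove_trivial(blocks)
```

In this toy input only the first block is discarded; the other two remain as candidate paralogous regions.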

The final whole-genome alignments are produced by post-processing the raw LastZ results. First, the original alignment blocks are chained according to their location in both genomes. Then the netting process chooses, for each region of the reference species, the best sub-chain [2].
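
The two post-processing steps can be sketched in Python under strong simplifying assumptions. This is a toy illustration, not the chaining and netting software described in [2]: real chaining also penalises the gaps between blocks, and real netting fills gaps hierarchically with lower-scoring chains. The names ScoredBlock, best_chain and net are hypothetical. Here best_chain finds the highest-scoring set of blocks that are in the same order in both genomes, and net assigns each reference region to the highest-scoring chain that covers it.

```python
from typing import List, NamedTuple, Tuple


class ScoredBlock(NamedTuple):
    """Hypothetical alignment block with half-open coordinates and a score."""
    t_start: int
    t_end: int
    q_start: int
    q_end: int
    score: int


def best_chain(blocks: List[ScoredBlock]) -> List[ScoredBlock]:
    """Highest-scoring collinear chain, by O(n^2) dynamic programming."""
    ordered = sorted(blocks, key=lambda b: (b.t_start, b.q_start))
    if not ordered:
        return []
    best = [b.score for b in ordered]  # best chain score ending at each block
    prev = [-1] * len(ordered)         # back-pointers for the traceback
    for i, cur in enumerate(ordered):
        for j in range(i):
            p = ordered[j]
            # p may precede cur only if it ends before cur in both genomes
            if p.t_end <= cur.t_start and p.q_end <= cur.q_start:
                if best[j] + cur.score > best[i]:
                    best[i] = best[j] + cur.score
                    prev[i] = j
    i = max(range(len(ordered)), key=best.__getitem__)
    chain = []
    while i != -1:
        chain.append(ordered[i])
        i = prev[i]
    return chain[::-1]


def net(chains: List[Tuple[int, int, int, str]]) -> List[Tuple[int, int, str]]:
    """Assign each reference base to at most one chain, best score first.

    Each chain is (score, t_start, t_end, label) on the reference genome.
    """
    covered: List[Tuple[int, int, str]] = []
    for _score, start, end, label in sorted(chains, key=lambda c: -c[0]):
        pieces = [(start, end)]
        for c_start, c_end, _ in covered:
            remaining = []
            for s, e in pieces:
                if c_end <= s or e <= c_start:
                    remaining.append((s, e))  # no overlap, keep whole piece
                else:
                    if s < c_start:
                        remaining.append((s, c_start))
                    if c_end < e:
                        remaining.append((c_end, e))
            pieces = remaining
        covered.extend((s, e, label) for s, e in pieces)
    return sorted(covered)


blocks = [
    ScoredBlock(0, 10, 0, 10, 50),
    ScoredBlock(20, 30, 20, 30, 40),
    ScoredBlock(15, 25, 100, 110, 30),
    ScoredBlock(40, 50, 5, 15, 60),
]
chain = best_chain(blocks)  # the first two blocks, total score 90

chains = [(900, 0, 100, "A"), (500, 50, 150, "B"), (100, 20, 60, "C")]
netted = net(chains)  # [(0, 100, "A"), (100, 150, "B")]
```

In the netting example, chain C is dropped entirely because the reference region it covers is already claimed by the higher-scoring chain A, while chain B keeps only the part of the reference that A does not cover.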

The resultant LastZ-net alignments are displayed in Ensembl Genomes for selected fungal, metazoan, protist and plant species.

These alignments are used to calculate synteny and for scoring orthologue quality.

References

  1. Improved pairwise alignment of genomic DNA. Harris RS. 2007. Ph.D. Thesis, The Pennsylvania State University.
  2. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Kent WJ et al. 2003. PNAS. 100(20):11484-9.