Aedes aegypti Assembly and Gene Annotation
About Aedes aegypti
Aedes aegypti exists in at least two forms, treated by some authors as subspecies and by others as separate species: Ae. aegypti formosus (the original wild type found in Africa) and Ae. aegypti aegypti (the worldwide urban form). The yellow fever mosquito, Ae. aegypti aegypti, is distributed throughout the tropics and subtropics, where it is the main vector of both dengue and yellow fever viruses.
Picture credit (public domain): James Gathany (CDC) 2006
The Aedes aegypti Liverpool LVP strain genome sequence is a joint effort between the Broad Institute and The Institute for Genomic Research (TIGR). Assembly of 8x shotgun coverage was performed using the Broad's whole genome assembly package ARACHNE. The assembly presented here (AaegL1, August 2005) consists of 4,758 supercontigs, totalling 1.3 gigabases, with a contig N50 of 82 Kb and supercontig N50 size of 1,500 Kb. (The supercontig "supercont1.3491" is not in the INSDC record of the assembly, but is displayed in Ensembl Metazoa because genes have been annotated in this region.)
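The N50 figures quoted above can be illustrated with a minimal sketch, assuming the common definition of N50: the length L such that sequences of length L or longer together cover at least half of the assembled bases. The function name `n50` and the toy lengths are hypothetical and are not taken from the AaegL1 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Assumed definition: sort lengths from longest to shortest and
    return the length at which the running total first reaches
    half of the summed length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (hypothetical values, for illustration only).
toy_contigs = [100, 80, 60, 40, 20]
print(n50(toy_contigs))  # 80: 100 + 80 = 180 reaches half of 300
```

Applied to the real contig or supercontig lengths of an assembly, the same calculation would give figures comparable to the 82 Kb and 1,500 Kb values reported here.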
The initial annotation of the Aedes aegypti genome is a collaboration between VectorBase and TIGR. Each group generated a set of gene predictions, and the two sets were merged into a single canonical set (AaegL1.1). The geneset presented here (AaegL1.4, June 2013) is the original set updated by three rounds of integration of community annotations, plus additional gene predictions that were excluded from the AaegL1.1 set but were subsequently found to have supporting evidence (transcriptome data, mapped protein domains).
EST and Protein Alignments
WU-BlastX was used to map UniProtKB proteins onto the Aedes aegypti genome. The datasets used were: Aedes, mosquito, drosophilid, arthropod, metazoan, eukaryotic, and non-eukaryotic proteins. The wider taxonomic groups exclude any of the more specific groups, e.g. the arthropod dataset excludes mosquito and drosophilid proteins. (Example: supercont1.174:147000-580000).
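The mutually exclusive datasets described above can be sketched as follows. This is only an illustration of the exclusion rule, not the pipeline actually used: it assumes each protein carries a set of group labels, and the names `TIERS` and `assign_dataset` as well as the example label sets are hypothetical.

```python
# Datasets ordered from most specific to most general, following the text.
TIERS = [
    "Aedes",
    "mosquito",
    "drosophilid",
    "arthropod",
    "metazoan",
    "eukaryotic",
    "non-eukaryotic",
]


def assign_dataset(memberships):
    """Return the most specific dataset a protein belongs to.

    `memberships` is the set of group labels that apply to the protein
    (an assumption made for this sketch). Because the first matching
    tier wins, a wider dataset never contains proteins that already
    fall into a more specific one.
    """
    for tier in TIERS:
        if tier in memberships:
            return tier
    raise ValueError(f"no known dataset in {sorted(memberships)}")


# Hypothetical label sets for three proteins.
print(assign_dataset({"mosquito", "arthropod", "metazoan", "eukaryotic"}))  # mosquito
print(assign_dataset({"arthropod", "metazoan", "eukaryotic"}))              # arthropod
print(assign_dataset({"non-eukaryotic"}))                                   # non-eukaryotic
```

Ordering the tiers from specific to general keeps the rule to a single pass, and a protein is aligned as part of exactly one dataset.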
Approximately 1300 community annotations are mapped to the genome (Example: supercont1.174:147000-580000).
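The example location uses a `name:start-end` form. A minimal sketch for reading such a string is shown below, assuming 1-based, inclusive coordinates; the names `Region` and `parse_region` are hypothetical helpers and not part of any Ensembl or VectorBase API.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    """A sequence region with assumed 1-based, inclusive coordinates."""

    seq_name: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Inclusive coordinates, so both end points are counted.
        return self.end - self.start + 1


_REGION_RE = re.compile(r"^(?P<name>[^:]+):(?P<start>\d+)-(?P<end>\d+)$")


def parse_region(text: str) -> Region:
    """Parse a 'name:start-end' string into a Region."""
    match = _REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a name:start-end region: {text!r}")
    start, end = int(match["start"]), int(match["end"])
    if start > end:
        raise ValueError(f"start is greater than end in {text!r}")
    return Region(match["name"], start, end)


region = parse_region("supercont1.174:147000-580000")
print(region.seq_name, region.length)  # supercont1.174 433001
```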
References
- Genome sequence of Aedes aegypti, a major arbovirus vector. Nene V, Wortman JR, Lawson D, Haas B, Kodira C, Tu ZJ, Loftus B, Xi Z, Megy K, Grabherr M et al. 2007. Science 316:1718-1723.
| Assembly | AaegL1, INSDC Assembly GCA_000004015.1, Oct 2005 |
| Golden Path Length (bp) | 1,383,971,543 |
| Genebuild method | Full genebuild |
| Short non-coding genes | 1,358 |