Anopheles coluzzii (Mosquito, Mali-NIH) Assembly and Gene Annotation
About Anopheles coluzzii
Anopheles coluzzii, formerly known as Anopheles gambiae M molecular form, was defined as a separate species in 2013 (Coetzee et al.). An. coluzzii belongs to the Anopheles gambiae species complex, which consists of at least seven species.
Maureen Coetzee, Richard H. Hunt, Richard Wilkerson, Alessandra Della Torre, Mamadou B. Coulibaly & Nora J. Besansky. 2013. Anopheles coluzzii and Anopheles amharicus, new members of the Anopheles gambiae complex. Zootaxa 3619 (3): 246-274.
Mali-NIH strain
The Anopheles gambiae M Mali-NIH colony was established from blood-fed adult females collected resting inside houses in Niono, Mali, in November 2005. Approximately 80 isofemale families molecularly identified as An. gambiae M form were used to establish the colony, which was subsequently karyotyped as 2Rbc/bc; 2La/a. The An. gambiae M Mali-NIH colony is currently available from the BEI Resources web portal.
Source: VectorBase
AcolM1 assembly
An. coluzzii Mali-NIH, formerly Anopheles gambiae M form, was sequenced by the Washington University School of Medicine Genome Sequencing Center (Lawniczak et al. 2010, PMID: 20966253). DNA samples derived from whole mosquitoes were provided by the University of Notre Dame and MR4. BAC libraries were provided by the Clemson University Genomics Institute (CUGI) and are available through CUGI or MR4. Approximately 2.755 million M-form sequence reads (traces) were deposited in the NCBI Trace Archive. For the M project, 94% of the traces were from plasmids (5-6 kb inserts), 4% from fosmids (40 kb inserts), and 2% from BACs. Based on the source DNA of these libraries, 94% of reads were generated from males, resulting in lower X-chromosome sequence coverage relative to autosomes.
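As a rough illustration of the library composition quoted above, the following sketch converts the stated percentages into approximate read counts. It assumes that ~2.755 million is the total number of M-form reads and that the percentages apply to that total; the function and variable names are illustrative only and do not come from the sequencing project.

```python
# Approximate read counts per library type for the M-form project.
# Assumption: ~2.755 million is the total read count, and the stated
# percentages (94% plasmid, 4% fosmid, 2% BAC) apply to that total.
TOTAL_READS = 2_755_000

LIBRARY_FRACTIONS = {
    "plasmid (5-6 kb inserts)": 0.94,
    "fosmid (40 kb inserts)": 0.04,
    "BAC": 0.02,
}


def reads_per_library(total, fractions):
    """Return the approximate number of reads contributed by each library."""
    return {name: round(total * frac) for name, frac in fractions.items()}


if __name__ == "__main__":
    for name, count in reads_per_library(TOTAL_READS, LIBRARY_FRACTIONS).items():
        print(f"{name}: ~{count:,} reads")
```

These are order-of-magnitude estimates only, since the source gives the total and the percentages as rounded values.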
Whole genome shotgun (WGS) sequences were assembled de novo for each genome by both sequencing centers: at the Washington University Genome Sequencing Center (WUGSC) using the PCAP assembler, and at the J. Craig Venter Institute (JCVI) using the Celera assembler (http://wgs-assembler.sf.net). The WUGSC assemblies based on the original PCAP algorithm were nearly twice the expected ~260 Mb size. This outcome reflected a considerable number of high-quality base discrepancies (polymorphisms), owing to relatively high allelic variation in the non-isogenic genome samples. Although a modification of PCAP (Pcap.rep.poly) produced smaller assembled genome sizes, the Celera assembler algorithms, which were specifically developed to accommodate heterozygosity, gave improved assemblies. By mutual agreement, the JCVI assemblies (available via GenBank accessions ABKP00000000 and ABKQ00000000) served as the basis for the VectorBase genome browsers.
AcolM1.8 gene set
Community annotation patch build for July 2019
References
- Coetzee M, Hunt RH, Wilkerson R, Della Torre A, Coulibaly MB, Besansky NJ. 2013. Anopheles coluzzii and Anopheles amharicus, new members of the Anopheles gambiae complex. Zootaxa 3619 (3): 246-274.
Picture credit: VectorBase.org
Statistics
Summary
Assembly | AcolM1, INSDC Assembly GCA_000150765.1, Apr 2008 |
Database version | 113.1 |
Golden Path Length | 224,417,174 |
Genebuild by | VEuPathDB |
Genebuild method | Import |
Data source | VectorBase |
Gene counts
Coding genes | 14,465 |
Non coding genes | 331 |
Small non coding genes | 329 |
Long non coding genes | 2 |
Gene transcripts | 14,842 |
Other
Genscan gene predictions | 33,966 |
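
The figures in the Summary and Gene counts tables can be cross-checked with a little arithmetic. The sketch below is illustrative only: it copies the values from the tables and assumes that "Gene transcripts" counts transcripts of both coding and non-coding genes, which the page does not state explicitly. The constant and function names are hypothetical.

```python
# Values copied from the Summary and Gene counts tables.
GOLDEN_PATH_LENGTH_BP = 224_417_174
CODING_GENES = 14_465
NON_CODING_GENES = 331
SMALL_NON_CODING_GENES = 329
LONG_NON_CODING_GENES = 2
GENE_TRANSCRIPTS = 14_842


def summarize_gene_set():
    """Derive a few simple ratios from the published statistics."""
    # Assumption: transcripts are counted across coding and non-coding genes.
    total_genes = CODING_GENES + NON_CODING_GENES
    return {
        # Small and long non-coding genes should add up to the non-coding total.
        "non_coding_consistent": (
            SMALL_NON_CODING_GENES + LONG_NON_CODING_GENES == NON_CODING_GENES
        ),
        "total_genes": total_genes,
        "transcripts_per_gene": GENE_TRANSCRIPTS / total_genes,
        "coding_genes_per_mb": CODING_GENES / (GOLDEN_PATH_LENGTH_BP / 1_000_000),
    }


if __name__ == "__main__":
    for key, value in summarize_gene_set().items():
        print(f"{key}: {value}")
```

Under that assumption, the transcript count is only slightly higher than the gene count, so most genes in this set are annotated with a single transcript.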