Anopheles coluzzii (Mosquito, Mali-NIH) (AcolM1)

Anopheles coluzzii (Mosquito, Mali-NIH) Assembly and Gene Annotation

The Anopheles coluzzii data and its display on Ensembl Genomes are made possible through a joint effort by the Ensembl Genomes group and VectorBase, a component of VEuPathDB.

The assembly name may not match that from INSDC due to additional community contributions applied by VEuPathDB to the initial INSDC assembly (recorded by the assembly accession).

About Anopheles coluzzii

Anopheles coluzzii, formerly known as Anopheles gambiae M molecular form, was defined as a separate species in 2013 (Coetzee et al.). An. coluzzii belongs to the Anopheles gambiae species complex, which consists of at least seven species.

Maureen Coetzee, Richard H. Hunt, Richard Wilkerson, Alessandra Della Torre, Mamadou B. Coulibaly & Nora J. Besansky. 2013. Anopheles coluzzii and Anopheles amharicus, new members of the Anopheles gambiae complex. Zootaxa 3619 (3): 246-274.

Mali-NIH strain

The Anopheles gambiae M Mali-NIH colony was established from blood-fed adult females collected while resting inside houses in Niono, Mali, in November 2005. Approximately 80 isofemale families, molecularly identified as An. gambiae M form, were used to establish the colony, which was subsequently karyotyped as 2Rbc/bc; 2La/a. The An. gambiae M Mali-NIH colony is currently maintained through the BEI Resources web portal.

Source: VectorBase

AcolM1 assembly

An. coluzzii Mali-NIH, formerly Anopheles gambiae M form, was sequenced by the Washington University School of Medicine Genome Sequencing Center (Lawniczak et al. 2010, PMID: 20966253). DNA samples derived from whole mosquitoes were provided by the University of Notre Dame and MR4. BAC libraries were provided by the Clemson University Genomics Institute (CUGI) and are available through CUGI or MR4. Approximately 2.755 million sequence reads for the M form were deposited in the NCBI Trace Archive. For the M project, 94% of the traces were from plasmids (5-6 kb inserts), 4% from fosmids (40 kb inserts), and 2% from BACs. Based on the source DNA of these libraries, 94% of reads were generated from males, resulting in lower X-chromosome sequence coverage relative to autosomes.
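As a rough illustration of the library composition described above, the following Python sketch converts the quoted percentages into approximate read counts. The total (~2.755 million) and the fractions come from the text; both are rounded figures, so the results are only approximate, and the names `TOTAL_READS`, `LIBRARY_FRACTIONS`, and `reads_per_library` are hypothetical helpers introduced for this example.

```python
# Approximate read counts per library type, using the rounded figures quoted above.
# All values are estimates: the source gives "~2.755 million" reads and whole percentages.
TOTAL_READS = 2_755_000  # assumed from "~2.755 million" M-form reads

LIBRARY_FRACTIONS = {
    "plasmid (5-6 kb inserts)": 0.94,
    "fosmid (40 kb inserts)": 0.04,
    "BAC": 0.02,
}


def reads_per_library(total, fractions):
    """Return the approximate number of reads contributed by each library type."""
    return {name: round(total * frac) for name, frac in fractions.items()}


counts = reads_per_library(TOTAL_READS, LIBRARY_FRACTIONS)
for name, n in counts.items():
    print(f"{name}: ~{n:,} reads")
```

Under these assumptions, plasmid libraries account for the overwhelming majority of reads, with fosmids and BACs contributing comparatively small numbers of longer-insert reads.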

Whole genome shotgun (WGS) sequences were assembled de novo for each genome by both sequencing centers: at WUGSC using the PCAP assembler, and at JCVI using the Celera assembler (http://wgs-assembler.sf.net). WUGSC assemblies based on the original PCAP algorithm were nearly twice the expected ~260 Mb size. This outcome reflected a considerable number of high-quality base discrepancies (polymorphisms), owing to relatively high allelic variation in the non-isogenic genome samples. Although a modification of PCAP (Pcap.rep.poly) resulted in smaller assembled genome sizes, the Celera assembler algorithms, which were specifically developed to accommodate heterozygosity, gave improved assemblies. By mutual agreement, the JCVI assemblies (available via GenBank accessions ABKP00000000 and ABKQ00000000) served as the basis for the VectorBase genome browsers.

AcolM1.8 gene set

Community annotation patch build for July 2019

References

  1. Anopheles coluzzii and Anopheles amharicus, new members of the Anopheles gambiae complex.
    Coetzee M, Hunt RH, Wilkerson R, Della Torre A, Coulibaly MB, Besansky NJ. 2013. Zootaxa. 3619 (3): 246-274.

Picture credit: VectorBase.org

Statistics

Summary

Assembly: AcolM1, INSDC Assembly GCA_000150765.1, Apr 2008
Database version: 113.1
Golden Path Length: 224,417,174
Genebuild by: VEuPathDB
Genebuild method: Import
Data source: VectorBase

Gene counts

Coding genes: 14,465
Non coding genes: 331
Small non coding genes: 329
Long non coding genes: 2
Gene transcripts: 14,842
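
The gene counts above can be cross-checked with a short Python sketch. The numbers are taken from the table; the dictionary keys are hypothetical names chosen for this example, and the transcripts-per-gene figure assumes that the transcript total covers both coding and non-coding genes, which the page does not state explicitly.

```python
# Gene counts as listed in the statistics above (key names are illustrative only).
gene_counts = {
    "coding": 14_465,
    "non_coding": 331,
    "small_non_coding": 329,
    "long_non_coding": 2,
    "transcripts": 14_842,
}

# The small and long non-coding categories should add up to the non-coding total.
non_coding_sum = gene_counts["small_non_coding"] + gene_counts["long_non_coding"]
non_coding_consistent = non_coding_sum == gene_counts["non_coding"]

# Total annotated genes, and mean transcripts per gene
# (assumption: the transcript count spans coding and non-coding genes).
total_genes = gene_counts["coding"] + gene_counts["non_coding"]
transcripts_per_gene = gene_counts["transcripts"] / total_genes

print(f"Non-coding subtotals consistent: {non_coding_consistent}")
print(f"Total genes: {total_genes:,}")
print(f"Mean transcripts per gene: {transcripts_per_gene:.3f}")
```

A ratio close to one transcript per gene, as obtained here, suggests that few alternative transcripts are annotated in this gene set.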

Other

Genscan gene predictions: 33,966