Anopheles atroparvus (EBRE) - GCA_914969975.1 (atroparvus_hifiasm_n225_277Mb)

Anopheles atroparvus (EBRE) - GCA_914969975.1 Assembly and Gene Annotation

About Anopheles atroparvus


Anopheles atroparvus belongs to the A. maculipennis species complex. Current suitability studies indicate that habitat and climate in 21st-century Europe are extensively appropriate for A. atroparvus. The species is distributed in northern and western Europe, Spain (including the Canary Islands), Portugal and northern Italy, and was one of the main malaria vectors in Europe. Its known range also covers Belgium, Bosnia and Herzegovina, Bulgaria, Croatia, Czech Republic, Denmark, Estonia, France, Hungary, Latvia, Lithuania, Macedonia, Moldova, Montenegro, Poland, Romania, Russia, Serbia, Slovakia, Slovenia, Sweden and Ukraine. Anopheles atroparvus was the most abundant species of the A. maculipennis complex found in south-east England [5]. The flight range of A. atroparvus is suggested to be at least three kilometres [1]. A. atroparvus hibernates as adult females, which often seek shelter indoors in stables or man-made dwellings during the autumn, where they can remain active [1].

Vectorial capacity

Although research interest in A. atroparvus has been low over the past several decades, recent concern about an increase in vector-borne disease has encouraged new research into this species. The current dominant Anopheles vector species in Europe and the Mediterranean include A. atroparvus, among others [4]. Anopheles atroparvus, the dominant vector in large parts of Europe, might play an important role with respect to changes in potential transmission stability [6]. Anopheles atroparvus is generally considered primarily zoophilic; however, it has also been described as anthropophilic, with host choice dependent on availability, reflecting the opportunistic nature of this species. It has been implicated in the transmission of autochthonous malaria in Europe via the human malaria parasite Plasmodium vivax, and is known to have been involved in winter transmission of malaria at the start of the twentieth century in Britain, in coastal areas of the Netherlands and Germany [2], and elsewhere in Europe [1]. In Portugal, Anopheles atroparvus is the only mosquito species implicated in malaria transmission [3].

More information

Anopheles atroparvus strain Infravec2 Ebre Delta preserved or extract

General information about this species can be found at Wikipedia and ECDC.

Picture credit: Creative Commons Attribution 4.0 via Wikimedia Commons (Image source)


The genome assembly presented here is linked to the assembly accession [GCA_914969975.1]. This genome assembly was produced as part of the Infravec2 project to study A. atroparvus. The material sample is from A. atroparvus strain 'EBRE', made available as a biological resource for distribution via the Infravec product catalog. The assembly was produced from a single PacBio SMRT cell (CLR) HiFi library (ERX6138161). It was generated de novo from a set of taxonomically pre-filtered genomic reads (Kraken2 [9]), assembled with hifiasm v0.15 [10] and finally examined with BlobTools [11] to identify and remove potential non-Anopheles contaminant scaffolds.

Redundant and partial mitochondrial scaffolds were removed, retaining only the single longest, which was rescaffolded and fully annotated with MitoFinder [12]. The finalised assembly is composed of 225 scaffolds totalling 277.8 Mb (8.9% of scaffolds > 1 Mb; 2.7% of scaffolds > 10 Mb), with a scaffold N50 of 44.9 Mb and an L50 of 3. Draft quality and performance were assessed by comparison with a previous Infravec reference assembly for A. atroparvus (AatrE3), which was itself generated by rescaffolding a short-read Illumina assembly (AatrE1) [13]. High concordance of chromosomal sequence overlap was observed between AatrE3 and this reference assembly.
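The scaffold N50 and L50 quoted above follow the standard definitions: with scaffolds sorted from longest to shortest, L50 is the smallest number of scaffolds whose combined length reaches at least half of the total assembly length, and N50 is the length of the last scaffold in that set. The following is a minimal Python sketch of that calculation. The function name `n50_l50` and the toy lengths are illustrative assumptions, not part of the assembly pipeline or the real scaffold lengths.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of scaffold lengths in base pairs."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    # Sort scaffolds from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first scaffold that takes the running sum to at least
        # half of the total defines N50 (its length) and L50 (its rank).
        if running >= half_total:
            return length, count


# Toy scaffold lengths (hypothetical values, not the real assembly).
toy_lengths = [50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_lengths)
print(f"N50 = {n50}, L50 = {l50}")
```

Read this way, an L50 of 3 means that the three longest scaffolds together account for at least half of the assembled sequence.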


RNA-seq data used for genome annotation were obtained from the publicly available RNA-seq run SRR826830 (PRJNA196857 - "RNA sequencing of 15 genomes of Anopheles"), generated by Illumina paired-end sequencing of cDNA (HiSeq 2000). Genomic annotation was generated with the Ensembl gene annotation pipeline [7]. Transcript models are supported by RNA-seq experimental evidence. Gene model layering was supported by protein-to-genome alignment of previously generated protein models from A. atroparvus assembly version GCA_000473505.1, along with experimentally verified proteins from closely related Hexapoda species (UniProt, April 2021). The Ensembl gene annotation pipeline applied transcript consensus filtering to remove unsupported alternate transcript isoforms.

Small ncRNAs were obtained using a combination of BLAST and Infernal/RNAfold [8]. Pseudogenes were identified by examining genes with a large percentage of non-biological introns (introns of <10 bp), genes covered in repeats, or single-exon genes for which evidence of a functional multi-exon paralog was found elsewhere in the genome.
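The three pseudogene criteria described above can be read as a simple decision rule. The sketch below is only an illustration of that rule in Python and is not the Ensembl pipeline's code. The `GeneModel` class, the function name and the two fraction thresholds are hypothetical: the text gives only the <10 bp intron cutoff and does not state what counts as "a large percentage" or as "covered in repeats".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GeneModel:
    """Hypothetical summary of one gene model, for illustration only."""
    intron_lengths: List[int]      # intron lengths in bp; empty for single-exon genes
    repeat_fraction: float         # fraction of the gene overlapping repeats (0.0 to 1.0)
    has_multi_exon_paralog: bool   # a functional multi-exon paralog exists elsewhere


SHORT_INTRON_BP = 10             # from the text: introns of <10 bp are non-biological
MAX_SHORT_INTRON_FRACTION = 0.5  # assumption: the text does not give this threshold
MIN_REPEAT_FRACTION = 0.9        # assumption: the text does not give this threshold


def looks_like_pseudogene(gene: GeneModel) -> bool:
    """Apply the three criteria in turn; any one of them flags the gene."""
    introns = gene.intron_lengths
    if introns:
        short = sum(1 for length in introns if length < SHORT_INTRON_BP)
        if short / len(introns) > MAX_SHORT_INTRON_FRACTION:
            return True
    if gene.repeat_fraction >= MIN_REPEAT_FRACTION:
        return True
    # A gene with no introns is single exon.
    if not introns and gene.has_multi_exon_paralog:
        return True
    return False


# Two of three introns are shorter than 10 bp, so this model is flagged.
print(looks_like_pseudogene(GeneModel([3, 5, 200], 0.0, False)))
```

Treating the criteria as independent alternatives mirrors the "or" in the description; a production pipeline may weigh or combine the evidence differently.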

lncRNAs were annotated from RNA-seq-derived transcripts in which no evidence of protein homology or protein domains could be found.

For an in-depth overview of the gene annotation pipeline, see the detailed information here.


  1. Becker et al. Mosquitoes and their control. 2nd ed. Berlin: Springer Verlag (2010).
  2. Takken et al. Distribution and dynamics of larval populations of Anopheles messeae and A. atroparvus in the delta of the rivers Rhine and Meuse, The Netherlands. Ambio 31(3):212-8 (2002).
  3. Cambournac, FJ. Sobre a epidemiologia do sezonismo em Portugal. Lisbon (1942).
  4. Sinka et al. The dominant Anopheles vectors of human malaria in Africa, Europe and the Middle East: occurrence data, distribution maps and bionomic précis. Parasit Vectors 3(117) (2010).
  5. Danabalan et al. Occurrence and host preferences of Anopheles maculipennis group mosquitoes in England and Wales. Med Vet Entomol (2013).
  6. Hertig, E. Distribution of Anopheles vectors and potential malaria transmission stability in Europe and the Mediterranean area under future climate change. Parasites Vectors 12(18) (2019).
  7. Aken et al. The Ensembl gene annotation system. Database, Volume 2016 (2016).
  8. Eddy SR. A memory-efficient dynamic programming algorithm for optimal alignment of a sequence to an RNA secondary structure. BMC Bioinformatics 3(18) (2002).
  9. Wood, D.E., Lu, J. and Langmead, B. Improved metagenomic analysis with Kraken 2. Genome Biology 20(1) (2019).
  10. Cheng, Haoyu, et al. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nature Methods 18(2) (2021).
  11. Laetsch, Dominik R., and Mark L. Blaxter. BlobTools: Interrogation of genome assemblies. F1000Research 6:1287 (2017).
  12. Allio R, Schomaker-Bastos A, Romiguier J, Prosdocimi F, Nabholz B, Delsuc F. MitoFinder: Efficient automated large-scale extraction of mitogenomic data in target enrichment phylogenomics. Mol Ecol Resour 20(4) (2020).
  13. Artemov et al. Partial-arm translocations in evolution of malaria mosquitoes revealed by high-coverage physical mapping of the Anopheles atroparvus genome. BMC Genomics 19, 278 (2018).



Assembly: atroparvus_hifiasm_n225_277Mb, INSDC Assembly GCA_914969975.1
Database version: 108.1
Golden Path Length: 277,763,289
Genebuild by: Ensembl
Genebuild method: Import
Data source: Ensembl Metazoa

Gene counts

Coding genes: 12,610
Non coding genes: 1,214
Small non coding genes: 1,191
Long non coding genes: 20
Misc non coding genes: 3
Gene transcripts: 18,001