Bemisia tabaci (Silverleaf whitefly, SSA2 Ng) Assembly and Gene Annotation
Bemisia tabaci Sub-Saharan Africa 2
Whiteflies of the Bemisia tabaci (Gennadius) (Hemiptera: Aleyrodidae) species complex are phloem-feeding insects and plant-virus vectors, some of which are widely regarded as being among the world's worst agricultural pests. Outbreaks of B. tabaci cause significant crop losses and contribute to global food insecurity.
To date, at least six putative cryptic species of B. tabaci have been reported to colonise cassava (Manihot esculenta) in sub-Saharan Africa (SSA) and these were named serially from SSA1 (and its subgroups) to SSA5 [1-5]. Of these, SSA1 (and its subgroups) and SSA2 have been reported as the prevalent whiteflies associated with the spread of viruses that cause devastating cassava mosaic disease (CMD) and cassava brown streak disease (CBSD) pandemics [6-8].
Populations of B. tabaci SSA2 have been recorded from the eastern, southern, central and western areas of Africa as well as in the south of Spain [5,9-11]. In the 1990s, B. tabaci SSA2 was hypothesised to be an invasive species associated with the CMD pandemic in Uganda [8]. More recently, however, B. tabaci SSA2 has rarely been found in Uganda and the superabundant populations have been identified as the species B. tabaci SSA1-SG1 [3,8,12,13].
The genome described here was generated from a Nigerian population of B. tabaci SSA2 that was inbred in the laboratory to reduce heterozygosity.
The Bemisia tabaci cryptic species complex
Members of the B. tabaci species complex damage plants by feeding on phloem sap, inducing phytotoxic disorders, depositing honeydew on which sooty moulds develop, and vectoring more than 300 plant-virus species in the genera Begomovirus, Carlavirus, Crinivirus, Ipomovirus, Polerovirus and Torradovirus [14,15]. Diseases caused by these viruses often spread rapidly, with devastating yield losses of up to 100% [6].
Bemisia tabaci sensu lato currently represents a relatively large group (>44) of mostly unresolved cryptic species, as inferred from phylogenetic species delimitation studies [2,16]. These morphologically indistinguishable species differ from one another not only in their genetic relatedness, but also in various biological traits such as plant host-range breadth, fecundity, insecticide resistance, and plant-virus transmission efficiencies.
Bemisia tabaci sensu lato are distributed globally, from tropical to temperate climatic zones and across all continents except Antarctica [2]. Most cryptic species in this complex, as currently understood, are geographically restricted, but two are highly invasive globally: B. tabaci Middle East-Asia Minor 1 (MEAM1, also referred to as biotype B and Bemisia argentifolii) and B. tabaci Mediterranean (MED, also referred to as biotype Q) [2]. Bemisia tabaci sensu lato live predominantly on herbaceous plant hosts and have been recorded from an exceedingly broad range of host plants (>500 species) [17]. The documented host-plant range of most cryptic species within the complex remains largely incomplete.
Picture credit: Sharon van Brunschot.
Assembly
The Bemisia tabaci SSA2 genome was produced by the genomics consortium of the African Cassava Whitefly Project, funded by the Bill & Melinda Gates Foundation (Grant Number OPP1058938).
A field colony collected by Dr Ibrahim U. Mohammed (Argungu, Kebbi State, Nigeria) was established, maintained and inbred (F6 generation) by Dr Joachim Nwezeobi at the quarantine insectary facilities of the Natural Resources Institute, University of Greenwich, United Kingdom.
High-molecular-weight genomic DNA was isolated from a pooled sample of F6 inbred haploid male individuals (n=3000). PacBio Sequel library construction and sequencing (8 SMRT cells) were performed by the Centre for Genomic Research, The University of Liverpool (United Kingdom).
The current assembly of the B. tabaci SSA2 genome was generated by Dr Lahcen Campbell at EMBL-EBI (Hinxton, UK). The assembly is 625.3 Mb long and is contained in 785 scaffolds with a scaffold N50 of 7.81 Mb. It was produced using Canu v1.8, with a unitig read coverage of 34.5X based on a genome size estimate of 650 Mb. The GC content of the assembly is 39.6%. Repeats cover 46.7% of the genome and consist predominantly of transposable elements without complete classification. Of the identified transposable elements (TEs), DNA-type TEs were the most widespread, at 4.65% total coverage.
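As an illustration of how summary figures such as scaffold N50 and GC content are conventionally defined, the following is a minimal Python sketch. It is not the procedure used to produce the values reported above; the function names and the toy inputs are hypothetical, and the choice to ignore N and other ambiguous bases when computing GC content is an assumption.

```python
from typing import Iterable, List


def n50(lengths: Iterable[int]) -> int:
    """Return the N50: the length L such that scaffolds of length >= L
    together contain at least half of all assembled bases."""
    ordered: List[int] = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no scaffold lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for non-empty input


def gc_fraction(sequence: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T).

    Assumption: N and other ambiguity codes are ignored.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / (gc + at)


# Toy data only; these are not scaffolds from the SSA2 assembly.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_lengths)
toy_gc = gc_fraction("ATGCNNGGCA")
print(f"N50 = {toy_n50}, GC = {toy_gc:.3f}")
```

In the toy example the two longest scaffolds (80 and 70) together reach half of the 300 total bases, so the N50 is 70.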
Annotation
RNA-seq data used for genome annotation were deposited in the ENA under accession PRJEB35304. Publicly available RNA-seq data from three independent short-read Illumina paired-end datasets were also used: SRR1523521 (PRJNA255988); SRR835869 (PRJNA79601); and SRR2001505 (PRJNA282156). The genome annotation was generated with the Ensembl Gene Annotation pipeline [18]. All transcript models are supported by RNA-seq experimental evidence derived from multiple whitefly life-stages. Gene model layering was supported by protein-to-genome alignment of experimentally verified proteins obtained from closely related Hemiptera: UniProt (2019) and 570 experimentally verified protein genes from the published genome of Bemisia tabaci MEAM1 [19]. The Ensembl Gene Annotation pipeline then applied transcript consensus filtration to remove unsupported alternative transcript isoforms.
Small ncRNAs were obtained using a combination of BLAST and Infernal/RNAfold. Pseudogenes were identified by examining genes for three signals: a large percentage of non-biological introns (introns of <10 bp), coverage of the gene by repeats, or a single-exon gene with evidence of a functional multi-exon paralog elsewhere in the genome.
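The pseudogene criteria described above can be sketched as a simple rule-based check. This is only an illustration under stated assumptions, not the Ensembl implementation: the `GeneModel` class, the function name and both fraction thresholds are hypothetical, since the text specifies only the <10 bp intron cut-off and otherwise says "a large percentage" and "covered in repeats".

```python
from dataclasses import dataclass
from typing import List, Optional

SHORT_INTRON_BP = 10              # from the text: introns of <10 bp
MIN_SHORT_INTRON_FRACTION = 0.5   # assumption: stands in for "a large percentage"
MIN_REPEAT_FRACTION = 0.9         # assumption: stands in for "covered in repeats"


@dataclass
class GeneModel:
    """Hypothetical minimal view of a gene model for this sketch."""
    intron_lengths: List[int]      # empty list means a single-exon gene
    repeat_fraction: float         # share of the gene overlapped by repeats
    has_multi_exon_paralog: bool   # functional multi-exon paralog found elsewhere


def pseudogene_reason(gene: GeneModel) -> Optional[str]:
    """Return the first matching criterion, or None if none applies."""
    introns = gene.intron_lengths
    if introns:
        short = sum(1 for length in introns if length < SHORT_INTRON_BP)
        if short / len(introns) >= MIN_SHORT_INTRON_FRACTION:
            return "non-biological introns"
    if gene.repeat_fraction >= MIN_REPEAT_FRACTION:
        return "repeat-covered"
    if not introns and gene.has_multi_exon_paralog:
        return "single-exon copy of a multi-exon paralog"
    return None


# Toy gene models, invented for illustration.
examples = [
    GeneModel([5, 8, 1200], 0.10, False),
    GeneModel([], 0.95, False),
    GeneModel([], 0.00, True),
    GeneModel([300, 450], 0.20, True),
]
reasons = [pseudogene_reason(g) for g in examples]
print(reasons)
```

The order in which the criteria are tested is also an assumption; it only affects which reason is reported when a gene matches more than one.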
lncRNA models were generated from RNA-seq data for transcripts in which no evidence of protein homology or protein domains could be found.
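The lncRNA rule amounts to a filter over RNA-seq-supported transcripts. The sketch below shows that logic only; the `TranscriptEvidence` fields and the transcript identifiers are hypothetical, and how homology and domain hits are actually detected is outside its scope.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TranscriptEvidence:
    """Hypothetical evidence summary for one transcript model."""
    transcript_id: str
    rnaseq_supported: bool
    protein_homology_hit: bool
    protein_domain_hit: bool


def is_lncrna_candidate(t: TranscriptEvidence) -> bool:
    """RNA-seq support with neither protein homology nor protein domain evidence."""
    return t.rnaseq_supported and not (t.protein_homology_hit or t.protein_domain_hit)


# Toy transcripts with made-up identifiers.
transcripts: List[TranscriptEvidence] = [
    TranscriptEvidence("tx1", True, False, False),
    TranscriptEvidence("tx2", True, True, False),
    TranscriptEvidence("tx3", True, False, True),
    TranscriptEvidence("tx4", False, False, False),
]
lncrna_ids = [t.transcript_id for t in transcripts if is_lncrna_candidate(t)]
print(lncrna_ids)
```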
For a general in-depth overview of the Gene Annotation pipeline, see the detailed information on the genebuild.
References
- Mugerwa et al. (2020) 'Whole-genome single nucleotide polymorphism and mating compatibility studies reveal the presence of distinct species in sub-Saharan Africa Bemisia tabaci whiteflies'. Insect Science. doi:10.1111/1744-7917.12881.
- De Barro et al. (2011) 'Bemisia tabaci: a statement of species status'. Annual Review of Entomology, 56(1), 1-19. doi:10.1146/annurev-ento-112408-085504.
- Legg et al. (2014) 'Biology and management of Bemisia whitefly vectors of cassava virus pandemics in Africa'. Pest Management Science, 70(10), 1446-1453. doi:10.1002/ps.3793.
- Berry et al. (2004) 'Molecular evidence for five distinct Bemisia tabaci (Homoptera: Aleyrodidae) geographic haplotypes associated with cassava plants in sub-Saharan Africa'. Annals of the Entomological Society of America, 97(4), 852-859. doi:10.1603/0013-8746(2004)097[0852:MEFFDB]2.0.CO;2.
- Sseruwagi et al. (2006) 'Colonization of non‐cassava plant species by cassava whiteflies (Bemisia tabaci) in Uganda'. Entomologia Experimentalis et Applicata, 119(2), 145-153. doi:10.1111/j.1570-7458.2006.00402.x.
- Colvin et al. (2004) 'Dual begomovirus infections and high Bemisia tabaci populations: two factors driving the spread of a cassava mosaic disease pandemic'. Plant Pathology, 53(5), 577-584. doi:10.1111/j.0032-0862.2004.01062.x.
- Maruthi et al. (2004) 'Reproductive incompatibility and cytochrome oxidase I gene sequence variability amongst host-adapted and geographically separate Bemisia tabaci populations (Hemiptera: Aleyrodidae)'. Systematic Entomology, 29(4), 560-568. doi:10.1111/j.0307-6970.2004.00272.x.
- Legg et al. (2002) 'A distinct Bemisia tabaci (Gennadius) (Hemiptera: Sternorrhyncha: Aleyrodidae) genotype cluster is associated with the epidemic of severe cassava mosaic virus disease in Uganda'. Molecular Ecology, 11(7), 1219-1229. doi:10.1046/j.1365-294X.2002.01514.x.
- Ally et al. (2019) 'What has changed in the outbreaking populations of the severe crop pest whitefly species in cassava in two decades?' Scientific Reports, 9(1), 1-13. doi:10.1038/s41598-019-50259-0.
- Berry et al. (2004) 'Molecular evidence for five distinct Bemisia tabaci (Homoptera: Aleyrodidae) geographic haplotypes associated with cassava plants in sub-Saharan Africa'. Annals of the Entomological Society of America, 97(4), 852-859. doi:10.1603/0013-8746(2004)097[0852:MEFFDB]2.0.CO;2.
- De la Rúa et al. (2006) 'New insights into the mitochondrial phylogeny of the whitefly Bemisia tabaci (Hemiptera: Aleyrodidae) in the Mediterranean Basin'. Journal of Zoological Systematics and Evolutionary Research, 44(1), 25-33. doi:10.1111/j.1439-0469.2005.00336.x.
- Mugerwa et al. (2012) 'Genetic diversity and geographic distribution of Bemisia tabaci (Gennadius) (Hemiptera: Aleyrodidae) genotypes associated with cassava in East Africa'. Ecology and Evolution, 2(11), 2749-2762. doi:10.1002/ece3.379.
- Wosula et al. (2017) 'Unravelling the genetic diversity among cassava Bemisia tabaci whiteflies using NextRAD sequencing'. Genome Biology and Evolution, 9(11), 2958-2973. doi:10.1093/gbe/evx219.
- Fiallo-Olivé et al. (2020) 'Transmission of begomoviruses and other whitefly-borne viruses: dependence on the vector species'. Phytopathology, 110(1), 10-17. doi:10.1094/phyto-07-19-0273-fi.
- Gilbertson et al. (2015) 'Role of the insect supervectors Bemisia tabaci and Frankliniella occidentalis in the emergence and global spread of plant viruses'. Annual Review of Virology, 2(1), 67-93. doi:10.1146/annurev-virology-031413-085410.
- Kanakala et al. (2019) 'Global genetic diversity and geographical distribution of Bemisia tabaci and its bacterial endosymbionts'. PLoS ONE, 14(3), e0213946. doi:10.1371/journal.pone.0213946.
- Oliveira et al. (2001) 'History, current status, and collaborative research projects for Bemisia tabaci'. Crop Protection, 20(9), 709-723. doi:10.1016/S0261-2194(01)00108-9.
- Aken et al. (2016) 'The Ensembl gene annotation system'. Database, Volume 2016. doi:10.1093/database/baw093.
- Chen et al. (2016) 'The draft genome of whitefly Bemisia tabaci MEAM1, a global crop pest, provides novel insights into virus transmission, host adaptation, and insecticide resistance'. BMC Biology 14, 110. doi:10.1186/s12915-016-0321-y.
More information
General information about this species can be found in Wikipedia
Statistics
Summary
Assembly | SSA2_Nigeria_n785_625mb, INSDC Assembly GCA_903994125.1 |
Database version | 113.1 |
Golden Path Length | 625,325,389 |
Genebuild by | Ensembl |
Genebuild method | Import |
Data source | Ensembl Metazoa |
Gene counts
Coding genes | 12,928 |
Non coding genes | 1,350 |
Small non coding genes | 172 |
Long non coding genes | 1,175 |
Misc non coding genes | 3 |
Pseudogenes | 108 |
Gene transcripts | 29,035 |
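The figures in these tables can be cross-checked with a little arithmetic. The short Python sketch below uses only numbers quoted on this page: it confirms that the three non-coding subclasses add up to the non-coding gene total, and expresses the Golden Path Length as a share of the 650 Mb genome size estimate given in the Assembly section. Reading that ratio as the fraction of the genome assembled is an interpretation, since the 650 Mb figure is itself an estimate.

```python
# Values copied from the Statistics tables and the Assembly section.
GOLDEN_PATH_LENGTH_BP = 625_325_389
GENOME_SIZE_ESTIMATE_BP = 650_000_000  # the 650 Mb estimate quoted for the Canu run

NON_CODING_GENES = 1_350
NON_CODING_BREAKDOWN = {"small": 172, "long": 1_175, "misc": 3}

# Check that the subclasses account for every non-coding gene.
breakdown_total = sum(NON_CODING_BREAKDOWN.values())
breakdown_matches = breakdown_total == NON_CODING_GENES

# Assembly length relative to the genome size estimate.
assembled_fraction = GOLDEN_PATH_LENGTH_BP / GENOME_SIZE_ESTIMATE_BP

print(f"non-coding subclasses sum to {breakdown_total} (matches total: {breakdown_matches})")
print(f"assembly length is {assembled_fraction:.1%} of the genome size estimate")
```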