Bemisia tabaci (Uganda 1) (Uganda1_n5713_630Mb)

Bemisia tabaci (Uganda 1) Assembly and Gene Annotation

Bemisia tabaci Uganda 1

Whiteflies of the Bemisia tabaci (Gennadius) (Hemiptera: Aleyrodidae) species complex are phloem-feeding insects and plant-virus vectors, some of which are widely regarded as among the world's worst agricultural pests. Outbreaks of B. tabaci cause significant crop losses and contribute to global food insecurity.

The whitefly Bemisia tabaci Uganda 1, also referred to as Bemisia Uganda1, B. tabaci Uganda, B. tabaci Uganda sweet potato and B. tabaci Uganda 8, has been reported from a range of vegetable and weed hosts in Uganda [1-3]. It has not been reported from cassava or in association with cassava disease epidemics [1-4]. The plant hosts this whitefly has been reported from include: Ipomoea batatas (sweetpotato), Solanum lycopersicum (tomato), Commelina benghalensis (wandering Jew), Leonotis nepetifolia (lion's ear), Pavonia urens (muwugula), Cleome gynandra (joobyo), Vernonia amygdalina (mululuza) and Euphorbia heterophylla (Mexican fire plant) [1-3].

Recent phylogenetic analysis, based on partial mitochondrial COI gene sequences, showed that B. tabaci Uganda 1 grouped outside of the B. tabaci cryptic species complex, basal to the monophyletic group of B. tabaci species [1]. Adults of this species, however, are morphologically indistinguishable from other B. tabaci species [1]. As well as being phylogenetically distinct from the sub-Saharan African cassava B. tabaci species, it is likely that B. tabaci Uganda 1 transmits different groups of plant viruses that constrain sweetpotato production in Uganda [5,6].

The genome described here was generated from a Ugandan population of B. tabaci Uganda 1 that was inbred in the laboratory to reduce heterozygosity.

The Bemisia tabaci cryptic species complex

Members of the B. tabaci species complex cause plant damage by feeding on phloem sap, inducing phytotoxic disorders, depositing honeydew on which sooty moulds develop, and vectoring >300 plant-virus species in the genera Begomovirus, Carlavirus, Crinivirus, Ipomovirus, Polerovirus and Torradovirus [7,8]. Diseases caused by these viruses often spread rapidly, with devastating yield losses of up to 100% [9].

Bemisia tabaci sensu lato currently represents a relatively large group (>44) of mostly unresolved cryptic species, as inferred from phylogenetic species delimitation studies [10,11]. These morphologically indistinguishable species differ from one another not only in their genetic relatedness, but also in various biological traits such as plant host-range breadth, fecundity, insecticide resistance, and plant-virus transmission efficiencies.

Bemisia tabaci sensu lato are distributed globally, from tropical to temperate climatic zones and across all continents except Antarctica [11]. Most cryptic species in this complex, as currently understood, are geographically restricted, but two of them are highly invasive globally: B. tabaci Middle East-Asia Minor 1 (MEAM1, also referred to as biotype B and Bemisia argentifolii) and B. tabaci Mediterranean (MED, also referred to as biotype Q) [11]. Bemisia tabaci sensu lato live predominantly on herbaceous plant hosts and have been recorded from an exceedingly broad range of host plants (>500 species) [12]. The documented host-plant range of most cryptic species within the complex remains largely incomplete.

Picture credit: Sharon van Brunschot.

Prepublication data sharing

These data are released under Fort Lauderdale principles, as confirmed in the Toronto Statement [13]. Any use of this dataset must abide by the African Cassava Whitefly Project Genomics Consortium data sharing principles. Data producers reserve the right to make the first publication of a global analysis of these data. If you are unsure whether you are allowed to publish on this dataset, please contact j.colvin@greenwich.ac.uk and s.vanbrunschot@greenwich.ac.uk to inquire. The full guidelines can be found at cassavawhitefly.org.

Assembly

The Bemisia tabaci Uganda 1 genome was produced by the genomics consortium of the African Cassava Whitefly Project, funded by the Bill & Melinda Gates Foundation (Grant Number OPP1058938).

A field-collected colony (Namulonge, Uganda) was established, maintained and inbred (F6 generation) by Dr Habibu Mugerwa at the quarantine insectary facilities of the Natural Resources Institute, University of Greenwich, United Kingdom.

High-molecular-weight genomic DNA was isolated from a pooled sample of F6 inbred haploid male individuals (n=3000). PacBio Sequel library construction and sequencing (7 SMRT cells) were performed by the Earlham Institute (Norwich, United Kingdom).

The current assembly of the B. tabaci Uganda 1 genome was generated by Dr Lahcen Campbell at EMBL-EBI (Hinxton, UK). The assembly spans 630.4 Mb in 5,713 scaffolds, with a scaffold N50 of 177 kb. It was produced using Canu v1.8, with a unitig read coverage of 34.3X relative to a genome size estimate of 650 Mb. The GC content of the assembly is 38.3%. Repeats cover 43.9% of the genome and consist predominantly of transposable elements (TEs) without complete classification. Of the identified TEs, DNA-type TEs were the most widespread, at 5.29% total coverage.
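The summary metrics quoted above (scaffold N50, GC content, read coverage) follow standard definitions, which can be sketched as below. This is a minimal illustration on toy inputs, not the code used to produce the assembly; the function names are hypothetical, and the coverage helper assumes that coverage is simply total read bases divided by the genome size estimate.

```python
def scaffold_n50(lengths):
    """Return the N50 of a collection of scaffold lengths (bp).

    N50 is the length L such that scaffolds of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def gc_fraction(sequences):
    """Return the G+C fraction over unambiguous bases (A, C, G, T only)."""
    gc = at = 0
    for seq in sequences:
        seq = seq.upper()
        gc += seq.count("G") + seq.count("C")
        at += seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("no unambiguous bases found")
    return gc / (gc + at)


def implied_read_bases(coverage, genome_size_bp):
    """Total read bases implied by a coverage figure (assumed: bases / genome size)."""
    return coverage * genome_size_bp


if __name__ == "__main__":
    # Toy scaffolds, not the real assembly.
    print(scaffold_n50([2, 3, 4, 5, 6]))
    print(gc_fraction(["GGCC", "AATTNN"]))
    # Values taken from the text: 34.3X on a 650 Mb estimate.
    print(implied_read_bases(34.3, 650_000_000))
```

Under that assumption, 34.3X on a 650 Mb estimate corresponds to roughly 22.3 Gb of read bases.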

Annotation

RNA-seq data used for genome annotation were deposited in the ENA under the accessions PRJEB28507, PRJEB35304 and PRJEB39408. Publicly available RNA-seq data from three independent short-read Illumina paired-end datasets were also used: SRR1523521 (PRJNA255988); SRR835869 (PRJNA79601); and SRR2001505 (PRJNA282156). Genomic annotation was generated with the Ensembl Gene Annotation pipeline [14]. All transcript models are supported by RNA-seq experimental evidence derived from multiple whitefly life-stages. Gene model layering was supported with protein-to-genome alignment of experimentally verified proteins obtained from closely related Hemiptera: UniProt (2019) and 570 experimentally verified protein genes from the published genome of Bemisia tabaci MEAM1 [15]. The Ensembl Gene Annotation pipeline then implemented transcript consensus filtration to remove unsupported alternative transcript isoforms.

Small ncRNAs were obtained using a combination of BLAST and Infernal/RNAfold. Pseudogenes were identified by examining genes for any of three conditions: a large percentage of non-biological introns (introns of <10bp); coverage of the gene by repeats; or a single-exon gene for which evidence of a functional multi-exon paralog was found elsewhere in the genome.
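The pseudogene criteria described above can be sketched as a simple rule-based check. This is an illustration only, not the Ensembl pipeline's implementation: the `GeneModel` class, its field names and the 50% threshold for "a large percentage" of short introns are all assumptions, and the repeat and paralog flags are taken as precomputed inputs.

```python
from dataclasses import dataclass

MIN_BIOLOGICAL_INTRON_BP = 10  # from the text: introns of <10bp are non-biological
TINY_INTRON_FRACTION = 0.5     # ASSUMPTION: the text only says "a large percentage"


@dataclass
class GeneModel:
    """Hypothetical gene model; exons are (start, end), 1-based inclusive."""
    exons: list
    repeat_covered: bool = False          # assumed precomputed elsewhere
    has_multi_exon_paralog: bool = False  # assumed precomputed elsewhere


def intron_lengths(exons):
    """Lengths of the gaps between consecutive exons, in bp."""
    ordered = sorted(exons)
    return [nxt[0] - prev[1] - 1 for prev, nxt in zip(ordered, ordered[1:])]


def looks_like_pseudogene(gene):
    """Apply the three criteria from the text; any one is enough to flag a gene."""
    introns = intron_lengths(gene.exons)
    if introns:
        tiny = sum(1 for n in introns if n < MIN_BIOLOGICAL_INTRON_BP)
        if tiny / len(introns) >= TINY_INTRON_FRACTION:
            return True
    if gene.repeat_covered:
        return True
    if len(gene.exons) == 1 and gene.has_multi_exon_paralog:
        return True
    return False


if __name__ == "__main__":
    # Two introns of 4 bp and 9 bp: flagged by the short-intron rule.
    print(looks_like_pseudogene(GeneModel(exons=[(1, 100), (105, 200), (210, 300)])))
    # One 400 bp intron and no other evidence: not flagged.
    print(looks_like_pseudogene(GeneModel(exons=[(1, 100), (501, 600)])))
```

Treating the three criteria as independent flags keeps each rule easy to tune separately; the actual pipeline may combine or weight them differently.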

lncRNAs were annotated from RNA-seq-supported transcripts in which no evidence of protein homology or protein domains could be found.

For a general in-depth overview of the gene annotation pipeline, see the detailed information on the genebuild.

References

  1. Mugerwa et al. (2018) 'African ancestry of New World, Bemisia tabaci-whitefly species'. Scientific Reports, 8(1), 2734. doi:10.1038/s41598-018-20956-3.
  2. Sseruwagi et al. (2005) 'Genetic diversity of Bemisia tabaci (Gennadius) (Hemiptera: Aleyrodidae) populations and presence of the B biotype and a non-B biotype that can induce silverleaf symptoms in squash, in Uganda'. Annals of Applied Biology, 147(3), 253-265. doi:10.1111/j.1744-7348.2005.00026.x.
  3. Maruthi et al. (2001) 'Mating compatibility, life-history traits, and RAPD-PCR variation in Bemisia tabaci associated with the cassava mosaic disease pandemic in East Africa'. Entomologia Experimentalis et Applicata, 99(1), 13-23. doi:10.1046/j.1570-7458.2001.00797.x.
  4. Legg (1999) 'Emergence, spread and strategies for controlling the pandemic of cassava mosaic virus disease in east and central Africa'. Crop Protection, 18(10), 627-637. doi:10.1016/S0261-2194(99)00062-9.
  5. Ndunguru et al. (2009) 'Assessing the sweetpotato virus disease and its associated vectors in northwestern Tanzania and central Uganda'. African Journal of Agricultural Research, 4(4), 334-343. doi:10.5897/AJAR.9000069.
  6. Aritua et al. (2007) 'Incidence of five viruses infecting sweetpotatoes in Uganda; the first evidence of Sweet potato caulimo-like virus in Africa'. Plant Pathology, 56(2), 324-331. doi:10.1111/j.1365-3059.2006.01560.x.
  7. Fiallo-Olivé et al. (2020) 'Transmission of begomoviruses and other whitefly-borne viruses: dependence on the vector species'. Phytopathology, 110(1), 10-17. doi:10.1094/phyto-07-19-0273-fi.
  8. Gilbertson et al. (2015) 'Role of the insect supervectors Bemisia tabaci and Frankliniella occidentalis in the emergence and global spread of plant viruses'. Annual Review of Virology, 2(1), 67-93. doi:10.1146/annurev-virology-031413-085410.
  9. Colvin et al. (2004) 'Dual begomovirus infections and high Bemisia tabaci populations: two factors driving the spread of a cassava mosaic disease pandemic'. Plant Pathology, 53(5), 577-584. doi:10.1111/j.0032-0862.2004.01062.x.
  10. Kanakala et al. (2019) 'Global genetic diversity and geographical distribution of Bemisia tabaci and its bacterial endosymbionts'. PLoS ONE, 14(3), e0213946. doi:10.1371/journal.pone.0213946.
  11. De Barro et al. (2011) 'Bemisia tabaci: a statement of species status'. Annual Review of Entomology, 56(1), 1-19. doi:10.1146/annurev-ento-112408-085504.
  12. Oliveira et al. (2001) 'History, current status, and collaborative research projects for Bemisia tabaci'. Crop Protection, 20(9), 709-723. doi:10.1016/S0261-2194(01)00108-9.
  13. A Toronto International Data Release Workshop (2009) 'Prepublication data sharing'. Nature, 461(7261), 168-170. doi:10.1038/461168a.
  14. Aken et al. (2016) 'The Ensembl gene annotation system'. Database, 2016, baw093. doi:10.1093/database/baw093.
  15. Chen et al. (2016) 'The draft genome of whitefly Bemisia tabaci MEAM1, a global crop pest, provides novel insights into virus transmission, host adaptation, and insecticide resistance'. BMC Biology, 14, 110. doi:10.1186/s12915-016-0321-y.

More information

General information about this species can be found on Wikipedia.

Statistics

Summary

Assembly: Uganda1_n5713_630Mb, INSDC Assembly GCA_903994095.1
Database version: 107.1
Golden Path Length: 630,407,635
Genebuild by: Ensembl
Genebuild method: Full genebuild
Data source: Ensembl Metazoa

Gene counts

Coding genes: 12,749
Non coding genes: 968
Small non coding genes: 173
Long non coding genes: 790
Misc non coding genes: 5
Pseudogenes: 85
Gene transcripts: 25,101
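
As a quick consistency check on the counts above, the three non-coding subcategories should sum to the reported non-coding total. The sketch below copies the figures from this page into a dictionary with hypothetical key names. The transcripts-per-gene figure assumes that the transcript count spans coding genes, non-coding genes and pseudogenes, which the page does not state.

```python
# Figures copied from the statistics on this page; key names are illustrative.
gene_counts = {
    "coding": 12_749,
    "non_coding": 968,
    "small_non_coding": 173,
    "long_non_coding": 790,
    "misc_non_coding": 5,
    "pseudogenes": 85,
    "transcripts": 25_101,
}


def non_coding_sum(counts):
    """Sum of the three non-coding subcategories."""
    return (
        counts["small_non_coding"]
        + counts["long_non_coding"]
        + counts["misc_non_coding"]
    )


def mean_transcripts_per_gene(counts):
    """Rough average; ASSUMES transcripts cover coding, non-coding and pseudogenes."""
    genes = counts["coding"] + counts["non_coding"] + counts["pseudogenes"]
    return counts["transcripts"] / genes


if __name__ == "__main__":
    print(non_coding_sum(gene_counts) == gene_counts["non_coding"])
    print(round(mean_transcripts_per_gene(gene_counts), 2))
```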