Bombyx mori Assembly and Gene Annotation

About Bombyx mori

The silkworm, Bombyx mori, has been used for silk production for over 5,000 years, and continues to be an economically important species. B. mori is a fully domesticated insect, and due to its ease of culture, the silkworm has become a model organism in the study of lepidopteran and arthropod biology [1,2].

Picture credit (Creative Commons BY-SA 3.0): Wikimedia Commons 2009

Assembly

The Bombyx mori genome was assembled by the International Silkworm Genome Sequencing Consortium, a collaboration between the National Institute of Agrobiological Sciences (NIAS) in Japan and Southwest University in Chongqing City, China. The consortium integrated two earlier assemblies into a single high-quality assembly with ~10x coverage of the genome [3]. Approximately 87% of the scaffolds are assembled into 28 linkage groups; this linkage-group arrangement is not represented in Ensembl Metazoa, but is available at KAIKObase [4]. In release 22 of Ensembl Metazoa (March 2014) the assembly was updated to remove contigs (primarily bacterial contaminants and duplicate regions) that were not part of the canonical INSDC record.

There are two sets of scaffold names in use for Bombyx mori. The INSDC records have scaffold names like ":_Bm_scaf586_strain:_p50T_Dazao.", but from release 22 onwards Ensembl Metazoa abbreviates these to, for example, "scaf586". The longer INSDC nomenclature is listed as a synonym, however, so searches and uploads using that format should still resolve to the correct scaffold. These scaffold numbers are also used by KAIKObase. Earlier Ensembl Metazoa releases used the nomenclature from SilkDB, which is formatted like "nscaf29"; note that the numbers in these names do not correspond to those in the INSDC names. The assembly converter tool maps between our current assembly ("GCA_000151625.1") and the earlier version ("Bmor1").
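For anyone tidying their own data files before upload, the relationship between the two current naming styles can be sketched in a few lines of Python. This is a minimal illustration, not part of any Ensembl tool: the function name abbreviate_scaffold and the regular expression are assumptions, inferred only from the single example name quoted above, and real INSDC names may vary.

```python
import re

# Assumed pattern: the short name is the "scaf<number>" part that follows
# "Bm_" in the long INSDC-style name. Inferred from one example only.
_INSDC_NAME = re.compile(r"Bm_(scaf\d+)")


def abbreviate_scaffold(name: str) -> str:
    """Return the short 'scafN' form of an INSDC-style scaffold name.

    Names that do not match the assumed pattern are returned unchanged.
    """
    match = _INSDC_NAME.search(name)
    return match.group(1) if match else name


if __name__ == "__main__":
    for name in (":_Bm_scaf586_strain:_p50T_Dazao.", "scaf586", "nscaf29"):
        print(name, "->", abbreviate_scaffold(name))
```

Older SilkDB-style names such as "nscaf29" are deliberately passed through untouched. Because their numbers do not correspond to the INSDC numbers, they cannot be converted by renaming alone and need the assembly converter tool.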

Annotation

Ab initio and similarity-based gene predictions were aggregated using a GLEAN-like algorithm, resulting in over 14,000 gene models. The annotations displayed in Ensembl Metazoa are imported from SilkDB [5]. Changes to the scaffolds in release 22 of Ensembl Metazoa (see Assembly section) resulted in some genes mapping to new positions; in all but two cases the DNA sequence is unchanged. The exceptions are BGIBMGA009343, which lost 8 exons that were on a duplicated region, and BGIBMGA014289, which is truncated because part of the gene was on a contig believed to be a bacterial contaminant. Non-coding RNA genes were added using the Ensembl Genomes pipeline, and BLAST hits and protein features have been computed.

References

  1. The genetics and genomics of the silkworm, Bombyx mori.
    Goldsmith MR, Shimada T, Abe H. 2005. Annual Review of Entomology. 50:71-100.
  2. Advances in silkworm studies accelerated by the genome sequencing of Bombyx mori.
    Xia Q, Li S, Feng Q. 2014. Annual Review of Entomology. 59:513-536.
  3. The genome of a lepidopteran model insect, the silkworm Bombyx mori.
    International Silkworm Genome Consortium. 2008. Insect Biochemistry and Molecular Biology. 38:1036-1045.
  4. KAIKObase: an integrated silkworm genome database and data mining tool.
    Shimomura M, Minami H, Suetsugu Y, Ohyanagi H, Satoh C, Antonio B, Nagamura Y, Kadono-Okuda K, Kajiwara H, Sezutsu H et al. 2009. BMC Genomics. 10:486.
  5. SilkDB v2.0: a platform for silkworm (Bombyx mori) genome biology.
    Duan J, Li R, Cheng D, Fan W, Zha X, Cheng T, Wu Y, Wang J, Mita K, Xiang Z et al. 2010. Nucleic Acids Research. 38:D453-6.

More information

General information about this species can be found in Wikipedia.

Statistics

Summary

Assembly: ASM15162v1, INSDC Assembly GCA_000151625.1, Feb 2013
Database version: 94.1
Base Pairs: 431,707,935
Golden Path Length: 481,803,763
Genebuild by: SilkDB
Genebuild method: Import
Data source: SilkDB

Gene counts

Coding genes: 14,623
Non coding genes: 1,743
Small non coding genes: 1,743
Gene transcripts: 29,663
