Crassostrea gigas Assembly and Gene Annotation

About Crassostrea gigas

Crassostrea gigas, the Pacific oyster, is native to Asia but has been introduced around the world, both intentionally for aquaculture and accidentally; in some regions it is regarded as an invasive species. Oysters are sessile bivalve molluscs, and C. gigas is a large estuarine species that thrives in a wide range of environmental conditions, which has helped make it the most commercially important oyster species in the world. The Pacific oyster has many characteristics that are representative of molluscs and, more generally, of lophotrochozoans. Its genome provides insight into metazoan evolution and is a valuable resource for studies of aquaculture [1].

Picture credit (Creative Commons BY-SA 3.0): Jan Johan ter Poorten 2007

Assembly

The Crassostrea gigas genome was assembled hierarchically, using a combination of fosmid pooling and whole-genome shotgun sequencing. The assembly consists of 7,658 scaffolds, with a genome size of approximately 558 Mb. The assembly displayed in Ensembl Genomes is derived from INSDC records, and excludes approximately 4,000 very short scaffolds that were present in the original assembly data.

Annotation

The protein-coding genes displayed in Ensembl Genomes are imported from the set of GigaDB genes that were accepted as INSDC records (1,928 genes were rejected when submitted to INSDC), plus 2 additional genes in INSDC that are not in GigaDB. Non-coding RNA genes were added using the Ensembl Genomes pipeline, and BLAST hits and protein features have been computed.

References

  1. The oyster genome reveals stress adaptation and complexity of shell formation.
    Zhang G, Fang X, Guo X, Li L, Luo R, Xu F, Yang P, Zhang L, Wang X, Qi H et al. 2012. Nature. 490:49-54.

More information

General information about this species can be found in Wikipedia.

Statistics

Summary

Assembly: oyster_v9, INSDC Assembly GCA_000297895.1, Sep 2012
Database version: 94.1
Base Pairs: 491,850,583
Golden Path Length: 557,717,710
Genebuild by: ENA
Genebuild method: Imported from ENA
Data source: GigaDB
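
The Base Pairs and Golden Path Length figures above differ by about 66 Mb. A minimal sketch of that arithmetic follows. It assumes, as is usual for Ensembl statistics but not stated on this page, that Golden Path Length counts the full scaffold length including gaps, while Base Pairs counts only sequenced bases. The function name unsequenced_fraction is hypothetical and chosen for illustration.

```python
# Figures copied from the Summary table above.
BASE_PAIRS = 491_850_583          # assumed: sequenced bases only
GOLDEN_PATH_LENGTH = 557_717_710  # assumed: total scaffold length, gaps included


def unsequenced_fraction(base_pairs: int, golden_path_length: int) -> float:
    """Return the share of the golden path not covered by sequenced bases."""
    if golden_path_length <= 0:
        raise ValueError("golden_path_length must be positive")
    return (golden_path_length - base_pairs) / golden_path_length


# Absolute difference between the two figures.
gap_bases = GOLDEN_PATH_LENGTH - BASE_PAIRS
fraction = unsequenced_fraction(BASE_PAIRS, GOLDEN_PATH_LENGTH)

print(f"{gap_bases:,} bases ({fraction:.1%}) of the golden path are not sequenced bases")
```

Under that assumption, roughly 11.8% of the displayed assembly length would be gap sequence rather than called bases.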

Gene counts

Coding genes: 26,101
Non coding genes: 497
Small non coding genes: 494
Long non coding genes: 3
Gene transcripts: 26,598
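
The gene counts above can be cross-checked against each other. The sketch below only restates figures from this page; the dictionary keys are hypothetical labels, and the reading that each gene has exactly one transcript in this imported gene set is an inference from the totals, not something the page states.

```python
# Figures copied from the Gene counts table above; key names are illustrative.
gene_counts = {
    "coding": 26_101,
    "non_coding": 497,
    "small_non_coding": 494,
    "long_non_coding": 3,
    "transcripts": 26_598,
}

# The small and long non-coding categories should add up to the non-coding total.
noncoding_sum = gene_counts["small_non_coding"] + gene_counts["long_non_coding"]

# Coding plus non-coding genes gives the total gene count.
total_genes = gene_counts["coding"] + gene_counts["non_coding"]

print("non-coding subtotal matches:", noncoding_sum == gene_counts["non_coding"])
print("total genes:", f"{total_genes:,}")
print("one transcript per gene on average:", total_genes == gene_counts["transcripts"])
```

Both checks hold for the figures shown: the total gene count equals the transcript count, which is consistent with a gene set that has a single transcript per gene.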

About this species