Drosophila pseudoobscura pseudoobscura (Fruit fly, MV2-25) (Dpse_3.0)

Drosophila pseudoobscura pseudoobscura (Fruit fly, MV2-25) Assembly and Gene Annotation

About Drosophila pseudoobscura

Drosophila pseudoobscura is a North American fruit fly that has been used extensively in genetic studies, particularly studies of speciation. It was the second fruit fly to have its genome sequenced [1], and its genome was one of 12 fruit fly genomes included in a large comparative study [2]. Ensembl Genomes imports data from FlyBase, which also provides much more information about the biology of Drosophila pseudoobscura, as well as a phylogeny of the 12 sequenced fruit fly species.

Picture credit: MA Hanson, Creative Commons Attribution 4.0, via Wikimedia Commons

Assembly

Ensembl Metazoa uses the v3.0 genome assembly of Drosophila pseudoobscura [3]. The genome was sequenced and assembled by the Human Genome Sequencing Center at Baylor College of Medicine (BCM-HGSC); further details are provided by FlyBase.

Annotation

Protein-coding and RNA genes, which were annotated with the NCBI eukaryotic genome annotation pipeline, were imported from FlyBase, release dpse_r3.04 (FB2017_04).

Repeats were annotated with the Ensembl Genomes repeat feature pipeline. The genome contains 286,881 low-complexity (Dust) features, covering 13 Mb (8.8% of the genome); 25,800 RepeatMasker features (with the RepBase library), covering 10 Mb (6.7% of the genome); and 181,856 tandem repeat (TRF) features, covering 12 Mb (7.7% of the genome).
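The coverage percentages above can be roughly cross-checked against the golden path length reported in the Statistics section. The following is a minimal sketch, not part of the Ensembl pipeline: the function name `coverage_percent` is hypothetical, and it assumes 1 Mb = 1,000,000 bp. Because the Mb figures in the text are rounded, the results only approximate the quoted percentages, which were presumably computed from exact base-pair counts.

```python
# Golden path length (bp) as listed in the Statistics summary of this page
GOLDEN_PATH_LENGTH = 152_696_384

# Rounded coverage figures (Mb) as quoted in the Annotation text
repeat_coverage_mb = {
    "Dust": 13,
    "RepeatMasker": 10,
    "TRF": 12,
}


def coverage_percent(covered_mb, genome_bp=GOLDEN_PATH_LENGTH):
    """Return coverage as a percentage of the genome.

    Assumption: 1 Mb is taken as 1,000,000 bp.
    """
    return 100.0 * covered_mb * 1_000_000 / genome_bp


for name, mb in repeat_coverage_mb.items():
    # Rounded inputs give approximate values only
    print(f"{name}: ~{coverage_percent(mb):.1f}% of the genome")
```

The three feature classes may overlap on the genome, so their percentages should not simply be summed to estimate total repeat content.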

Protein domains were annotated with the Ensembl Genomes protein feature pipeline.

References

  1. Comparative genome sequencing of Drosophila pseudoobscura: chromosomal, gene, and cis-element evolution.
    Richards S, Liu Y, Bettencourt BR, Hradecky P, Letovsky S, Nielsen R, Thornton K, Hubisz MJ, Chen R, Meisel RP et al. 2005. Genome Research. 15:1-18.
  2. Evolution of genes and genomes on the Drosophila phylogeny.
    Drosophila 12 Genomes Consortium, Clark AG, Eisen MB, Smith DR, Bergman CM, Oliver B, Markow TA, Kaufman TC, Kellis M, Gelbart W et al. 2007. Nature. 450:203-218.
  3. Mind the gap: upgrading genomes with Pacific Biosciences RS long-read sequencing technology.
    English AC, Richards S, Han Y, Wang M, Vee V, Qu J, Qin X, Muzny DM, Reid JG, Worley KC et al. 2012. PLoS ONE. 7:e47768.

Statistics

Summary

Assembly: Dpse_3.0, INSDC Assembly GCA_000001765.2
Database version: 113.4
Golden Path Length: 152,696,384
Genebuild by: FlyBase
Genebuild method: Import
Data source: FlyBase

Gene counts

Coding genes: 14,574
Non coding genes: 2,449
Small non coding genes: 2,449
Pseudogenes: 273
Gene transcripts: 27,888
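
As a rough illustration of how these counts relate, the sketch below derives a total gene count and a mean number of transcripts per gene. It rests on two assumptions that this page does not state explicitly: that the 2,449 small non coding genes are a subset of (here, all of) the non coding genes rather than an additional group, and that the transcript total covers coding genes, non coding genes, and pseudogenes together. The variable names are illustrative only.

```python
# Counts copied from the "Gene counts" table on this page
gene_counts = {
    "coding": 14_574,
    # Assumption: this figure already includes the 2,449 small non coding genes
    "non_coding": 2_449,
    "pseudogenes": 273,
}
transcripts = 27_888

# Assumption: the three categories do not overlap
total_genes = sum(gene_counts.values())

# Assumption: the transcript total spans all three categories
transcripts_per_gene = transcripts / total_genes

print(f"Total genes: {total_genes:,}")
print(f"Mean transcripts per gene: {transcripts_per_gene:.2f}")
```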