Drosophila melanogaster (Fruit fly) Assembly and Gene Annotation
About Drosophila melanogaster
Drosophila melanogaster is a cosmopolitan species of fruit fly that has been used as a model organism for over a hundred years, particularly in genetics and developmental biology. It was the second metazoan (the first being Caenorhabditis elegans) to have its genome sequenced [1], and its genome was one of 12 fruit fly genomes included in a large comparative study [2]. Ensembl Genomes imports data from FlyBase, which also provides much more information about the biology of Drosophila melanogaster and a phylogeny of the 12 sequenced fruit fly species.
Picture credit (Creative Commons BY-NC-SA 2.0 FR): Nicolas Gompel 2008. Image shows a female fly.
Ensembl Metazoa uses the Berkeley Drosophila Genome Project (BDGP) assembly release 6 (July 2014). In contrast to release 5, regions without a chromosome assignment are stored as distinct scaffolds, rather than on a single pseudo-chromosome, and heterochromatin regions are incorporated in the chromosomal arms.
Protein-coding and RNA genes were imported from FlyBase, release dmel_r6.46 (FB2022_03).
Repeats were annotated with the Ensembl Genomes repeat feature pipeline. There are: 222,250 Low complexity (Dust) features, covering 8 Mb (5.5% of the genome); 48,677 RepeatMasker features (with a custom library), covering 28 Mb (19.5% of the genome); 1,426 RepeatMasker features (with the RepBase library), covering 0 Mb (0.1% of the genome); and 72,600 Tandem repeats (TRF) features, covering 6 Mb (4.1% of the genome).
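The repeat figures above can be cross-checked against the golden path length listed in the assembly statistics (143,726,002 bp). The following is a minimal sketch, not part of the Ensembl pipeline. It assumes that 1 Mb means 1,000,000 bp, that each coverage value was rounded to the nearest Mb, and that the percentages are relative to the golden path length. The names `REPEATS` and `coverage_range` are hypothetical.

```python
# Golden path length as listed in the assembly statistics of this page.
GENOME_LENGTH_BP = 143_726_002

# (coverage in Mb as reported, percentage as reported), copied from the text.
REPEATS = {
    "Low complexity (Dust)": (8, 5.5),
    "RepeatMasker (custom library)": (28, 19.5),
    "RepeatMasker (RepBase library)": (0, 0.1),
    "Tandem repeats (TRF)": (6, 4.1),
}


def coverage_range(mb_rounded, genome_bp=GENOME_LENGTH_BP):
    """Return the (low, high) genome percentage implied by a coverage
    value that was rounded to the nearest Mb (assumption: 1 Mb = 1e6 bp)."""
    low_bp = max(mb_rounded - 0.5, 0) * 1_000_000
    high_bp = (mb_rounded + 0.5) * 1_000_000
    return 100 * low_bp / genome_bp, 100 * high_bp / genome_bp


for name, (mb, pct) in REPEATS.items():
    low, high = coverage_range(mb)
    status = "consistent" if low <= pct <= high else "check"
    print(f"{name}: reported {pct}%, implied {low:.2f}% to {high:.2f}% ({status})")
```

Features from different tracks may overlap one another, so the four percentages should not be summed to estimate the total repeat content.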
Protein domains were annotated with the Ensembl Genomes protein feature pipeline.
References
[1] The genome sequence of Drosophila melanogaster. Adams MD, Celniker SE, Holt RA, Evans CA, Gocayne JD, Amanatides PG, Scherer SE, Li PW, Hoskins RA, Galle RF et al. 2000. Science. 287:2185-2195.
[2] Evolution of genes and genomes on the Drosophila phylogeny. Drosophila 12 Genomes Consortium, Clark AG, Eisen MB, Smith DR, Bergman CM, Oliver B, Markow TA, Kaufman TC, Kellis M, Gelbart W et al. 2007. Nature. 450:203-218.
Assembly | BDGP6.46, INSDC Assembly GCA_000001215.4 |
Database version | 113.10 |
Golden Path Length | 143,726,002 |
Genebuild by | FlyBase |
Genebuild method | Import |
Data source | FlyBase |
Gene counts
Coding genes | 13,986 |
Non coding genes | 4,054 |
Small non coding genes | 4,054 |
Pseudogenes | 340 |
Gene transcripts | 41,620 |
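
The gene counts above can be combined into a rough average number of transcripts per gene. The following is a minimal sketch under two assumptions that the table does not state explicitly: that the "Small non coding genes" row is a subset of "Non coding genes" (so it is not added again), and that the transcript total covers coding genes, non coding genes, and pseudogenes. The function name `transcripts_per_gene` is hypothetical.

```python
# Counts copied from the gene counts table on this page.
CODING_GENES = 13_986
NON_CODING_GENES = 4_054  # assumed to already include the small non coding genes
PSEUDOGENES = 340
TRANSCRIPTS = 41_620


def transcripts_per_gene(transcripts, *gene_counts):
    """Average transcripts per gene over the summed gene categories."""
    total_genes = sum(gene_counts)
    return transcripts / total_genes


total = CODING_GENES + NON_CODING_GENES + PSEUDOGENES
ratio = transcripts_per_gene(TRANSCRIPTS, CODING_GENES, NON_CODING_GENES, PSEUDOGENES)
print(f"{total:,} genes, {ratio:.2f} transcripts per gene on average")
```

If the transcript total excludes pseudogenes, drop `PSEUDOGENES` from the call; the ratio changes only slightly.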