Drosophila melanogaster Assembly and Gene Annotation
About Drosophila melanogaster
Drosophila melanogaster is a cosmopolitan species of fruitfly that has been used as a model organism for over a hundred years, particularly in genetics and developmental biology. It was the second metazoan (after Caenorhabditis elegans) to have its genome sequenced (Adams et al. 2000), and its genome was one of 12 fruitfly genomes included in a large comparative study (Drosophila 12 Genomes Consortium 2007). Ensembl Genomes imports data from FlyBase, which also provides much more information about the biology of Drosophila melanogaster, as well as a phylogeny of the 12 sequenced fruitfly species.
Picture credit (Creative Commons BY-NC-SA 2.0 FR): Nicolas Gompel 2008. Image shows a female fly.
Ensembl Metazoa uses the Berkeley Drosophila Genome Project (BDGP) assembly release 6 (July 2014). In contrast to release 5, regions without a chromosome assignment are stored as distinct scaffolds, rather than on a single pseudo-chromosome, and heterochromatin regions are incorporated in the chromosomal arms.
Protein-coding and RNA genes were imported from FlyBase, release dmel_r6.32 (FB2020_01).
Repeats were annotated with the Ensembl Genomes repeat feature pipeline. There are:
- 222,250 low-complexity (Dust) features, covering 8 Mb (5.5% of the genome)
- 42,298 RepeatMasker features (with a custom library), covering 20 Mb (13.7% of the genome)
- 41,694 RepeatMasker features (with the RepBase library), covering 26 Mb (17.8% of the genome)
- 72,600 tandem repeat (TRF) features, covering 6 Mb (4.1% of the genome)
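The coverage percentages above can be roughly reproduced from the quoted megabase figures and the golden path length of 143,726,002 bp given in the assembly statistics. The sketch below is illustrative only: the function and variable names are hypothetical, and because the megabase values in the text are rounded, the computed percentages come out slightly higher than the published ones.

```python
# Golden path length in base pairs, taken from the assembly statistics.
GENOME_LENGTH_BP = 143_726_002

# Rounded coverage in megabases, as quoted in the repeat summary.
REPEAT_COVERAGE_MB = {
    "Dust": 8,
    "RepeatMasker (custom library)": 20,
    "RepeatMasker (RepBase library)": 26,
    "TRF": 6,
}


def coverage_percent(covered_mb, genome_length_bp=GENOME_LENGTH_BP):
    """Return the covered length as a percentage of the genome.

    Assumes 1 Mb = 1,000,000 bp.
    """
    return 100.0 * covered_mb * 1_000_000 / genome_length_bp


def coverage_table():
    """Map each repeat class to its approximate genome coverage in percent."""
    return {
        name: round(coverage_percent(mb), 1)
        for name, mb in REPEAT_COVERAGE_MB.items()
    }


if __name__ == "__main__":
    for name, pct in coverage_table().items():
        print(f"{name}: ~{pct}% of the genome")
```

Exact percentages would need the unrounded base-pair totals for each repeat class, which are not given here.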
Protein domains were annotated with the Ensembl Genomes protein feature pipeline.
References
- The genome sequence of Drosophila melanogaster.
Adams MD, Celniker SE, Holt RA, Evans CA, Gocayne JD, Amanatides PG, Scherer SE, Li PW, Hoskins RA, Galle RF et al. 2000. Science. 287:2185-2195.
- Evolution of genes and genomes on the Drosophila phylogeny.
Drosophila 12 Genomes Consortium, Clark AG, Eisen MB, Smith DR, Bergman CM, Oliver B, Markow TA, Kaufman TC, Kellis M, Gelbart W et al. 2007. Nature. 450:203-218.
|Assembly||BDGP6.32, INSDC Assembly GCA_000001215.4,|
|Golden Path Length (bp)||143,726,002|
|Non-coding genes||4,044|
|Small non-coding genes||4,044|
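
The assembly name, accession, and golden path length in the table above can be cross-checked programmatically. The sketch below assumes the public Ensembl REST server at rest.ensembl.org and its `info/assembly/:species` endpoint, neither of which is mentioned on this page. The helper names are hypothetical, and the live server may report a later release than the one described here.

```python
import json
import urllib.request

# Assumption: the public Ensembl REST server, which is not referenced on this page.
REST_SERVER = "https://rest.ensembl.org"


def assembly_info_url(species="drosophila_melanogaster"):
    """Build the URL of the assembly-information endpoint for a species."""
    return f"{REST_SERVER}/info/assembly/{species}?content-type=application/json"


def fetch_assembly_info(species="drosophila_melanogaster"):
    """Download and decode the assembly record (requires network access)."""
    with urllib.request.urlopen(assembly_info_url(species), timeout=30) as response:
        return json.load(response)


def summarise(info):
    """Keep only the fields that correspond to rows of the statistics table."""
    return {
        "assembly_name": info.get("assembly_name"),
        "assembly_accession": info.get("assembly_accession"),
        "golden_path": info.get("golden_path"),
    }


if __name__ == "__main__":
    print(summarise(fetch_assembly_info()))
```

`summarise` uses `dict.get`, so a field missing from the response appears as `None` rather than raising an error.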