What's New in Release 37
- Updated data
- Updated cross references
- Updated protein features
- Updated BioMarts
New Ensembl Genomes Archive Sites
Ensembl Genomes now has archive sites for all divisions. These can be found at the following URLs:
The archive sites allow researchers to access data from old releases through our web-based tools. They can also display track hubs containing alignments and features located on older versions of genome assemblies that have since been upgraded on the live site. Archive sites will be searchable, and BioMarts will be available where they were produced for the site when it was live. Schema and API versions for archive sites will be the same as when the data was released, i.e. archive sites will not be updated to use the most recent versions. Ensembl tools will not be active in the initial release, but we hope to enable them shortly. Major bugs (i.e. those impeding the usability of the site) will be fixed, but minor bugs will not be.
The first release of the archive sites contains content from release 37. New archive sites will be released at least once a year, under URLs indicating the date of first release of the data they contain. The previously existing archive for Ensembl Plants, http://archive.plants.ensembl.org, will continue to be available at this URL, but also as http://mar2016-plants.ensembl.org, in accordance with the new naming scheme. As previously, data from all recent releases will continue to be available for download at ftp://ftp.ensemblgenomes.org.
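As an illustration of the naming scheme, the following minimal sketch builds an archive host name from a division and a release date. The pattern (lower-case three-letter month, four-digit year, hyphen, division name) is inferred solely from the one example given above, http://mar2016-plants.ensembl.org; the function name `archive_host` and its parameters are hypothetical, and the host names actually used for other divisions and dates should be checked against the published list of archive URLs.

```python
# Assumed pattern, inferred from the single example "mar2016-plants.ensembl.org":
#   <mon><yyyy>-<division>.ensembl.org
MONTH_ABBREVIATIONS = (
    "jan", "feb", "mar", "apr", "may", "jun",
    "jul", "aug", "sep", "oct", "nov", "dec",
)


def archive_host(division: str, year: int, month: int) -> str:
    """Return a host name following the assumed date-based archive pattern.

    `division` is a lower-case division name such as "plants";
    `month` is 1-12. This is a hypothetical helper, not part of any Ensembl API.
    """
    if not 1 <= month <= 12:
        raise ValueError(f"month must be in 1..12, got {month}")
    return f"{MONTH_ABBREVIATIONS[month - 1]}{year}-{division}.ensembl.org"


# Reproduces the one host name quoted in the text.
print(archive_host("plants", 2016, 3))
```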
Automated RNA gene annotation
Most gene sets that are imported into Ensembl Metazoa do not have RNA gene annotation, and in these cases we perform automated annotation. For release 36 of Ensembl Metazoa, RNA genes were annotated on all species except those imported from FlyBase, VectorBase, and WormBase. The RNA gene annotation uses three sources:
- miRBase provides genomic locations of precursor microRNA genes for a subset of species (Apis mellifera, Heliconius melpomene, Nasonia vitripennis, Strongylocentrotus purpuratus).
- tRNAscan-SE is used to predict tRNA genes (with a score threshold of 65).
- Rfam covariance model alignments are used to annotate RNA genes of all classes except tRNA. Models are restricted to those that are taxonomically appropriate, and are filtered to prevent overlapping genes (on either strand).
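The two filtering steps in the list above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's actual code: the `Hit` record and both function names are hypothetical, the threshold is assumed to be inclusive (score >= 65), and the overlap filter is assumed to be a greedy "keep the best-scoring hit" strategy, which the text does not specify. The only details taken from the text are the threshold value and the fact that overlaps are rejected regardless of strand.

```python
from dataclasses import dataclass

TRNA_SCORE_THRESHOLD = 65.0  # value quoted in the text for tRNAscan-SE


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one predicted RNA gene."""
    seq_region: str
    start: int      # 1-based, inclusive
    end: int        # 1-based, inclusive
    strand: int     # +1 or -1
    score: float
    model: str


def filter_trna_hits(hits, threshold=TRNA_SCORE_THRESHOLD):
    """Keep tRNA predictions scoring at or above the threshold (assumed inclusive)."""
    return [h for h in hits if h.score >= threshold]


def overlaps(a, b):
    """True if two hits share any base on the same sequence; strand is ignored."""
    return a.seq_region == b.seq_region and a.start <= b.end and b.start <= a.end


def remove_overlaps(hits):
    """Greedy filter (an assumption): visit hits from best to worst score and
    keep a hit only if it overlaps nothing already kept, on either strand."""
    kept = []
    for hit in sorted(hits, key=lambda h: h.score, reverse=True):
        if not any(overlaps(hit, k) for k in kept):
            kept.append(hit)
    return sorted(kept, key=lambda h: (h.seq_region, h.start))


if __name__ == "__main__":
    candidates = [
        Hit("chr1", 100, 200, 1, 50.0, "A"),
        Hit("chr1", 150, 250, -1, 80.0, "B"),  # overlaps A on the opposite strand
        Hit("chr1", 300, 400, 1, 30.0, "C"),
    ]
    for hit in remove_overlaps(candidates):
        print(hit.model, hit.seq_region, hit.start, hit.end, hit.strand)
```

In this toy input, hit A is dropped because it overlaps the higher-scoring hit B, even though the two lie on opposite strands.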
RNA genes have been updated for 26 species (21 of which had outdated RNA gene annotation from earlier Ensembl Metazoa releases, the remaining 5 having partial annotation from the gene set provider) and added to 11 species that previously had no RNA gene annotation.
The chart below shows the numbers of RNA genes annotated, broken down by biotype, across all 37 species.
Ensembl Metazoa receives funding from Infravec2, to serve as a data delivery mechanism for that project. Infravec2 is an international and interdisciplinary research project on insect vectors of human and animal disease, including mosquitoes, sandflies and other flies. This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 731060.