Ensembl Compara Schema Documentation

Introduction

This document describes the tables that make up the Ensembl Compara schema. Tables are grouped into categories, and the purpose of each table is explained. Several examples are also given to help readers familiarise themselves with the schema. The overall diagram can be found here.

List of the tables:

Taxonomy and species-tree

ncbi_taxa_node
ncbi_taxa_name
species_tree_node
species_tree_root
species_tree_node_tag
species_tree_node_attr

Genomes

genome_db
dnafrag
dnafrag_alt_region
sequence
gene_member
seq_member
exon_boundaries
other_member_sequence

Synteny

synteny_region
dnafrag_region

Genomic alignments

genomic_align_block
genomic_align_tree
genomic_align

Conservation

conservation_score
constrained_element

Gene trees and homologies

peptide_align_feature
gene_align
gene_align_member
gene_tree_node
gene_tree_root
gene_tree_node_tag
gene_tree_root_tag
gene_tree_root_attr
gene_tree_node_attr
gene_tree_object_store
homology
homology_member

Extra annotations on members

gene_member_hom_stats
seq_member_projection_stable_id
seq_member_projection
external_db
member_xref
gene_member_qc

Protein families

family
family_member

Profile HMMs

hmm_profile
hmm_annot
hmm_curated_annot

Stable-ID mapping

mapping_session
stable_id_history

Gene gain/loss trees

CAFE_gene_family
CAFE_species_gene

Dataset description

These are general tables used in the Compara schema.

meta

Column	Type	Default value	Description	Index
meta_id	INT	-	Internal unique ID for the table	primary key
species_id	INT	1	Only used in core databases	unique key: species_key_value_idx key: species_value_idx
meta_key	VARCHAR(64)	-	Key for the key/value pair	unique key: species_key_value_idx
meta_value	TEXT	-	Value for the key/value pair	unique key: species_key_value_idx key: species_value_idx

meta_id	species_id	meta_key	meta_value
1	NULL	schema_version	114
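
The key/value layout means a setting is looked up by its meta_key. The following is a minimal sketch of such a lookup, assuming an in-memory SQLite database as a stand-in for the real server; the helper name get_meta_value is hypothetical, and the single row is the example row shown above.

```python
import sqlite3


def get_meta_value(conn, key):
    """Return the meta_value stored for meta_key, or None if the key is absent."""
    row = conn.execute(
        "SELECT meta_value FROM meta WHERE meta_key = ?", (key,)
    ).fetchone()
    return row[0] if row else None


# Assumption: SQLite stand-in with the same column names as the table above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE meta ("
    " meta_id INTEGER PRIMARY KEY,"
    " species_id INTEGER DEFAULT 1,"
    " meta_key VARCHAR(64) NOT NULL,"
    " meta_value TEXT NOT NULL)"
)
# Example row copied from the documentation.
conn.execute("INSERT INTO meta VALUES (1, NULL, 'schema_version', '114')")

schema_version = get_meta_value(conn, "schema_version")
print(schema_version)
```

Values come back as text because meta_value is a TEXT column, so numeric settings need an explicit conversion.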

species_set_header

Column	Type	Default value	Description	Index
species_set_id	INT	-	Internal unique ID for the table	primary key
name	varchar(255)	''	Name of the species set (e.g. "H.sap" for one species, "H.sap-M.mus" for two species and "amniotes" for larger sets)
size	INT	-	Number of species in the set
first_release	smallint	NULL	The first release this set was present in
last_release	smallint	NULL	The last release this set was present in, or NULL if it is still current

ssh.species_set_id	ssh.name	ssh.size	gdb.genome_db_id	gdb.name	gdb.assembly
84239	primates	10	150	homo_sapiens	GRCh38
84239	primates	10	153	chlorocebus_sabaeus	ChlSab1.1
84239	primates	10	199	nomascus_leucogenys	Nleu_3.0
84239	primates	10	206	microcebus_murinus	Mmur_3.0
84239	primates	10	209	gorilla_gorilla	gorGor4
84239	primates	10	210	pan_paniscus	panpan1.1
84239	primates	10	221	pan_troglodytes	Pan_tro_3.0
84239	primates	10	361	macaca_mulatta	Mmul_10
84239	primates	10	465	macaca_fascicularis	Macaca_fascicularis_6.0
84239	primates	10	476	pongo_abelii	Susie_PABv2
85643	primates	24	124	otolemur_garnettii	OtoGar3
85643	primates	24	150	homo_sapiens	GRCh38
85643	primates	24	153	chlorocebus_sabaeus	ChlSab1.1
85643	primates	24	197	cercocebus_atys	Caty_1.0
85643	primates	24	199	nomascus_leucogenys	Nleu_3.0
85643	primates	24	200	macaca_nemestrina	Mnem_1.0
85643	primates	24	201	mandrillus_leucophaeus	Mleu.le_1.0
85643	primates	24	203	propithecus_coquereli	Pcoq_1.0
85643	primates	24	204	rhinopithecus_bieti	ASM169854v1
85643	primates	24	205	saimiri_boliviensis_boliviensis	SaiBol1.0
85643	primates	24	206	microcebus_murinus	Mmur_3.0
85643	primates	24	209	gorilla_gorilla	gorGor4
85643	primates	24	210	pan_paniscus	panpan1.1
85643	primates	24	216	rhinopithecus_roxellana	Rrox_v1
85643	primates	24	217	carlito_syrichta	Tarsius_syrichta-2.0.1
85643	primates	24	219	aotus_nancymaae	Anan_2.0
85643	primates	24	221	pan_troglodytes	Pan_tro_3.0
85643	primates	24	302	prolemur_simus	Prosim_1.0
85643	primates	24	361	macaca_mulatta	Mmul_10
85643	primates	24	465	macaca_fascicularis	Macaca_fascicularis_6.0
85643	primates	24	472	callithrix_jacchus	mCalJac1.pat.X
85643	primates	24	474	papio_anubis	Panubis1.0
85643	primates	24	476	pongo_abelii	Susie_PABv2
85643	primates	24	486	cebus_imitator	Cebus_imitator-1.0
89648	primates	31	124	otolemur_garnettii	OtoGar3
89648	primates	31	150	homo_sapiens	GRCh38
89648	primates	31	153	chlorocebus_sabaeus	ChlSab1.1
89648	primates	31	197	cercocebus_atys	Caty_1.0
89648	primates	31	199	nomascus_leucogenys	Nleu_3.0
89648	primates	31	200	macaca_nemestrina	Mnem_1.0
89648	primates	31	201	mandrillus_leucophaeus	Mleu.le_1.0
89648	primates	31	203	propithecus_coquereli	Pcoq_1.0
89648	primates	31	204	rhinopithecus_bieti	ASM169854v1
89648	primates	31	205	saimiri_boliviensis_boliviensis	SaiBol1.0
89648	primates	31	206	microcebus_murinus	Mmur_3.0
89648	primates	31	208	colobus_angolensis_palliatus	Cang.pa_1.0
89648	primates	31	209	gorilla_gorilla	gorGor4
89648	primates	31	210	pan_paniscus	panpan1.1
89648	primates	31	216	rhinopithecus_roxellana	Rrox_v1
89648	primates	31	217	carlito_syrichta	Tarsius_syrichta-2.0.1

See also:

method_link_species_set
genome_db
species_set

species_set

Column	Type	Default value	Description	Index
species_set_id	INT	-	Internal ID for the table, foreign key to species_set_header	primary key
genome_db_id	INT	-	External reference to genome_db_id in the genome_db table	primary key

species_set_id	species
35674	choloepus_hoffmanni,homo_sapiens
35676	oryctolagus_cuniculus,homo_sapiens
35678	otolemur_garnettii,homo_sapiens
35680	dasypus_novemcinctus,homo_sapiens
35682	erinaceus_europaeus,homo_sapiens
35684	echinops_telfairi,homo_sapiens
35686	loxodonta_africana,homo_sapiens
35687	notamacropus_eugenii,homo_sapiens
35689	ochotona_princeps,homo_sapiens
35690	procavia_capensis,homo_sapiens
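
A listing like the one above can be produced by joining species_set to genome_db. The sketch below assumes an in-memory SQLite stand-in and a hypothetical helper named species_by_set; the two genome_db_id values (124 and 150) are taken from the species_set_header example, and names are sorted alphabetically rather than in the order shown above.

```python
import sqlite3
from collections import defaultdict

# Assumption: SQLite stand-in holding only the columns needed for the join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE genome_db (genome_db_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE species_set (
    species_set_id INTEGER NOT NULL,
    genome_db_id   INTEGER NOT NULL,
    PRIMARY KEY (species_set_id, genome_db_id)
);
INSERT INTO genome_db VALUES (124, 'otolemur_garnettii'), (150, 'homo_sapiens');
INSERT INTO species_set VALUES (35678, 124), (35678, 150);
""")


def species_by_set(conn):
    """Map each species_set_id to the sorted list of its genome names."""
    members = defaultdict(list)
    rows = conn.execute(
        "SELECT ss.species_set_id, gdb.name"
        " FROM species_set ss JOIN genome_db gdb USING (genome_db_id)"
        " ORDER BY ss.species_set_id, gdb.name"
    )
    for species_set_id, name in rows:
        members[species_set_id].append(name)
    return dict(members)


sets = species_by_set(conn)
print(sets)
```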

species_set_tag

Column	Type	Default value	Description	Index
species_set_id	INT	-	External reference to species_set_id in the species_set table	primary key
tag	varchar(50)	-	Tag name	primary key key: tag
value	mediumtext	-	Tag value

method_link

Column	Type	Default value	Description	Index
method_link_id	INT	-	Internal unique ID	primary key
type	varchar(50)	''	The code used to refer to this linking method	unique key: type
class	varchar(50)	''	Description of the type of data associated with the "type" field. Used to match similar types
display_name	varchar(255)	''	Plain English description of this method

method_link_id	type	class	display_name
10	PECAN	GenomicAlignBlock.multiple_alignment	Mercator-Pecan
11	GERP_CONSTRAINED_ELEMENT	ConstrainedElement.constrained_element	GERP Constrained Elements
13	EPO	GenomicAlignTree.ancestral_alignment	EPO
14	EPO_EXTENDED	GenomicAlignTree.tree_alignment	EPO-Extended
16	LASTZ_NET	GenomicAlignBlock.pairwise_alignment	LastZ
19	LASTZ_PATCH	GenomicAlignBlock.pairwise_alignment	LastZ-patch
22	CACTUS_HAL	GenomicAlignBlock.multiple_alignment	Cactus
26	CACTUS_DB	GenomicAlignBlock.multiple_alignment	Cactus
101	SYNTENY	SyntenyRegion.synteny	synteny
201	ENSEMBL_ORTHOLOGUES	Homology.homology	orthologues
202	ENSEMBL_PARALOGUES	Homology.homology	paralogues
205	ENSEMBL_PROJECTIONS	Homology.homology	patch projections
401	PROTEIN_TREES	ProteinTree.protein_tree_node	protein-trees
402	NC_TREES	NCTree.nc_tree_node	ncRNA-trees
501	GERP_CONSERVATION_SCORE	ConservationScore.conservation_score	GERP Conservation Scores
600	SPECIES_TREE	SpeciesTree.species_tree_root	species-tree

method_link_species_set

Column	Type	Default value	Description	Index
method_link_species_set_id	INT	-	Internal unique ID	primary key
method_link_id	INT	-	External reference to method_link_id in the method_link table	unique key: method_link_id
species_set_id	INT	-	External reference to species_set_id in the species_set table	unique key: method_link_id
name	varchar(255)	''	Human-readable description for this method_link_species_set
source	varchar(255)	'ensembl'	Source of the data. Currently either "ensembl" or "ucsc" if data were imported from UCSC
url	varchar(255)	''	A URL where you can find the original data if they were imported
first_release	smallint	NULL	The first release this analysis was present in
last_release	smallint	NULL	The last release this analysis was present in, or NULL if it is still current

method_link_species_set_id	method_link_id	species_set_id	name	source	first_release	last_release
2006	13	84239	10 primates EPO	ensembl	106	NULL
2038	13	85062	17 sauropsids EPO	ensembl	107	NULL
7045	13	86963	32 fish EPO	ensembl	112	NULL
9580	13	89675	44 eutherian mammals EPO	ensembl	114	NULL
9588	13	89679	21 murinae EPO	ensembl	114	NULL
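
The first_release and last_release columns make it possible to ask which analyses were live in a given release. The sketch below joins method_link_species_set to method_link on an in-memory SQLite stand-in; the helper name current_analyses is hypothetical, the rows are a subset of the examples above, and "live" is read as first_release <= release with last_release either NULL or >= release.

```python
import sqlite3

# Assumption: SQLite stand-in with a subset of the documented columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE method_link (method_link_id INTEGER PRIMARY KEY, type TEXT UNIQUE);
CREATE TABLE method_link_species_set (
    method_link_species_set_id INTEGER PRIMARY KEY,
    method_link_id INTEGER,
    species_set_id INTEGER,
    name TEXT,
    first_release INTEGER,
    last_release INTEGER
);
INSERT INTO method_link VALUES (13, 'EPO'), (16, 'LASTZ_NET');
INSERT INTO method_link_species_set VALUES
    (2006, 13, 84239, '10 primates EPO', 106, NULL),
    (2038, 13, 85062, '17 sauropsids EPO', 107, NULL),
    (9580, 13, 89675, '44 eutherian mammals EPO', 114, NULL);
""")


def current_analyses(conn, method_type, release):
    """Names of the analyses of one method type that are live in `release`."""
    rows = conn.execute(
        "SELECT mlss.name FROM method_link_species_set mlss"
        " JOIN method_link ml USING (method_link_id)"
        " WHERE ml.type = ?"
        "   AND mlss.first_release <= ?"
        "   AND (mlss.last_release IS NULL OR mlss.last_release >= ?)"
        " ORDER BY mlss.method_link_species_set_id",
        (method_type, release, release),
    )
    return [name for (name,) in rows]


epo_110 = current_analyses(conn, "EPO", 110)
epo_114 = current_analyses(conn, "EPO", 114)
print(epo_110)
print(epo_114)
```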

method_link_species_set_tag

Column	Type	Default value	Description	Index
method_link_species_set_id	INT	-	External reference to method_link_species_set_id in the method_link_species_set table	primary key
tag	varchar(50)	-	Tag name	primary key key: tag
value	mediumtext	-	Tag value

method_link_species_set_attr

Column	Type	Default value	Description	Index
method_link_species_set_id	INT	-	internal unique ID for the orthologs	primary key
n_goc_0	int	NULL	the number of orthologs with no gene order conservation among their neighbours
n_goc_25	int	NULL	the number of orthologs with 25% gene order conservation among their neighbours
n_goc_50	int	NULL	the number of orthologs with 50% gene order conservation among their neighbours
n_goc_75	int	NULL	the number of orthologs with 75% gene order conservation among their neighbours
n_goc_100	int	NULL	the number of orthologs with 100% gene order conservation among their neighbours
perc_orth_above_goc_thresh	float	NULL	the percentage of orthologs above the goc threshold
goc_quality_threshold	int	NULL	the chosen threshold for "high quality" orthologs based on gene order conservation
wga_quality_threshold	int	NULL	the chosen threshold for "high quality" orthologs based on the whole genome alignments coverage of homologous pairs
perc_orth_above_wga_thresh	float	NULL	the percentage of orthologs above the wga threshold
threshold_on_ds	int	NULL	the threshold_on_ds
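
The relationship between the n_goc_* counts and perc_orth_above_goc_thresh can be illustrated with a small calculation. This is only a sketch of one plausible reading: it assumes "above the threshold" means a score greater than or equal to goc_quality_threshold, the function name is hypothetical, and the counts are invented for illustration rather than taken from a real database.

```python
def perc_above_goc_threshold(counts, threshold):
    """Percentage of orthologs whose GOC score is >= threshold.

    `counts` maps a GOC score (0, 25, 50, 75, 100) to a number of orthologs,
    mirroring the n_goc_0 .. n_goc_100 columns. Returns None when there are
    no orthologs at all.
    """
    total = sum(counts.values())
    if total == 0:
        return None
    above = sum(n for score, n in counts.items() if score >= threshold)
    return 100.0 * above / total


# Invented counts, for illustration only.
counts = {0: 100, 25: 50, 50: 50, 75: 100, 100: 200}
perc = perc_above_goc_threshold(counts, 50)
print(perc)
```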

ncbi_taxa_node

Column	Type	Default value	Description	Index
taxon_id	int(10)	-	The NCBI Taxonomy ID	primary key
parent_id	int(10)	-	The parent taxonomy ID for this node (refers to ncbi_taxa_node.taxon_id)	key: parent_id
rank	char(32)	''	E.g. kingdom, family, genus, etc.	key: rank
genbank_hidden_flag	tinyint(1)	0	Boolean value which defines whether this rank is used or not in the abbreviated lineage
left_index	int(10)	0	Sub-set left index. All sub-nodes have left_index and right_index values larger than this left_index	key: left_index
right_index	int(10)	0	Sub-set right index. All sub-nodes have left_index and right_index values smaller than this right_index	key: right_index
root_id	int(10)	1	The root taxonomy ID for this node (refers to ncbi_taxa_node.taxon_id)

n2.taxon_id	n2.parent_id	na.name	n2.rank	n2.left_index	n2.right_index
1	0	root	no rank	1	5214706
131567	1	cellular organisms	no rank	506818	5173789
2759	131567	Eukaryota	superkingdom	1673509	5173788
33154	2759	Opisthokonta	clade	2233372	5073471
33208	33154	Metazoa	kingdom	2637743	5072726
6072	33208	Eumetazoa	clade	2651232	5072705
33213	6072	Bilateria	clade	2683523	5072704
33511	33213	Deuterostomia	clade	4806998	5071231
7711	33511	Chordata	phylum	4821213	5070772
89593	7711	Craniata	subphylum	4823698	5070729
7742	89593	Vertebrata	clade	4823699	5070726
7776	7742	Gnathostomata	clade	4823700	5069501
117570	7776	Teleostomi	clade	4828499	5069500
117571	117570	Euteleostomi	clade	4828500	5069499
8287	117571	Sarcopterygii	superclass	4944869	5069498
1338369	8287	Dipnotetrapodomorpha	clade	4944882	5069497
32523	1338369	Tetrapoda	clade	4944945	5069496
32524	32523	Amniota	clade	4974616	5069495
40674	32524	Mammalia	class	5041125	5069494
32525	40674	Theria	clade	5041152	5069471
9347	32525	Eutheria	clade	5042621	5069470
1437010	9347	Boreoeutheria	clade	5043362	5069445
314146	1437010	Euarchontoglires	superorder	5055459	5069440
9443	314146	Primates	order	5055536	5057795
376913	9443	Haplorrhini	suborder	5056095	5057794
314293	376913	Simiiformes	infraorder	5056096	5057751
9526	314293	Catarrhini	parvorder	5056881	5057746
314295	9526	Hominoidea	superfamily	5057554	5057745
9604	314295	Hominidae	family	5057661	5057744
207598	9604	Homininae	subfamily	5057662	5057713
9605	207598	Homo	genus	5057695	5057712
9606	9605	Homo sapiens	species	5057696	5057701
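
The left_index and right_index columns form a nested-set encoding: a node is below another when its interval lies strictly inside the other's interval. The sketch below applies that rule to a subset of the example rows above; the function names are hypothetical.

```python
# (taxon_id, name, left_index, right_index), copied from the example rows.
NODES = [
    (1, "root", 1, 5214706),
    (40674, "Mammalia", 5041125, 5069494),
    (9443, "Primates", 5055536, 5057795),
    (9604, "Hominidae", 5057661, 5057744),
    (9605, "Homo", 5057695, 5057712),
    (9606, "Homo sapiens", 5057696, 5057701),
]


def _interval(nodes, taxon_id):
    """Return (left_index, right_index) of the node with this taxon_id."""
    for node_id, _, left, right in nodes:
        if node_id == taxon_id:
            return left, right
    raise KeyError(taxon_id)


def descendants(nodes, taxon_id):
    """Names of the nodes whose interval lies strictly inside this node's."""
    left, right = _interval(nodes, taxon_id)
    return [name for _, name, l, r in nodes if left < l and r < right]


def ancestors(nodes, taxon_id):
    """Names of the nodes whose interval strictly contains this node's."""
    left, right = _interval(nodes, taxon_id)
    return [name for _, name, l, r in nodes if l < left and right < r]


print(descendants(NODES, 9443))
print(ancestors(NODES, 9605))
```

In SQL the same test becomes a range condition on left_index and right_index, which is what the keys on those columns are useful for.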

root_id	method_link_species_set_id	label
200600000	2006	default
203800000	2038	default
203800033	2039	default
205000000	2050	default
200700077	7045	default
200700140	7046	default
873400000	8734	default
875100000	8751	default
204100254	9586	default
958300009	9584	default
958000000	9580	default
958000087	9581	default
958800000	9588	default
4017800000	40178	default
4018000000	40180	default
4017600342	40176	cafe
4017600000	40176	default
4017700000	40177	default
4018100000	40181	default
4017900566	40179	cafe
4017900000	40179	default
4017900224	40179	full_species_tree
4018100027	60031	Ensembl
4018100592	60031	NCBI Taxonomy
958900000	9589	default

genome_db_id	taxon_id	name	assembly	genebuild	has_karyotype	is_good_for_alignment	genome_component	strain_name	display_name	locator	first_release	last_release
481	9031	gallus_gallus	bGalGal1.mat.broiler.GRCg7b	2021-09-Ensembl	1	1	NULL	reference breed	Chicken	NULL	107	NULL
150	9606	homo_sapiens	GRCh38	2014-01-Ensembl	1	1	NULL	NULL	Human	NULL	76	NULL

dnafrag

Column	Type	Default value	Description	Index
dnafrag_id	bigint	-	Internal unique ID	primary key
length	int	0	The total length of the dnafrag
name	varchar(255)	''	Name of the DNA sequence (e.g., the name of the chromosome)	unique key: name
genome_db_id	INT	-	External reference to genome_db_id in the genome_db table	unique key: name
coord_system_name	varchar(40)	''	Refers to the coord system in which this dnafrag has been defined
is_reference	tinyint(1)	1	Boolean, whether the dnafrag is reference (1) or non-reference (0), e.g. a haplotype
cellular_component	ENUM: NUC MT PT OTHER	'NUC'	Either "NUC", "MT", "PT" or "OTHER". Represents which organelle genome the dnafrag is part of
codon_table_id	TINYINT	1	Integer. The numeric identifier of the codon-table that applies to this dnafrag (https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi)

ncbi_taxa_name

Column	Type	Default value	Description	Index
taxon_id	int(10)	-	External reference to taxon_id in ncbi_taxa_node	key: taxon_id
name	varchar(500)	-	Information assigned to this taxon_id	key: name
name_class	varchar(50)	-	Type of information, e.g. common name, genbank_synonym, scientific name, etc.	key: name_class

species_tree_node

Column	Type	Default value	Description	Index
node_id	bigint	-	Internal unique ID	primary key
parent_id	bigint	NULL	Link to the parent node	key: parent_id
root_id	bigint	NULL	Link to the root node	key: root_id
left_index	INT	0	Internal index	key: root_id
right_index	INT	0	Internal index
distance_to_parent	double	'1'	Phylogenetic distance between this node and its parent
taxon_id	INT	NULL	Link to NCBI taxon node
genome_db_id	INT	NULL	Link to the genome_db
node_name	VARCHAR(255)	NULL	A name that can be set to the taxon name or any other arbitrary name

genome_db

Column	Type	Default value	Description	Index
genome_db_id	INT	-	Internal unique ID for this table	primary key
taxon_id	INT	NULL	External reference to taxon_id in the ncbi_taxa_node table
name	varchar(128)	''	Species name	unique key: name
assembly	varchar(100)	''	Assembly version of the genome	unique key: name
genebuild	varchar(255)	''	Version of the genebuild
has_karyotype	tinyint(1)	0	Whether the genome has a karyotype
is_good_for_alignment	TINYINT(1)	0	Whether the genome is good enough to be used in multiple alignments
genome_component	varchar(5)	NULL	Only used for polyploid genomes: the name of the genome component	unique key: name
strain_name	varchar(100)	NULL	Name of the particular strain this GenomeDB refers to
display_name	varchar(255)	NULL	Name used for display purposes. Imported from the core databases
locator	varchar(400)	NULL	Used for production purposes or for user configuration in in-house installation.
first_release	smallint	NULL	The first release this genome was present in
last_release	smallint	NULL	The last release this genome was present in, or NULL if it is still current

dnafrag_alt_region

Column	Type	Default value	Description	Index
dnafrag_id	BIGINT	-	External reference to dnafrag_id in the dnafrag table	primary key
dnafrag_start	INT	-	Position of the first nucleotide from this dnafrag that differs from the reference dnafrag
dnafrag_end	INT	-	Position of the last nucleotide from this dnafrag that differs from the reference dnafrag

dnafrag_id	name	length	dnafrag_start	dnafrag_end
30679870	HSCHR20_1_CTG1	128386	1	128386
30679871	HSCHR20_1_CTG2	118774	1	118774
30679872	HSCHR20_1_CTG3	183433	1	183433
30679873	HSCHR20_1_CTG4	58661	1	58661

sequence

Column	Type	Default value	Description	Index
sequence_id	INT	-	Internal unique ID	primary key
length	INT	-	Length of the sequence
sequence	longtext	-	The actual sequence
md5sum	CHAR(32)	-	md5sum	key: md5sum

gene_member

Column	Type	Default value	Description	Index
gene_member_id	INT	-	Internal unique ID	primary key
stable_id	varchar(128)	-	EnsEMBL stable ID	unique key: genome_db_stable_id key: stable_id
version	INT	0	Version of the stable ID (see EnsEMBL core DB)
source_name	ENUM: ENSEMBLGENE EXTERNALGENE	-	The source of the member	key: source_name
taxon_id	INT	-	External reference to taxon_id in the ncbi_taxa_node table
genome_db_id	INT	NULL	External reference to genome_db_id in the genome_db table	unique key: genome_db_stable_id key: genome_db_id_biotype
biotype_group	ENUM: coding pseudogene snoncoding lnoncoding mnoncoding LRG undefined no_group current_notdumped notcurrent	'coding'	Biotype of this gene.	key: biotype_dnafrag_id_start_end key: genome_db_id_biotype
canonical_member_id	INT	NULL	External reference to seq_member_id in the seq_member table to allow linkage from a gene to its canonical peptide	key: canonical_member_id
description	text	NULL	The description of the gene/protein as described in the core database or from the Uniprot entry
dnafrag_id	bigint	NULL	External reference to dnafrag_id in the dnafrag table. It shows the dnafrag the member is on.	key: dnafrag_id_start key: dnafrag_id_end key: biotype_dnafrag_id_start_end
dnafrag_start	INT	NULL	Starting position within the dnafrag defined by dnafrag_id	key: dnafrag_id_start key: biotype_dnafrag_id_start_end
dnafrag_end	INT	NULL	Ending position within the dnafrag defined by dnafrag_id	key: dnafrag_id_end key: biotype_dnafrag_id_start_end
dnafrag_strand	TINYINT	NULL	Strand in the dnafrag defined by dnafrag_id
display_label	varchar(128)	NULL	Display name (imported from the core database)

seq_member

Column	Type	Default value	Description	Index
seq_member_id	INT	-	Internal unique ID	primary key key: seq_member_gene_member_id_end
stable_id	varchar(128)	-	EnsEMBL stable ID or external ID (for Uniprot/SWISSPROT and Uniprot/SPTREMBL)	unique key: genome_db_stable_id key: stable_id
version	INT	0	Version of the stable ID (see EnsEMBL core DB)
source_name	ENUM: ENSEMBLPEP ENSEMBLTRANS Uniprot/SPTREMBL Uniprot/SWISSPROT EXTERNALPEP EXTERNALTRANS EXTERNALCDS	-	The source of the member	key: source_name
taxon_id	INT	-	External reference to taxon_id in the ncbi_taxa_node table
genome_db_id	INT	NULL	External reference to genome_db_id in the genome_db table	unique key: genome_db_stable_id
sequence_id	INT	NULL	External reference to sequence_id in the sequence table. May be 0 when the sequence is not available in the sequence table, e.g. for a gene instance	key: sequence_id
gene_member_id	INT	NULL	External reference to gene_member_id in the gene_member table to allow linkage from peptides and transcripts to genes	key: gene_member_id key: seq_member_gene_member_id_end
has_transcript_edits	tinyint(1)	0	Boolean. Whether there are SeqEdits that modify the transcript sequence. When this happens, the (exon) coordinates don't match the transcript sequence
has_translation_edits	tinyint(1)	0	Boolean. Whether there are SeqEdits that modify the protein sequence. When this happens, the protein sequence doesn't match the transcript sequence
description	text	NULL	The description of the gene/protein as described in the core database or from the Uniprot entry
dnafrag_id	bigint	NULL	External reference to dnafrag_id in the dnafrag table. It shows the dnafrag the member is on.	key: dnafrag_id_start key: dnafrag_id_end
dnafrag_start	INT	NULL	Starting position within the dnafrag defined by dnafrag_id	key: dnafrag_id_start
dnafrag_end	INT	NULL	Ending position within the dnafrag defined by dnafrag_id	key: dnafrag_id_end
dnafrag_strand	TINYINT	NULL	Strand in the dnafrag defined by dnafrag_id
display_label	varchar(128)	NULL	Display name (imported from the core database)

exon_boundaries

Column	Type	Default value	Description	Index
gene_member_id	INT	-	External reference to gene_member_id in the gene_member table to allow querying all the exons of all the translations of a gene	index: gene_member_id
seq_member_id	INT	-	External reference to seq_member_id in the seq_member table to indicate which translation the exons refer to	index: seq_member_id
dnafrag_start	INT	-	Starting position within the dnafrag defined by the dnafrag_id of the seq_member
dnafrag_end	INT	-	Ending position within the dnafrag defined by the dnafrag_id of the seq_member
sequence_length	INT	-	Length of the chunk of the sequence that corresponds to this exon
left_over	TINYINT	0	Phase information (0, 1 or 2) used to produce the "exon_bounded" sequence

synteny_region

Column	Type	Default value	Description	Index
synteny_region_id	INT	-	Internal unique ID	primary key
method_link_species_set_id	INT	-	External reference to method_link_species_set_id in the method_link_species_set table	key: method_link_species_set_id

dnafrag_region

Column	Type	Description	Index
synteny_region_id	INT	External reference to synteny_region_id in the synteny_region table	key: synteny key: synteny_reversed
dnafrag_id	bigint	External reference to dnafrag_id in the dnafrag table	key: synteny key: synteny_reversed
dnafrag_start	INT	Position of the first nucleotide from this dnafrag which is in synteny
dnafrag_end	INT	Position of the last nucleotide from this dnafrag which is in synteny
dnafrag_strand	TINYINT	Strand of this region

species_name	dnafrag.name	dnafrag_start	dnafrag_end	dnafrag_strand
oryctolagus_cuniculus	16	10692525	19644506	1
homo_sapiens	10	6409596	14809562	1

genomic_align_block

Column	Type	Default value	Description	Index
genomic_align_block_id	bigint	-	Internal unique ID	primary key
method_link_species_set_id	INT	0	External reference to method_link_species_set_id in the method_link_species_set table	key: method_link_species_set_id
score	double	NULL	Score returned by the homology search program
perc_id	TINYINT	NULL	Used for pairwise comparison. Defines the percentage of identity between both sequences
length	INT	-	Total length of the alignment
group_id	bigint	NULL	Used to group alignments
level_id	TINYINT	0	Level of orthologous layer. 1 corresponds to the principal layer of orthologous sequences found (the largest), 2 and over are additional layers. Used for building the syntenies (based on level_id = 1 only). Note that level_ids are not computed on whole chromosomes but rather on chunks. This means that level_ids can be inconsistent within an alignment-net.

genomic_align_block_id	method_link_species_set_id	score	perc_id	length	group_id	level_id	direction
12850000000001	1285	8094	NULL	185	12850031204617	1	1
12850000000003	1285	20570	NULL	218	12850029126909	1	1
12850000000018	1285	20570	NULL	295	12850029126909	1	1
12850000000069	1285	10359	NULL	196	12850029126988	1	1

node_id	parent_id	root_id	left_index	right_index	left_node_id	right_node_id	distance_to_parent
20060000000001	NULL	20060000000001	1	10	20060000227302	20060000200466	0.01846908
20060000000002	20060000000001	20060000000001	2	3	20060000227303	20060000200470	0.0117108
20060000000003	20060000000001	20060000000001	4	9	NULL	20060000200467	0.0090058
20060000000004	20060000000003	20060000000001	7	8	20060000227304	20060000200469	0.00254698
20060000000005	20060000000003	20060000000001	5	6	20060000566821	20060000200468	0.00312302
20060000000009	NULL	20060000000009	1	10	20060000188436	20060000085548	0.01846908
20060000000010	20060000000009	20060000000009	2	3	20060000188437	20060000085552	0.0117108
20060000000011	20060000000009	20060000000009	4	9	20060000188438	20060000085549	0.0090058
20060000000012	20060000000011	20060000000009	7	8	20060000188439	20060000085551	0.00254698
20060000000013	20060000000011	20060000000009	5	6	20060000188440	20060000085550	0.00312302
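
The parent_id, left_index and right_index columns describe the same tree in two ways. The sketch below rebuilds the child lists from parent_id and finds the leaves from the indexes, using the first tree in the example rows above. It assumes the usual nested-set property that a leaf has right_index = left_index + 1, which holds in these sample rows; the function names are hypothetical.

```python
# (node_id, parent_id, left_index, right_index), copied from the example rows.
ROWS = [
    (20060000000001, None, 1, 10),
    (20060000000002, 20060000000001, 2, 3),
    (20060000000003, 20060000000001, 4, 9),
    (20060000000004, 20060000000003, 7, 8),
    (20060000000005, 20060000000003, 5, 6),
]


def children_of(rows):
    """Map every node_id to the list of its direct children (via parent_id)."""
    children = {}
    for node_id, parent_id, _, _ in rows:
        children.setdefault(node_id, [])
        if parent_id is not None:
            children.setdefault(parent_id, []).append(node_id)
    return children


def leaves(rows):
    """Nodes with no room between left_index and right_index for a sub-node."""
    return [node_id for node_id, _, left, right in rows if right == left + 1]


children = children_of(ROWS)
leaf_ids = leaves(ROWS)
print(children[20060000000001])
print(leaf_ids)
```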

genomic_align

Column	Type	Default value	Description	Index
genomic_align_id	bigint	-	Unique internal ID	primary key
genomic_align_block_id	bigint	-	External reference to genomic_align_block_id in the genomic_align_block table	key: genomic_align_block_id
method_link_species_set_id	INT	0	External reference to method_link_species_set_id in the method_link_species_set table. This information is redundant because it also appears in the genomic_align_block table but it is used to speed up the queries	key: method_link_species_set_id key: dnafrag
dnafrag_id	bigint	0	External reference to dnafrag_id in the dnafrag table	key: dnafrag
dnafrag_start	INT	0	Starting position within the dnafrag defined by dnafrag_id	key: dnafrag
dnafrag_end	INT	0	Ending position within the dnafrag defined by dnafrag_id	key: dnafrag
dnafrag_strand	TINYINT	0	Strand in the dnafrag defined by dnafrag_id
cigar_line	mediumtext	-	Internal description of the aligned sequence
visible	TINYINT	1	Used in self alignments to ensure only one Bio::EnsEMBL::Compara::GenomicAlignBlock is visible when you have more than 1 block covering the same region
node_id	bigint	NULL	External reference to node_id in the genomic_align_tree table	key: node_id

genomic_align_id	genomic_align_block_id	method_link_species_set_id	dnafrag_id	dnafrag_start	dnafrag_end	dnafrag_strand	cigar_line	visible	node_id
12850000000028	12850000000001	1285	16642790	5660	5837	1	129M7D49M	1	NULL
12850000000035	12850000000001	1285	17458342	11305494	11305675	-1	146M2D28MD8M	1	NULL
12850000000001	12850000000003	1285	16642780	885	1102	1	218M	1	NULL
12850000000013	12850000000003	1285	17458332	25290534	25290751	1	218M	1	NULL
12850000000061	12850000000018	1285	16642780	1183	1432	1	13MD11M41D156M3D70M	1	NULL
12850000000099	12850000000018	1285	17458332	25291008	25291302	1	295M	1	NULL
12850000000177	12850000000069	1285	16642780	11774	11969	1	196M	1	NULL
12850000000188	12850000000069	1285	17458332	25296080	25296270	1	168M3D7M2D16M	1	NULL

genome_db.name	dnafrag.name	dnafrag_start	dnafrag_end	dnafrag_strand	cigar_line
danio_rerio	KN150510.1	5660	5837	1	129M7D49M
oryzias_latipes	24	11305494	11305675	-1	146M2D28MD8M
danio_rerio	KN150174.1	885	1102	1	218M
oryzias_latipes	14	25290534	25290751	1	218M
danio_rerio	KN150174.1	1183	1432	1	13MD11M41D156M3D70M
oryzias_latipes	14	25291008	25291302	1	295M
danio_rerio	KN150174.1	11774	11969	1	196M
oryzias_latipes	14	25296080	25296270	1	168M3D7M2D16M
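
The cigar_line values above can be checked against the coordinates. In these sample rows the M counts add up to dnafrag_end - dnafrag_start + 1, and the M and D counts together add up to the length of the genomic_align_block. The sketch below is inferred from those rows only: it assumes M is an aligned base, D is a gap column in this sequence, and a missing count means 1. Other operation codes are rejected rather than guessed at, and the function name is hypothetical.

```python
import re

# One operation: an optional count followed by a single letter.
CIGAR_OP = re.compile(r"(\d*)([A-Z])")


def cigar_lengths(cigar_line):
    """Return (aligned_length, sequence_length) for a cigar_line.

    aligned_length counts every alignment column (M and D);
    sequence_length counts only the bases of this sequence (M).
    """
    aligned = sequence = 0
    for count, op in CIGAR_OP.findall(cigar_line):
        n = int(count) if count else 1  # assumption: "D" alone means "1D"
        if op == "M":
            aligned += n
            sequence += n
        elif op == "D":
            aligned += n
        else:
            raise ValueError(f"unsupported operation {op!r}")
    return aligned, sequence


# danio_rerio KN150510.1:5660-5837 in block 12850000000001 (length 185).
print(cigar_lengths("129M7D49M"))
# oryzias_latipes 24:11305494-11305675 in the same block.
print(cigar_lengths("146M2D28MD8M"))
```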

constrained_element

Column	Type	Default value	Description	Index
constrained_element_id	bigint	-	Internal (but unique) ID	key: constrained_element_id_idx
dnafrag_id	bigint	-	External reference to dnafrag_id in the dnafrag table	key: mlssid_dfid_dfstart_dfend_idx
dnafrag_start	INT	-	Start of the constrained element	key: mlssid_dfid_dfstart_dfend_idx
dnafrag_end	INT	-	End of the constrained element	key: mlssid_dfid_dfstart_dfend_idx
dnafrag_strand	TINYINT	-	Strand of the constrained element
method_link_species_set_id	INT	-	External reference to method_link_species_set_id in the method_link_species_set table	key: mlssid_dfid_dfstart_dfend_idx
p_value	double	0	p-value derived from Gerp
score	double	0	Score derived from Gerp

constrained_element_id	dnafrag_id	dnafrag_start	dnafrag_end	dnafrag_strand	method_link_species_set_id	p_value	score
20400000000001	21435779	107336217	107336647	1	2040	2.98057e-105	195.1
20400000000001	21913392	17539835	17540268	-1	2040	2.98057e-105	195.1
20400000000001	25898712	21765526	21765956	-1	2040	2.98057e-105	195.1
20400000000001	26246146	96193877	96194313	-1	2040	2.98057e-105	195.1
20400000000001	29443013	96598222	96598652	-1	2040	2.98057e-105	195.1
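
In the example rows above, a single constrained_element_id appears once per dnafrag, so the rows of one element have to be grouped to see all of its regions. The sketch below does this for the sample rows and computes each region's length, assuming 1-based inclusive coordinates; the function name is hypothetical.

```python
from collections import defaultdict

# (constrained_element_id, dnafrag_id, dnafrag_start, dnafrag_end, dnafrag_strand),
# copied from the example rows.
ROWS = [
    (20400000000001, 21435779, 107336217, 107336647, 1),
    (20400000000001, 21913392, 17539835, 17540268, -1),
    (20400000000001, 25898712, 21765526, 21765956, -1),
    (20400000000001, 26246146, 96193877, 96194313, -1),
    (20400000000001, 29443013, 96598222, 96598652, -1),
]


def regions_by_element(rows):
    """Map each constrained_element_id to a list of (dnafrag_id, length)."""
    grouped = defaultdict(list)
    for element_id, dnafrag_id, start, end, _strand in rows:
        # Assumption: coordinates are 1-based and inclusive.
        grouped[element_id].append((dnafrag_id, end - start + 1))
    return dict(grouped)


regions = regions_by_element(ROWS)
print(regions[20400000000001])
```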