Debugging Ensembl API Connections

For most users, installing Ensembl is a painless exercise. However, computer setups vary and issues sometimes arise. This guide helps you debug a connection problem and identify its root cause. In summary, most issues are caused by:

  • Missing packages required by Ensembl, e.g. BioPerl, DBI and DBD::mysql
  • Incorrect API usage
  • Blocked access to port 3306 due to local firewall restrictions
  • Scheduled Ensembl database maintenance

The following steps will not solve every issue, but they will help you diagnose the causes listed above. This guide assumes familiarity with the *nix command line (such as bash).

1). Use ensembl/misc-scripts/ping_ensembl.pl

ping_ensembl.pl is our first port of call when debugging Ensembl connections and is distributed with the core API. It attempts to connect to one of the public MySQL databases (of your choosing) and tries to retrieve a core DBAdaptor, for human by default. The script will also try to diagnose common mistakes in setting up Ensembl. Run the command line most appropriate to your situation:

# To ping our UK server use
> perl ensembl/misc-scripts/ping_ensembl.pl

# To ping our USEast server use
> perl ensembl/misc-scripts/ping_ensembl.pl -ue

# To ping Ensembl Genomes use
> perl ensembl/misc-scripts/ping_ensembl.pl -eg

# To ping our UK server but with a different species
> perl ensembl/misc-scripts/ping_ensembl.pl -species pig

Any issues with missing Perl library dependencies will be flagged here. Common missing modules include:

  • DBI
  • DBD::mysql

Both of these libraries must be installed using your Perl distribution's installer. If you have built your own Perl then use CPAN/cpanminus. Linux users running the system Perl can use their distribution's package manager, such as yum (Fedora, RHEL, CentOS) or apt (Ubuntu). Other common issues include installing our unreleased API and not adding the Ensembl libraries to your PERL5LIB. In both cases ping_ensembl.pl will advise on the best solution.
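
A quick way to confirm both modules before re-running ping_ensembl.pl is to ask Perl to load each one. The following is a minimal sketch and is not part of the Ensembl distribution: the function name check_perl_module is made up for illustration, and it assumes the perl you intend to use with Ensembl is the first one on your PATH.

```shell
# check_perl_module is a hypothetical helper, not part of Ensembl.
# It asks perl to load one module and then do nothing (-e 1).
check_perl_module() {
  local module="$1"
  if perl -M"$module" -e 1 2>/dev/null; then
    echo "$module: installed"
  else
    echo "$module: MISSING"
    return 1
  fi
}

# Check the two modules most often reported missing.
# "|| true" keeps the loop going after a missing module.
for module in DBI DBD::mysql; do
  check_perl_module "$module" || true
done
```

A MISSING line only tells you that this particular perl cannot load the module; if you have more than one Perl installed, check that you are testing the right one.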

2). Are you Unable to Access Comparative, Variation or Regulation Data?

Ensembl is split into multiple code projects, each responsible for a particular type of data. This roughly equates to:

  • ensembl = genes, transcripts, translations, assembly, sequence
  • ensembl-variation = SNVs, CNVs, somatic variations, phenotypes
  • ensembl-compara = gene trees, homologies, multiple and pairwise genomic alignments
  • ensembl-funcgen = regulation, motifs, array probes

You cannot access a set of data unless you have the correct libraries on your PERL5LIB. In the previous step, ping_ensembl.pl will have warned you about any Ensembl libraries missing from your PERL5LIB. If you require this data then install the relevant library and add it to your PERL5LIB.
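
Adding a library to PERL5LIB means adding its modules directory. The sketch below assumes that the four projects were checked out side by side under a single directory (here $HOME/src, which is only an example location) and that each keeps its Perl code in a modules subdirectory. The helper name ensembl_perl5lib and the ENSEMBL_SRC variable are made up for illustration; adjust the paths to match your own installation.

```shell
# ensembl_perl5lib is a hypothetical helper: it prints a colon-separated
# list of the modules directories for the four projects listed above.
ensembl_perl5lib() {
  local base="$1" path="" project
  for project in ensembl ensembl-variation ensembl-compara ensembl-funcgen; do
    path="${path:+$path:}$base/$project/modules"
  done
  echo "$path"
}

# Assumed checkout location; change it to wherever you installed the API.
ENSEMBL_SRC="${ENSEMBL_SRC:-$HOME/src}"

# Put the Ensembl libraries in front of anything already on PERL5LIB.
export PERL5LIB="$(ensembl_perl5lib "$ENSEMBL_SRC")${PERL5LIB:+:$PERL5LIB}"
```

If you only need some of the data types, trim the project list accordingly. BioPerl needs to be visible to Perl as well; how you add it depends on how it was installed.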

3). Are you loading databases from the Registry?

The Registry has an option to print more information about what it is loading. Try the following code and check whether your species or data set has been loaded:

use Bio::EnsEMBL::Registry;

# -VERBOSE => 1 reports each database as the Registry loads it
Bio::EnsEMBL::Registry->load_registry_from_db(
  -HOST => 'ensembldb.ensembl.org',
  -USER => 'anonymous',
  -VERBOSE => 1
);

Scan the output and try to find your required species.

4). Try Connecting to Ensembl Using a MySQL Client

If you can connect to our MySQL server using a client then the issue is most likely in your Perl or Ensembl setup.

> mysql --host=ensembldb.ensembl.org --port=3306 --user=anonymous

  Welcome to the MySQL monitor.  Commands end with ; or \g.
  Your MySQL connection id is 4292641
  Server version: 5.1.34-log Source distribution

  Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

  Oracle is a registered trademark of Oracle Corporation and/or its
  affiliates. Other names may be trademarks of their respective
  owners.

  Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

  mysql>

You can also try using GUI alternatives such as Sequel Pro (OSX), HeidiSQL (Windows) or SquirrelSQL (multi-platform Java application).

5). Try Connecting to Another MySQL Server

If you cannot connect to our MySQL server then try another. Ensembl Genomes is available at mysql-eg-publicsql.ebi.ac.uk (port 4157 and user anonymous) and UCSC at genome-mysql.cse.ucsc.edu (port 3306 and user genome). Failure to connect to these servers as well indicates a problem with your system or network rather than one unique to Ensembl.
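
Steps 4 and 5 can be combined into a single non-interactive check. The following is a sketch only: try_server and check_all_servers are hypothetical helper names, the 10 second timeout is an arbitrary choice, and the script assumes the mysql command-line client is installed. The hosts, ports and users are the ones given in steps 4 and 5.

```shell
# try_server is a hypothetical helper: it runs a trivial query and
# reports whether the mysql client could connect and log in.
try_server() {
  local host="$1" port="$2" user="$3"
  if mysql --host="$host" --port="$port" --user="$user" \
           --connect-timeout=10 --execute='SELECT 1' >/dev/null 2>&1; then
    echo "$host:$port OK"
  else
    echo "$host:$port FAILED"
    return 1
  fi
}

# Try every server in turn and return non-zero if any of them failed.
check_all_servers() {
  local status=0
  try_server ensembldb.ensembl.org 3306 anonymous || status=1
  try_server mysql-eg-publicsql.ebi.ac.uk 4157 anonymous || status=1
  try_server genome-mysql.cse.ucsc.edu 3306 genome || status=1
  return "$status"
}
```

Run check_all_servers after pasting the functions into your shell. If only the Ensembl line fails, the problem is probably on our side or specific to that host; if every line fails, look at your own network first. A FAILED line is also printed when the mysql client itself is not installed.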

6). Check for Scheduled Downtime

Sometimes downtime is unavoidable. When this occurs we will inform users which services will be unavailable via our blog and mailing lists.

7). Pinging the MySQL Server

The command ping (do not confuse it with ping_ensembl.pl from step 1) sends packets to the DB servers. It assesses whether the address resolves to an IP address and whether that machine is reachable. Note that the USEast DB server does not respond to pings.

> ping -c 5 ensembldb.ensembl.org

  PING ensembldb.sanger.ac.uk (XXX.XXX.XXX.XXX): 56 data bytes
  64 bytes from XXX.XXX.XXX.XXX: icmp_seq=0 ttl=59 time=1.600 ms
  64 bytes from XXX.XXX.XXX.XXX: icmp_seq=1 ttl=59 time=1.646 ms
  64 bytes from XXX.XXX.XXX.XXX: icmp_seq=2 ttl=59 time=1.724 ms
  64 bytes from XXX.XXX.XXX.XXX: icmp_seq=3 ttl=59 time=1.924 ms
  64 bytes from XXX.XXX.XXX.XXX: icmp_seq=4 ttl=59 time=1.374 ms

  --- ensembldb.sanger.ac.uk ping statistics ---
  5 packets transmitted, 5 packets received, 0.0% packet loss
  round-trip min/avg/max/stddev = 1.374/1.654/1.924/0.178 ms

A healthy ping with no packet loss indicates that our SQL server is reachable over the network. It does not confirm that the MySQL port itself is open (see step 8).

8). Check with your Network Administrator if Port 3306 is Open

If you can ping Ensembl but still cannot connect, check your local port settings. Some networks do not allow connections to MySQL servers by default. Check with your network administrator that port 3306 is open. If you are connecting to Ensembl Genomes, check that port 4157 is open.
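
Before contacting your administrator you can test the port yourself. The sketch below is one possible approach, not an Ensembl tool: port_open and report_port are hypothetical helper names, the 5 second limit is arbitrary, and the code assumes a Linux-like system with bash (for its /dev/tcp redirection) and the coreutils timeout command.

```shell
# port_open is a hypothetical helper. It tries to open a TCP connection
# using bash's /dev/tcp redirection and gives up after 5 seconds.
port_open() {
  local host="$1" port="$2"
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Print a one-line verdict for a host and port.
report_port() {
  if port_open "$1" "$2"; then
    echo "$1:$2 reachable"
  else
    echo "$1:$2 blocked or unreachable"
  fi
}
```

For example, report_port ensembldb.ensembl.org 3306 tests the main server and report_port mysql-eg-publicsql.ebi.ac.uk 4157 tests Ensembl Genomes. A "blocked or unreachable" result cannot tell a firewall apart from a server that is down, so compare it against the downtime announcements from step 6.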

9). Search for your Error

The error you are seeing may be caused by a component other than Ensembl. Try your favourite search engine (such as Google or Bing) and see what the internet can tell you. Sites like Stack Overflow and MySQL's bug tracking system are invaluable resources. Search using only the pertinent portion of the error message. For example, given the following output:

Installation is good. Connection to Ensembl works and you can query the human core database
Error in my_thread_global_end(): 1 threads didn't exit

Search for "my_thread_global_end" and "threads" rather than the entire script output. This particular issue is due to a MySQL library bug on Windows machines and not an issue directly related to Ensembl.

10). Try Re-installing your Setup or Talk to your Local System Administrator

Re-installing sometimes works and sometimes does not. However, setups can become corrupted or out of date, and installing later versions of your software, including Ensembl, can help. If you are on a platform with systems support or a business contract then contact them for help. Some problems cannot be solved unless you talk to these people.

Windows users may have more success using Strawberry Perl than ActiveState's ActivePerl.

11). Email Helpdesk

Helpdesk and the Ensembl team are always available to help you access our datasets and APIs. Send the output of the above commands and steps along with a description of your problem, as it will speed up the support process. Please be aware that our only supported platform for deployment is Linux (we also have some OSX experience), though we will try our best to help out on other platforms. Please give us as much information as possible about your OS and architecture (e.g. 64-bit), plus all command-line output from the following commands and from the previous steps (especially steps 1, 3, 4 and 5):

> perl -V
> perl -MDBI -e 'warn $DBI::VERSION'
> perl -MDBD::mysql -e 'warn $DBD::mysql::VERSION'
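
To make the report easier to attach to an email, the commands above can be captured in one file. This is only a sketch: collect_diagnostics and the default file name ensembl_diagnostics.txt are made up for illustration, and uname -a has been added here as one way of recording your OS and architecture.

```shell
# collect_diagnostics is a hypothetical helper: it runs the three
# commands above plus `uname -a` and saves all output to one file.
collect_diagnostics() {
  local out="${1:-ensembl_diagnostics.txt}"
  {
    echo "== uname -a =="
    uname -a
    echo "== perl -V =="
    perl -V
    echo "== DBI version =="
    perl -MDBI -e 'warn $DBI::VERSION'
    echo "== DBD::mysql version =="
    perl -MDBD::mysql -e 'warn $DBD::mysql::VERSION'
  } >"$out" 2>&1 || true   # keep going even if a module is missing
  echo "Wrote $out"
}
```

Call collect_diagnostics with no arguments to write to the default file, or pass a path of your choice. Errors such as a missing module are written to the file too, which is exactly what Helpdesk needs to see.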