MySQL relational database and other options for download

Posted on December 11, 2014 by guidetopharmacology

As mentioned in various places (including our FAQ) we welcome the downloading of our content in various slices and formats (listed here). Please contact us if you do this, not only out of professional courtesy but also so we can get feedback on any technical issues and/or suggested enhancements. In addition, this presents the opportunity to engage in occasional dialogue with data sources we do not yet have personal contact with.

The Guide to PHARMACOLOGY (GtoPdb) data are stored in a PostgreSQL relational database on a Linux server. For the past several releases we have made an SQL dump file available for download on our website. Following a few requests for the data in MySQL format, we have produced a test MySQL version of the database, migrated from PostgreSQL. This version was created using MySQL Community Server version 5.6 on Windows, and the migration was done with MySQL Workbench 6.2.

We haven’t tested the MySQL version, so if you use it and find any problems please let us know by email. The PostgreSQL version is our working database and is therefore potentially more stable than the MySQL version. If you have no technical requirement to choose one over the other, we’d recommend you stick to the PostgreSQL version, which also includes the customised text search indexes used on our website.

Note that these data are encoded in UTF-8; to use them properly with MySQL you will need to enable full 4-byte UTF-8 support using the character set utf8mb4. Here is a useful post about how to do this. If you use another MySQL character set such as utf8 you may get errors with, for example, Greek characters and other symbols.
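As a rough illustration of the character set point, the Python sketch below builds the MySQL statements that create a utf8mb4 database and set the connection character set, and includes a small helper that reports how many bytes the widest character in a string needs in UTF-8. The database name `gtopdb_test` and the choice of the `utf8mb4_unicode_ci` collation are our assumptions for the example, not part of the download; adjust them to your own setup.

```python
def utf8mb4_statements(db_name):
    """Return MySQL statements that set up a utf8mb4 database and connection.

    db_name is a hypothetical name chosen by the user; backticks are
    doubled so the name can be quoted safely as an identifier.
    """
    quoted = "`" + db_name.replace("`", "``") + "`"
    return [
        # Create the target database with full 4-byte UTF-8 support.
        "CREATE DATABASE IF NOT EXISTS " + quoted
        + " CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;",
        # Make the client connection use utf8mb4 before loading the dump.
        "SET NAMES utf8mb4;",
    ]


def max_utf8_width(text):
    """Return the largest number of bytes any single character needs in UTF-8."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)


statements = utf8mb4_statements("gtopdb_test")
for statement in statements:
    print(statement)
```

MySQL's utf8 character set stores at most three bytes per character, so any value for which `max_utf8_width` returns 4 can only be stored in a utf8mb4 column.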

We realise that the table relationships are complex; we’re working to tidy them up at our end and hope to release a new version with an annotated ERD in early 2015.

If you want a simpler slice of just the small-molecule structures and the URL pointers to the database entries (for example to integrate into a local chemistry database) the ligands.csv file may be the best choice. We do not hold SDF files but you can use the isomeric SMILES or the InChI to generate these. You may also choose to drop the rows without SMILES as these are mostly peptides and antibodies. As ever, please let us know how you get on with local integration.
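For anyone scripting the ligands.csv route, here is a minimal Python sketch of dropping the rows without SMILES. The column names (`Ligand id`, `Name`, `SMILES`) and the two sample rows are made up for illustration, so check the header of the file you actually download and pass the right column name.

```python
import csv
import io


def rows_with_smiles(csv_file, smiles_column="SMILES"):
    """Yield only the rows whose SMILES field is non-empty.

    csv_file is any open text file or file-like object; smiles_column
    is an assumed header name and may differ in the real file.
    """
    reader = csv.DictReader(csv_file)
    for row in reader:
        # Missing or blank SMILES values are treated the same way.
        if (row.get(smiles_column) or "").strip():
            yield row


# Made-up sample standing in for the downloaded file.
sample = io.StringIO(
    "Ligand id,Name,SMILES\n"
    "1,example small molecule,CCO\n"
    "2,example peptide,\n"
)
kept = list(rows_with_smiles(sample))
print(len(kept), "row(s) kept")
```

With the real file you would pass `open("ligands.csv", newline="", encoding="utf-8")` instead of the sample. Turning the remaining SMILES or InChI strings into SDF is a separate step that needs a cheminformatics toolkit of your choice.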