New source cross-references in release 2017.5

(minor updates 15 Sep 2017)

Content statistics are presented as usual in the release notes, and the Guide to IMMUNOPHARMACOLOGY has a separate update. This post describes the changes and updates, introduced in this release cycle, to the other resources we link to. More detail will be provided in the help pages (feedback on any of them is welcome), but the outlines are as follows:

Extra links for ligands. The new connectivity applies to ligands that have chemical structures (i.e. SMILES strings, mostly for small molecules but also for peptides up to ~50 to 60 residues and a few oligonucleotide drugs), which amounts to 6821 ligands in GtoPdb. Links have now been rationalised by introducing InChIKey call-outs to UniChem at the EBI. This resource currently contains over 150 million indexed chemical structures from 37 sources (including our own), for many of which we had hitherto curated links individually. In essence, UniChem “looks after” comprehensive cross-mappings between these sources via a regular and precise automated process. We can rely on presenting these links for our own entries because we have selected and curatorially checked (i.e. locked down) our own structural assignments, including those for our PubChem submissions. By clicking the UniChem link, users can now quickly navigate to complementary sources such as DrugBank, ChEBI, HMDB, BindingDB, ChEMBL, PDBe, SureChEMBL (patents) and others. This is analogous to the Google InChIKey call-out we introduced for our ligands some time ago. There is some overlap in the result sets, but the Google search will find different chemistry sources (including ChemSpider entries, usually uppermost in the Google rankings) that are not currently indexed by UniChem.
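As a rough illustration of what an InChIKey call-out involves, the sketch below checks that a string has the standard InChIKey shape (blocks of 14, 10 and 1 uppercase letters separated by hyphens) and then builds two outlinks from it. This is not the GtoPdb implementation. The Google search URL uses the ordinary `q=` query parameter, while `HYPOTHETICAL_UNICHEM_BASE` is a placeholder and not the real UniChem address, which should be taken from the UniChem documentation. The example key is the one commonly listed for aspirin.

```python
import re
from urllib.parse import quote_plus

# Standard InChIKey layout: 14 letters, 10 letters, 1 letter, hyphen-separated.
INCHIKEY_RE = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")

# Placeholder only (assumption): substitute the real UniChem URL pattern here.
HYPOTHETICAL_UNICHEM_BASE = "https://example.org/unichem/inchikey/"


def is_inchikey(key: str) -> bool:
    """Return True if the whole string has the standard InChIKey shape."""
    return INCHIKEY_RE.fullmatch(key) is not None


def google_inchikey_url(key: str) -> str:
    """Build a Google search URL for the exact (quoted) InChIKey."""
    if not is_inchikey(key):
        raise ValueError(f"not an InChIKey: {key!r}")
    return "https://www.google.com/search?q=" + quote_plus(f'"{key}"')


def unichem_url(key: str, base: str = HYPOTHETICAL_UNICHEM_BASE) -> str:
    """Build an outlink by appending the InChIKey to a configurable base URL."""
    if not is_inchikey(key):
        raise ValueError(f"not an InChIKey: {key!r}")
    return base + key


if __name__ == "__main__":
    aspirin = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"
    print(google_inchikey_url(aspirin))
    print(unichem_url(aspirin))
```

Because the InChIKey is a fixed-length hash of the structure, the same string serves as the lookup key in both cases. That is why a single curated structure per ligand is enough to drive links to many sources.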

The Human Protein Atlas (HPA) team have increased their profile recently, not only by becoming one of the European ELIXIR core resources but also through a major new extension in the form of a Pathology Atlas with a focus on human cancer. We have also been in contact with the team. Consequently, we selected HPA as a new outlink from our human protein entries (2839 target links and 353 ligand links), as an excellent first-stop shop for tissue and cell line expression patterns as well as intracellular distributions. In terms of utility, it is important to note that HPA offers the best of both worlds by integrating three sources of high-throughput mRNA transcript profiling in addition to direct antibody detection of the protein.

CATH/Gene3D. As you may have noticed, we have increased our protein structure connectivity in 2017, including through our SynPharm drug-responsive protein sequences resource (see below). The growth in structural data has many uses for our users, not least given the impressive acceleration in the number of ligand binding sites resolved in new GPCR structures. CATH is a classification of PDB protein structures, grouped by protein domain into superfamilies that have diverged from a common ancestor. Users are encouraged to explore the features of CATH for their own purposes. These include tracking the deep phylogeny of pharmacological targets (those that have structures) where this is difficult to detect on the basis of sequence similarity alone. The current version of GtoPdb includes 1634 target links to CATH (lower than the total number of targets because not all protein families have 3D structural representation yet) and 230 peptide ligand links. (Sep 2017 update: CATH is now also a European ELIXIR core resource.)

synPHARM was originally set up to provide synthetic biologists with tools to discover sequences that can be modulated by known ligands from GtoPdb and that could be transferred to synthetic proteins in order to confer drug control. synPHARM combines structural information from the Protein Data Bank with information on ligand binding from GtoPdb to produce a database of ligand binding sequences. As such, it is a useful resource for 3D ligand binding information. We have now added links from GtoPdb target and ligand pages to structures in synPHARM.

IUPHAR Pharmacology Education Project (PEP). PEP is a new IUPHAR initiative to provide free access to education and training resources in pharmacology. We have added links from 673 ligand and drug pages to background information in PEP, for further information on drug action and clinical use.

RCSB Protein Data Bank (RCSB PDB). Although not strictly new, it’s worth pointing out that the current rate of reporting new structures of ligands bound to targets means the number of links to the PDB via ligand entries has increased significantly over recent releases. The number of our PDB ligand links now stands at 1337, based on exact InChIKey matches. In addition, many of the GtoPdb ligands are represented in the PDB as alternative isomeric forms. Note also there are occasions where the PubChem MMDB CID assignment does not exactly match the PDB ligand structure.  In both these cases we add cross-pointers in the ligand comment sections.
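The difference between an exact InChIKey match and an alternative isomeric form can be sketched in code. In a standard InChIKey the first block of 14 letters is a hash of the connectivity (the molecular skeleton), while the second block covers the remaining layers such as stereochemistry. Two keys that share the first block but differ afterwards therefore typically describe isomeric forms of the same skeleton. The function below is a minimal illustration under that assumption, not the matching procedure used for GtoPdb. The name `match_level` and its return labels are hypothetical, and the example keys are those commonly listed for L-alanine, D-alanine and aspirin.

```python
def match_level(key_a: str, key_b: str) -> str:
    """Classify how closely two standard InChIKeys agree.

    Returns "exact" for identical keys, "same-skeleton" when only the
    first (connectivity) block matches, and "different" otherwise.
    """
    if key_a == key_b:
        return "exact"
    # The first hyphen-separated block is the 14-letter connectivity hash.
    if key_a.split("-")[0] == key_b.split("-")[0]:
        return "same-skeleton"
    return "different"


if __name__ == "__main__":
    l_alanine = "QNAYBMKLOCPYGJ-REOHGBSHSA-N"
    d_alanine = "QNAYBMKLOCPYGJ-UWTATZPHSA-N"
    aspirin = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"

    print(match_level(l_alanine, l_alanine))
    print(match_level(l_alanine, d_alanine))
    print(match_level(l_alanine, aspirin))
```

A "same-skeleton" result is only a hint that two entries are related. It cannot replace a curated cross-pointer, since the first block is a hash and says nothing about which isomer was actually crystallised.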
