Metadata Service

Acquisition of provider page annotations

The Identifiers.org metadata service lets users extract Schema.org metadata from the landing pages of the original providers by passing in a Compact Identifier.

http://metadata.api.identifiers.org/{Compact Identifier}

For example: http://metadata.api.identifiers.org/reactome:R-HSA-446203
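As a minimal sketch, the request URL can be assembled from a compact identifier as shown here. The function name metadata_url and the choice to percent-encode everything except ':' and '/' are assumptions made for illustration; the service does not document an encoding rule.

```python
from urllib.parse import quote

# Base URL taken from the pattern above.
METADATA_BASE = "http://metadata.api.identifiers.org"


def metadata_url(compact_identifier: str) -> str:
    """Build the metadata service URL for a compact identifier.

    Assumption: ':' and '/' are left unescaped because compact
    identifiers use them; other unsafe characters are percent-encoded.
    """
    return f"{METADATA_BASE}/{quote(compact_identifier, safe=':/')}"


print(metadata_url("reactome:R-HSA-446203"))
# prints http://metadata.api.identifiers.org/reactome:R-HSA-446203
```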

How it works

Our backend resolves the compact identifier to find the URLs to query.
For each URL, it loads the content and searches for JSON-LD script tags.
- XPath query used on the loaded HTML: //script[@type='application/ld+json']
If multiple providers have this content available, the recommendation index from the resolver API is used to pick one.
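The JSON-LD extraction step can be illustrated with a short sketch. This is not the service's implementation: it uses Python's standard html.parser instead of an XPath engine, and the class and function names (JsonLdExtractor, extract_json_ld) are hypothetical. It selects the same elements as the XPath query above, namely script tags whose type attribute is exactly application/ld+json.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed content of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        # Same condition as //script[@type='application/ld+json']
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip blocks that are not valid JSON.


def extract_json_ld(html: str) -> list:
    """Return every JSON-LD document embedded in an HTML page."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents


# Made-up landing page used only for illustration.
sample = """<html><head>
<script type="application/ld+json">{"@type": "Dataset", "name": "Example"}</script>
<script type="text/javascript">var x = 1;</script>
</head><body></body></html>"""

print(extract_json_ld(sample))
```

Only the first script tag in the sample is picked up; the second has a different type attribute and is ignored.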

The source code can be found here.

Resources providing metadata

The following resources in the Identifiers.org registry provide metadata (last updated 2018-12-05):

ec-code, reactome, prosite, cath.domain, hamap, biosample, fairsharing, cellosaurus, cosmic, mobidb, hpscreg, lei, biomodels.db, pdb, sgd, wb, fb, arrayexpress, mgi, rgd, zfin, narcis, gxa.expt, metabolights, rgd.qtl, rgd.strain, ega.study, ega.dataset, pride.project, lincs.data, mw.study, mex, gpmdb, and kaggle

Acquisition of metadata from other providers

Following our recent participation in the 3rd German BioHackathon, we have expanded our metadata service to collect information from other metadata-providing services. This is implemented by retriever components that use the APIs of these services to acquire information on compact identifiers. The retrievers enabled and the data collected differ based on the namespace of the compact identifier.

This is used in our resolution page to display metadata on resolved compact identifiers.

This feature is a work in progress. It may be modified or removed as necessary without prior notice. If you are interested in it or are already using it, please let us know.

Retriever endpoints

The main endpoint for the retriever API follows the pattern

https://metadata.api.identifiers.org/retrievers/{Compact Identifier}

This endpoint lists the available retriever endpoints for that compact identifier. It is expected to be queried first, to discover which retrievers may contain information on that compact identifier. The response will look similar to:

{
   "apiVersion": "1.0",
   "errorMessage": null,
   "payload": {
      "parsedCompactIdentifier": {
         // Same values from resolver API 
      },
      "ableRetrievers": [
         "https://metadata.api.identifiers.org/retrievers/{Retriever 1}/{Compact Identifier}",
         "https://metadata.api.identifiers.org/retrievers/{Retriever 2}/{Compact Identifier}",
         //...
      ]
   }
}
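A client might read the retriever list out of this response as sketched here. The function name able_retrievers, the retriever names in the sample, and the decision to raise an exception when errorMessage is set are all assumptions; the documentation does not specify how errors should be handled.

```python
def able_retrievers(response: dict) -> list:
    """Return the retriever URLs listed in a retriever listing response.

    Assumption: a non-null "errorMessage" means the request failed.
    """
    if response.get("errorMessage"):
        raise RuntimeError(response["errorMessage"])
    payload = response.get("payload") or {}
    return list(payload.get("ableRetrievers") or [])


# Sample shaped like the response above. The retriever names are
# hypothetical, and parsedCompactIdentifier is left empty because its
# contents are elided in the documentation.
sample_response = {
    "apiVersion": "1.0",
    "errorMessage": None,
    "payload": {
        "parsedCompactIdentifier": {},
        "ableRetrievers": [
            "https://metadata.api.identifiers.org/retrievers/retriever-a/reactome:R-HSA-446203",
            "https://metadata.api.identifiers.org/retrievers/retriever-b/reactome:R-HSA-446203",
        ],
    },
}

for url in able_retrievers(sample_response):
    print(url)
```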

Each URL under .payload.ableRetrievers queries a different metadata provider and answers with a set of label -> list of values pairs representing the parsed metadata from that provider. Each response will look similar to:

{
    "label1": [
        "value1",
        "value2",
        "value3"
    ],
    "label2": [
        "value4"
    ]
}
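Because every retriever answers in the same label -> list of values shape, a client could combine several responses into one view. The following is a sketch of one possible approach, not a description of how the resolution page does it; merge_metadata is a hypothetical helper that keeps first-seen order and drops duplicate values.

```python
def merge_metadata(responses: list) -> dict:
    """Merge label -> list-of-values dicts from several retrievers.

    Values are kept in first-seen order and duplicates are dropped.
    """
    merged = {}
    for response in responses:
        for label, values in response.items():
            bucket = merged.setdefault(label, [])
            for value in values:
                if value not in bucket:
                    bucket.append(value)
    return merged


first = {"label1": ["value1", "value2"]}
second = {"label1": ["value2", "value3"], "label2": ["value4"]}

print(merge_metadata([first, second]))
# prints {'label1': ['value1', 'value2', 'value3'], 'label2': ['value4']}
```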

To acquire the raw data from providers, the user may use a URL in the format:

https://metadata.api.identifiers.org/retrievers/{Retriever 2}/raw/{Compact Identifier}
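Given a URL from the retriever list, the corresponding raw URL can be derived by inserting the raw path segment before the compact identifier. The sketch below assumes the URL follows the documented pattern exactly; raw_url is a hypothetical helper and retriever-a is a made-up retriever name.

```python
RETRIEVERS_BASE = "https://metadata.api.identifiers.org/retrievers"


def raw_url(retriever_url: str, compact_identifier: str) -> str:
    """Turn .../retrievers/{Retriever}/{CI} into .../retrievers/{Retriever}/raw/{CI}."""
    prefix = RETRIEVERS_BASE + "/"
    suffix = "/" + compact_identifier
    if not (retriever_url.startswith(prefix) and retriever_url.endswith(suffix)):
        raise ValueError("URL does not match the retriever pattern")
    # Whatever sits between the base and the identifier is the retriever name.
    retriever = retriever_url[len(prefix):-len(suffix)]
    if not retriever:
        raise ValueError("URL does not name a retriever")
    return f"{prefix}{retriever}/raw/{compact_identifier}"


print(raw_url(
    "https://metadata.api.identifiers.org/retrievers/retriever-a/reactome:R-HSA-446203",
    "reactome:R-HSA-446203",
))
# prints https://metadata.api.identifiers.org/retrievers/retriever-a/raw/reactome:R-HSA-446203
```

The compact identifier is passed in explicitly, rather than split off the URL, because identifiers may themselves contain slashes.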

Retriever implementation

At this time (Jan 23rd 2025), only two data retrievers are implemented:

EBI Search, the search engine that incorporates EBI resources in addition to collaborator resources.
TogoID, an ID conversion service implementing unique features with an intuitive web interface and an API for programmatic access.

If you are interested in contributing to this list, please reach out to us.
