SPARQL Endpoint

The Identifiers.org SPARQL endpoint supports two tasks: resolving URLs and querying registry data.

The endpoint runs at https://sparql.api.identifiers.org/sparql.
A web interface with some example queries is also available. This interface is built with sparql-editor.

Registry data model

Querying registry information via SPARQL lets users connect the metadata held in the registry to their own knowledge graphs. Descriptions, home pages, providing institutions, and URL patterns are all available through this service.

All registry data is modeled using the VoID and DCAT vocabularies. New terms were created for attributes that could not be mapped to these vocabularies. You will find the ontology for these terms here.

The model is implemented using R2RML and Ontop. You can find the mappings employed here. Ontop generates triples based on these mappings and the contents of the database. You can find the code for this at our Ontop Git repository.

VoID mappings

The registry itself is available as a void:Dataset with annotations such as void:sparqlEndpoint and void:exampleResource. These can be found here.
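
As an illustration of the VoID mapping, the query below is a minimal sketch that looks up the registry's void:Dataset description. It uses only the properties named above; whether every dataset carries a void:exampleResource is an assumption, so that pattern is optional.

```sparql
PREFIX void: <http://rdfs.org/ns/void#>

# Find VoID dataset descriptions and the SPARQL endpoint they advertise.
SELECT ?registry ?endpoint ?example WHERE {
  ?registry a void:Dataset ;
            void:sparqlEndpoint ?endpoint .
  # Assumption: example resources may be absent, so do not require them.
  OPTIONAL { ?registry void:exampleResource ?example . }
}
LIMIT 10
```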

DCAT mappings

The registry itself is available as a dcat:Catalog, while namespaces are dcat:Dataset and resources are mapped to dcat:DataService. Namespaces are associated with the catalog via the dcat:dataset property and resources are associated with their namespace via the dcat:servesDataset property.
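
The relationships described above can be traversed in a single query. The following is a sketch, not a canonical query: it walks from the catalog to its namespaces and from each namespace to the resources that serve it, using only the DCAT classes and properties mentioned in this section.

```sparql
PREFIX dcat: <http://www.w3.org/ns/dcat#>

# Walk catalog -> namespace (dcat:dataset) and resource -> namespace (dcat:servesDataset).
SELECT ?catalog ?namespace ?resource WHERE {
  ?catalog a dcat:Catalog ;
           dcat:dataset ?namespace .
  ?namespace a dcat:Dataset .
  ?resource a dcat:DataService ;
            dcat:servesDataset ?namespace .
}
LIMIT 10
```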

New terms employed

Terms for some attributes had to be created. The table below lists these attributes and their new terms. The terms use the idot prefix, which expands to the http://identifiers.org/idot/ namespace. Formal definitions for these can be found here.

Class of attribute  Registry attribute                   idot term
Both                Date of deactivation                 idot:deprecationDate
Both                Approximated expiration date         idot:deprecationOfflineDate
Both                Deactivation statement               idot:deprecationStatement
Both                Deactivation flag                    idot:isDeprecated
Both                Legacy registry identifier (MIR ID)  idot:mirid
Both                Sample local unique identifier       idot:sampleID
Namespace           Type                                 idot:Namespace
Namespace           Local unique identifier pattern      idot:luiPattern
Namespace           Prefix                               idot:prefix
Namespace           Successor                            idot:sucessor
Namespace           Associated resource                  idot:isNamespaceOf
Resource            Type                                 idot:Resource
Resource            Authentication details URL           idot:authHelpUrl
Resource            Authentication description           idot:authHelpDescription
Resource            Country code                         idot:countryCode
Resource            Protected URLs flag                  idot:hasProtectedUrls
Resource            Home URL                             idot:homepage
Resource            Primary flag                         idot:isOfficial
Resource            Provider code                        idot:providerCode
Resource            URL pattern                          idot:urlPattern
Resource            Associated namespace                 idot:isResourceOf
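
To show how these terms might be used, here is a hedged sketch that lists deactivated entries. It assumes that only deactivated namespaces or resources carry an idot:deprecationDate, which the table does not state explicitly; the prefix and statement are treated as optional because their presence on every entry is also an assumption.

```sparql
PREFIX idot: <http://identifiers.org/idot/>

# List entries with a deactivation date, most recent first.
SELECT ?entry ?prefix ?deprecationDate ?statement WHERE {
  # Assumption: a deprecation date is present only on deactivated entries.
  ?entry idot:deprecationDate ?deprecationDate .
  OPTIONAL { ?entry idot:prefix ?prefix . }
  OPTIONAL { ?entry idot:deprecationStatement ?statement . }
}
ORDER BY DESC(?deprecationDate)
LIMIT 10
```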

Resolving URLs with SPARQL

Like our resolver service, our SPARQL endpoint can convert identifiers.org URIs into provider URLs and vice versa. For example, http://identifiers.org/uniprot:P12345 resolves to http://purl.uniprot.org/uniprot/P12345, and the reverse lookup also works. This was developed specifically with semantic data integration in mind, where one often needs to consume heterogeneous datasets that use different types of URIs. This service relies on URI schemes recorded in the Registry. If you find a URI scheme that is not yet listed, or is listed incorrectly, contact us.
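
Because resolution depends on what the Registry records, it can help to inspect those records directly. The query below is a sketch under assumptions: it uses the idot terms from the registry data model to list the identifier pattern and provider URL patterns for the uniprot prefix. The STR() comparison is used because the literal datatype of idot:prefix is not documented here.

```sparql
PREFIX idot: <http://identifiers.org/idot/>

# Show the patterns recorded for one namespace and its resources.
SELECT ?luiPattern ?resource ?urlPattern WHERE {
  ?namespace a idot:Namespace ;
             idot:prefix ?prefix ;
             idot:luiPattern ?luiPattern .
  ?resource a idot:Resource ;
            idot:isResourceOf ?namespace ;
            idot:urlPattern ?urlPattern .
  # Assumption: comparing the lexical form avoids datatype mismatches.
  FILTER (STR(?prefix) = "uniprot")
}
```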

Implementation

URL resolution is implemented as a virtual triple store using the RDF4J SAIL API. Query results are generated on the fly from the Registry’s database content; the triples themselves are not stored in our dataset.

Queries must match URIs and URLs through the owl:sameAs property. The service is meant to be used in federated queries from other endpoints. See the examples below for more details.

The source code for this can be found at https://github.com/identifiers-org/sparql-identifiers. Special thanks to Jerven Bolleman for the contributions to this service.

SPARQL query examples

Run the examples marked with [1] directly on our web frontend at https://sparql.api.identifiers.org. More examples are available there directly.

Run the examples marked with [2] on the UniProt SPARQL endpoint at https://sparql.uniprot.org/sparql. Be aware that these may take a while to respond.

List identifiers.org URIs equivalent to a UniProt URL [1].

See in web interface

PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT * WHERE {
  <http://purl.uniprot.org/uniprot/P12345> owl:sameAs ?uris .
}

List UniProt URLs equivalent to an identifiers.org URI [1].

See in web interface

PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT * WHERE {
  <http://identifiers.org/uniprot:P12345> owl:sameAs ?uris .
}

List active Rhea URLs using the id:active named graph [1].

See in web interface

PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT *
WHERE {
  GRAPH <id:active> {
    <https://identifiers.org/rhea:12345> owl:sameAs ?obj .
  }
} LIMIT 10

Get protein metadata from UniProt based on an identifiers.org URI [1].

See in web interface

PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX up: <http://purl.uniprot.org/core/>

SELECT * WHERE {
  <http://identifiers.org/uniprot:P12345> owl:sameAs ?protein .
  SERVICE <https://sparql.uniprot.org/sparql> {
    ?protein a up:Protein ;
             up:encodedBy [ skos:prefLabel ?name ] .
  }
}
LIMIT 1

List Homo sapiens proteins with their UniProt URI and alternative URIs [2].

PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX up: <http://purl.uniprot.org/core/>
SELECT DISTINCT * WHERE {
  ?protein a up:Protein ;
           up:organism taxon:9606 .
  SERVICE <https://sparql.api.identifiers.org/sparql> {
    ?protein owl:sameAs ?proteinAlt .
  }
}
LIMIT 1

Find human proteins that have disease annotations, together with their alternative URIs [2].

PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX up: <http://purl.uniprot.org/core/>

SELECT * WHERE {
  ?protein a up:Protein ;
           up:organism taxon:9606 ;
           up:encodedBy [ skos:prefLabel ?name ] ;
           up:annotation ?annotation .

  ?annotation a up:Disease_Annotation ;
              rdfs:comment ?text .

  SERVICE <https://sparql.api.identifiers.org/sparql> {
    ?protein owl:sameAs ?proteinAlt .
  }
}
LIMIT 1