SPARQL Endpoint
The Identifiers.org SPARQL endpoint supports URL resolution and querying of registry data.
It is available at https://sparql.api.identifiers.org/sparql.
A web interface with some example queries is also available. This interface is built with sparql-editor.
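
For programmatic access, a query can be sent to the endpoint over plain HTTP. The following is a minimal sketch, assuming the endpoint follows the standard SPARQL 1.1 Protocol (query passed in the `query` parameter of a GET request) and can return the standard JSON results format. The helper names `build_request` and `run_query` are hypothetical and not part of any Identifiers.org client.

```python
import json
import urllib.parse
import urllib.request

# Endpoint URL as given in the text above.
ENDPOINT = "https://sparql.api.identifiers.org/sparql"


def build_request(query: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Build a SPARQL 1.1 Protocol GET request without sending it."""
    # urlencode percent-encodes the query text so it is safe in a URL.
    params = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        f"{endpoint}?{params}",
        # Assumption: the endpoint honours this standard results media type.
        headers={"Accept": "application/sparql-results+json"},
    )


def run_query(query: str, endpoint: str = ENDPOINT, timeout: float = 30.0) -> dict:
    """Send the query and decode the JSON response (requires network access)."""
    request = build_request(query, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `run_query(...)` with any of the example queries on this page should return a dictionary whose `results.bindings` list holds one entry per result row, provided the endpoint behaves as assumed here.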
Registry data model
Querying registry information via SPARQL allows users to connect the metadata available on these collections to their own knowledge graphs. Information such as descriptions, home pages, providing institutions, and URL patterns is available through this service.
All registry data was modeled using the VoID and DCAT vocabularies. New terms were created for attributes that could not be mapped to these vocabularies. You will find the ontology for these terms here.
The model is implemented using R2RML and Ontop. You can find the mappings employed here. Ontop generates triples based on these mappings and the contents of the database. You can find the code for this at our ontop git repository.
VoID mappings
The registry itself is available as a void:Dataset with properties such as void:sparqlEndpoint and void:exampleResource. These can be found here.
DCAT mappings
The registry itself is available as a dcat:Catalog, while namespaces are mapped to dcat:Dataset and resources are mapped to dcat:DataService. Namespaces are associated with the catalog via the dcat:dataset property, and resources are associated with their namespace via the dcat:servesDataset property.
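
The DCAT layout described above can be explored with a query along the following lines. This is a minimal sketch rather than an official example: the helper name `namespaces_with_resources_query` is hypothetical, and the `dcat:` prefix IRI is the standard W3C one, which this page does not state explicitly.

```python
def namespaces_with_resources_query(limit: int = 10) -> str:
    """Return a SPARQL query pairing each namespace with its resources."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    # Doubled braces are literal braces inside an f-string.
    return f"""PREFIX dcat: <http://www.w3.org/ns/dcat#>
SELECT ?namespace ?resource WHERE {{
  # Namespaces hang off the catalog via dcat:dataset.
  ?catalog a dcat:Catalog ;
           dcat:dataset ?namespace .
  # Resources point to their namespace via dcat:servesDataset.
  ?resource a dcat:DataService ;
            dcat:servesDataset ?namespace .
}}
LIMIT {limit}"""
```

The returned string can be pasted into the web interface or sent to the endpoint with any SPARQL client.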
New terms employed
Terms for some attributes had to be created. See the table below for a list of these attributes and their respective new terms. These terms use the idot prefix, which expands to the http://identifiers.org/idot/ namespace. Formal definitions for these can be found here.
Class of attribute | Registry attribute | idot term |
---|---|---|
Both | Date of deactivation | idot:deprecationDate |
Both | Approximated expiration date | idot:deprecationOfflineDate |
Both | Deactivation statement | idot:deprecationStatement |
Both | Deactivation flag | idot:isDeprecated |
Both | Legacy registry identifier (MIR ID) | idot:mirid |
Both | Sample local unique identifier | idot:sampleID |
Namespace | Type | idot:Namespace |
Namespace | Local unique identifier pattern | idot:luiPattern |
Namespace | Prefix | idot:prefix |
Namespace | Successor | idot:sucessor |
Namespace | Associated resource | idot:isNamespaceOf |
Resource | Type | idot:Resource |
Resource | Authentication details URL | idot:authHelpUrl |
Resource | Authentication description | idot:authHelpDescription |
Resource | Country code | idot:countryCode |
Resource | Protected URLs flag | idot:hasProtectedUrls |
Resource | Home URL | idot:homepage |
Resource | Primary flag | idot:isOfficial |
Resource | Provider code | idot:providerCode |
Resource | URL pattern | idot:urlPattern |
Resource | Associated namespace | idot:isResourceOf |
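
As an illustration of how the idot terms in the table might be used, the sketch below builds a query that looks up the identifier pattern and sample identifier of one namespace by its prefix. It assumes that `idot:prefix` values are stored as plain string literals, which this page does not confirm. The helper name `namespace_pattern_query` is hypothetical.

```python
import re

# Conservative whitelist so that the prefix cannot break out of the
# quoted literal in the generated query.
_PREFIX_RE = re.compile(r"[A-Za-z0-9._]+")


def namespace_pattern_query(prefix: str) -> str:
    """Return a SPARQL query for a namespace's LUI pattern and sample ID."""
    if not _PREFIX_RE.fullmatch(prefix):
        raise ValueError(f"unexpected characters in prefix: {prefix!r}")
    return f"""PREFIX idot: <http://identifiers.org/idot/>
SELECT ?namespace ?luiPattern ?sampleId WHERE {{
  ?namespace a idot:Namespace ;
             idot:prefix "{prefix}" ;  # assumption: plain string literal
             idot:luiPattern ?luiPattern ;
             idot:sampleID ?sampleId .
}}"""
```

If the query returns no rows for a known prefix, the literal may be typed or language-tagged in the data, in which case the match on `idot:prefix` would need adjusting.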
Resolving URLs with SPARQL
Like our resolver service, our SPARQL endpoint can convert identifiers.org URIs into provider URLs and vice versa. For example, http://identifiers.org/uniprot:P12345 resolves to http://purl.uniprot.org/uniprot/P12345, and the reverse lookup works as well. This was developed specifically with semantic data integration in mind, where one often needs to consume heterogeneous datasets that use different types of URIs. The service relies on the URI schemes recorded in the Registry. If you find a URI scheme that is not yet listed, or that is listed incorrectly, contact us.
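
To make the mapping concrete, the sketch below reproduces the UniProt example by hand: it splits an identifiers.org URI into prefix and local identifier, then substitutes the identifier into a provider URL pattern. It only illustrates the idea and is not the service's implementation. The `{$id}` placeholder syntax and the function names are assumptions, and in practice the pattern would come from the Registry rather than being hard-coded.

```python
from urllib.parse import urlparse

# Assumption: provider URL patterns mark the identifier slot with "{$id}".
ID_PLACEHOLDER = "{$id}"


def split_compact_uri(uri: str) -> tuple:
    """Split an identifiers.org URI into (prefix, local identifier)."""
    path = urlparse(uri).path.lstrip("/")
    # Only the first colon separates the prefix from the identifier.
    prefix, separator, local_id = path.partition(":")
    if not separator or not prefix or not local_id:
        raise ValueError(f"not a prefix:identifier URI: {uri!r}")
    return prefix, local_id


def expand_pattern(pattern: str, local_id: str) -> str:
    """Insert a local identifier into a provider URL pattern."""
    if ID_PLACEHOLDER not in pattern:
        raise ValueError(f"pattern has no {ID_PLACEHOLDER} placeholder")
    return pattern.replace(ID_PLACEHOLDER, local_id)


prefix, local_id = split_compact_uri("http://identifiers.org/uniprot:P12345")
# Hypothetical pattern for the "uniprot" prefix, chosen to match the example.
provider_url = expand_pattern("http://purl.uniprot.org/uniprot/{$id}", local_id)
print(prefix, provider_url)
```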
Implementation
URL resolution is implemented as a virtual triple store using the RDF4J SAIL API. This means that query results are generated on the fly from the Registry's database content, and the triples themselves are not stored in our dataset.
Queries must match URIs and URLs through the owl:sameAs property. The service is meant to be used in federated queries from other endpoints. See the examples below for more details.
The source code for this can be found at https://github.com/identifiers-org/sparql-identifiers. Special thanks to Jerven Bolleman for the contributions to this service.
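
The owl:sameAs lookup can also be wrapped in a small client-side helper. The sketch below is a minimal illustration with hypothetical function names: `same_as_query` builds the lookup for one URI, and `binding_values` reads one variable out of a response in the standard SPARQL JSON results format. It assumes the response was requested in that format.

```python
# Characters that may not appear inside <...> in a SPARQL query.
_FORBIDDEN = set('<>"{}|^`\\')


def same_as_query(uri: str) -> str:
    """Return a query listing everything linked to `uri` via owl:sameAs."""
    if not uri or any(ch in _FORBIDDEN or ch <= " " for ch in uri):
        raise ValueError(f"cannot embed URI in query: {uri!r}")
    return (
        "PREFIX owl: <http://www.w3.org/2002/07/owl#>\n"
        "SELECT ?target WHERE {\n"
        f"  <{uri}> owl:sameAs ?target .\n"
        "}"
    )


def binding_values(results: dict, variable: str) -> list:
    """Collect the values bound to `variable` in SPARQL JSON results."""
    return [
        row[variable]["value"]
        for row in results["results"]["bindings"]
        if variable in row  # a variable can be unbound in some rows
    ]
```

Rejecting URIs that contain angle brackets, quotes, or whitespace keeps a malformed input from changing the structure of the generated query.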
SPARQL query examples
Run the examples marked with [1] directly on our web frontend at https://sparql.api.identifiers.org. More examples are available there.
Run the examples marked with [2] on the UniProt SPARQL endpoint at https://sparql.uniprot.org/sparql. Note that these may take a while to respond.
List identifiers.org URIs equivalent to a UniProt URL [1].
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT * WHERE {
  <http://purl.uniprot.org/uniprot/P12345> owl:sameAs ?uris .
}
List UniProt URLs equivalent to an identifiers.org URI [1].
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT * WHERE {
  <http://identifiers.org/uniprot:P12345> owl:sameAs ?uris .
}
List active Rhea URLs using the id:active named graph [1].
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT *
WHERE {
  GRAPH <id:active> {
    <https://identifiers.org/rhea:12345> owl:sameAs ?obj .
  }
} LIMIT 10
Get protein metadata from UniProt based on an identifiers.org URI [1].
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX up: <http://purl.uniprot.org/core/>
SELECT * WHERE {
  <http://identifiers.org/uniprot:P12345> owl:sameAs ?protein .
  SERVICE <https://sparql.uniprot.org/sparql> {
    ?protein a up:Protein ;
             up:encodedBy [ skos:prefLabel ?name ] .
  }
}
LIMIT 1
List Homo sapiens proteins with their UniProt URI and an alternative URI [2].
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX up: <http://purl.uniprot.org/core/>
SELECT DISTINCT * WHERE {
  ?protein a up:Protein ;
           up:organism taxon:9606 .
  SERVICE <http://sparql.api.identifiers.org/sparql> {
    ?protein owl:sameAs ?proteinAlt .
  }
}
LIMIT 1
Find human protein entries that have a disease annotation, together with their alternative URIs [2].
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX up: <http://purl.uniprot.org/core/>
SELECT * WHERE {
  ?protein a up:Protein ;
           up:organism taxon:9606 ;
           up:encodedBy [ skos:prefLabel ?name ] ;
           up:annotation ?annotation .
  ?annotation a up:Disease_Annotation ;
              rdfs:comment ?text .
  SERVICE <http://sparql.api.identifiers.org/sparql> {
    ?protein owl:sameAs ?proteinAlt .
  }
}
LIMIT 1