RDF-star for Wikibase/Wikidata #24

Open
Tpt opened this issue Sep 12, 2023 · 5 comments
Labels
use case Issue to record discussion on a use case

Comments

@Tpt
Tpt commented Sep 12, 2023

See https://github.com/w3c/rdf-ucr/wiki/RDF%E2%80%90star-for-Wikidata for a single document on this use case.

Contact information:

  • Your name: @Tpt
  • How to contact you: Contacts in my GitHub profile + Tpt on W3C IRC

Note that I am not representing Wikibase/Wikidata/Wikimedia in any way, I just wanted to describe this use case.

Brief Description of your use case:

Wikibase is the software that powers Wikidata. Wikibase uses its own data model but provides an RDF mapping. Wikibase contains a native reification system. Each main "snak" (aka triple) like "USA president JoeBiden" can be annotated with "qualifiers" like "start date January 20th 2021" or "predecessor DonaldTrump", "references" (i.e. blank nodes describing a source) and a "rank" (a processing annotation that can have three values "preferred"/"normal"/"deprecated"). Wikibase calls this full construction a "statement".
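As a rough illustration of the construction described above, a statement can be sketched as a small Python record. The class and field names here (`Statement`, `Rank`, `qualifiers`, `references`) are illustrative assumptions and are not Wikibase's actual API; values are plain strings to keep the sketch short.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class Rank(Enum):
    PREFERRED = "preferred"
    NORMAL = "normal"
    DEPRECATED = "deprecated"


@dataclass
class Statement:
    # Main snak: the (subject, predicate, value) the statement is about.
    subject: str
    predicate: str
    value: str
    # Qualifiers: extra (property, value) pairs annotating the main snak.
    qualifiers: List[Tuple[str, str]] = field(default_factory=list)
    # References: each reference is its own list of (property, value) pairs.
    references: List[List[Tuple[str, str]]] = field(default_factory=list)
    # Rank: processing annotation, "normal" unless stated otherwise.
    rank: Rank = Rank.NORMAL


biden = Statement(
    subject="USA",
    predicate="president",
    value="JoeBiden",
    qualifiers=[("start date", "2021-01-20"), ("predecessor", "DonaldTrump")],
    rank=Rank.PREFERRED,
)
```

The point of the sketch is that qualifiers, references and rank hang off the statement as a whole, not off the subject or the value.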

The current RDF encoding uses a specific RDF node to encode each statement. For example (Wikibase uses opaque identifiers, I have tweaked the RDF to make it more readable):

wd:USA a wikibase:Item ;
    p:president wds:JoeBidenPresidencyStatement , wds:DonaldTrumpPresidencyStatement . # p:X are relations between a subject and a statement. The statement subject is the triple subject (here "USA") and the statement predicate is the relation predicate (here "president")

wds:JoeBidenPresidencyStatement a wikibase:Statement  ;
     ps:president wd:JoeBiden ; # ps:X are relations between a statement and an object. The statement object is the triple object (here "JoeBiden") and the statement predicate is the relation predicate (here "president")
     wikibase:rank wikibase:PreferredRank ;
     pq:start_date "2021-01-20"^^xsd:dateTime ; # A qualifier
     pq:predecessor wd:DonaldTrump ; # A qualifier
     prov:wasDerivedFrom wdref:a_reference , wdref:an_other_reference .

wds:DonaldTrumpPresidencyStatement a wikibase:Statement  ;
     ps:president wd:DonaldTrump ;
     wikibase:rank wikibase:NormalRank ;
     pq:start_date "2017-01-20"^^xsd:dateTime ;
     pq:end_date "2021-01-20"^^xsd:dateTime .

wd:USA wdt:president wd:JoeBiden . # For statements with the "best" rank a direct edge is inserted in the RDF with the "wdt:" prefix.

Note that in the previous example the wd:USA wdt:president wd:JoeBiden direct triple has been generated because the statement rank is "preferred". Statements about the older presidencies also exist but have only the "normal" rank, so the direct triples are not generated for them.
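The rank rule above can be sketched in Python as follows. The function name `direct_triples` and the tuple layout are hypothetical. The fallback to "normal" when no "preferred" statement exists, and the exclusion of "deprecated" statements, are assumptions about what "best" rank means and are not spelled out in this issue.

```python
def direct_triples(statements):
    """Return the direct (wdt:) triples for the best-ranked statements.

    `statements` is an iterable of (subject, prop, value, rank) tuples,
    where rank is "preferred", "normal" or "deprecated".
    """
    # Group statement values by (subject, property).
    groups = {}
    for subject, prop, value, rank in statements:
        groups.setdefault((subject, prop), []).append((value, rank))

    result = []
    for (subject, prop), entries in groups.items():
        ranks = {rank for _, rank in entries}
        # Assumption: "preferred" wins, otherwise "normal"; never "deprecated".
        if "preferred" in ranks:
            best = "preferred"
        elif "normal" in ranks:
            best = "normal"
        else:
            continue
        for value, rank in entries:
            if rank == best:
                result.append((subject, "wdt:" + prop, value))
    return result


presidents = [
    ("wd:USA", "president", "wd:JoeBiden", "preferred"),
    ("wd:USA", "president", "wd:DonaldTrump", "normal"),
]
print(direct_triples(presidents))
```

With the two presidency statements from the example, only the JoeBiden statement yields a direct triple.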

Paper about the Wikibase RDF encoding design: Reifying RDF: What Works Well With Wikidata?

What you want to be able to do:

It would be great to have a nice RDF syntax to encode this use case.

What is the role of RDF-star quoted triples in your use case:

They might be used to simplify the RDF encoding. For example one might hope to write:

<< wd:USA wd:president wd:JoeBiden >>  a wikibase:Statement  ;
     wikibase:rank wikibase:PreferredRank ;
     pq:start_date "2021-01-20"^^xsd:dateTime ;
     pq:predecessor wd:DonaldTrump ;
     prov:wasDerivedFrom wdref:a_reference , wdref:an_other_reference .

<< wd:USA wd:president wd:DonaldTrump >>  a wikibase:Statement  ;
     wikibase:rank wikibase:NormalRank ;
     pq:start_date "2017-01-20"^^xsd:dateTime ;
     pq:end_date "2021-01-20"^^xsd:dateTime .

wd:USA wd:president wd:JoeBiden .

Why it is hard or impossible to do what you want to do without quoted triples:

Wikidata needs reification to encode statements.

How you want quoted triples to behave in your use case:

(For example, do you want the precise syntax of subjects, predicates, and objects in quoted triples to be important?)

The RDF-star encoding written above is only valid if the existence of a quoted triple does not imply the assertion of the triple itself. Indeed, we would like this to be in the Wikidata RDF graph:

<< wd:USA wd:president wd:DonaldTrump >>  a wikibase:Statement  ;
     wikibase:rank wikibase:NormalRank ;
     pq:start_date "2017-01-20"^^xsd:dateTime ;
     pq:end_date "2021-01-20"^^xsd:dateTime .

But the triple wd:USA wd:president wd:DonaldTrump should not be in the graph.
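A minimal Python sketch of this requirement, assuming a toy in-memory model (not any real RDF library): asserted triples live in one set, and annotations whose subject is a quoted triple live in a separate list. The helper names `is_asserted` and `is_quoted` are hypothetical.

```python
# Triples the graph actually asserts.
asserted = {
    ("wd:USA", "wd:president", "wd:JoeBiden"),
}

trump = ("wd:USA", "wd:president", "wd:DonaldTrump")

# Annotations: (quoted triple, predicate, object).
# Using a triple as a subject here does NOT add it to `asserted`.
annotations = [
    (trump, "rdf:type", "wikibase:Statement"),
    (trump, "wikibase:rank", "wikibase:NormalRank"),
    (trump, "pq:start_date", "2017-01-20"),
    (trump, "pq:end_date", "2021-01-20"),
]


def is_asserted(triple):
    return triple in asserted


def is_quoted(triple):
    return any(subject == triple for subject, _, _ in annotations)


print(is_quoted(trump), is_asserted(trump))
```

In this model the DonaldTrump triple can be described in full while a query over `asserted` alone never returns it.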

We also need to be able to distinguish two statements on the same base triple. We can't merge the following two statements because it would make the start date, end date pairs meaningless:

<< wd:Russia wd:president wd:VladimirPutin >>  a wikibase:Statement  ;
     wikibase:rank wikibase:NormalRank ;
     pq:start_date "1999-12-31"^^xsd:dateTime ;
     pq:end_date "2008-05-07"^^xsd:dateTime .

<< wd:Russia wd:president wd:VladimirPutin >>  a wikibase:Statement  ;
     wikibase:rank wikibase:PreferredRank ;
     pq:start_date "2012-05-07"^^xsd:dateTime .
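The merging problem can be made concrete with a small Python sketch. It assumes two toy storage layouts: one keyed by the base triple, one keyed by a per-statement identifier. The identifiers `statement-1` and `statement-2` are made up for illustration.

```python
from collections import defaultdict

putin = ("wd:Russia", "wd:president", "wd:VladimirPutin")

first_term = [("pq:start_date", "1999-12-31"), ("pq:end_date", "2008-05-07")]
second_term = [("pq:start_date", "2012-05-07")]

# Layout 1: annotations keyed by the base triple. Both statements collapse
# into one bag of qualifiers, so start/end dates can no longer be paired.
merged = defaultdict(list)
for qualifiers in (first_term, second_term):
    merged[putin].extend(qualifiers)

# Layout 2: annotations keyed by a per-statement identifier (hypothetical
# ids). Each statement keeps its own qualifiers.
separate = {
    "statement-1": (putin, first_term),
    "statement-2": (putin, second_term),
}

start_dates = [v for p, v in merged[putin] if p == "pq:start_date"]
end_dates = [v for p, v in merged[putin] if p == "pq:end_date"]
print(start_dates, end_dates)
```

In the merged layout there are two start dates and one end date with nothing saying which start date the end date belongs to, which is exactly the information the use case needs to keep.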

An example RDF graph that shows part of your use case:

The Wikidata graph exposed by the Wikidata Query Service.

@Tpt Tpt added the use case Issue to record discussion on a use case label Sep 12, 2023
@pfps
Contributor

pfps commented Sep 12, 2023

One question is how much of Wikidata to include. As you say, there are ranks but there are also no-values and some-values.

@niklasl

niklasl commented Sep 12, 2023

This is a good case, similarly considered in the Wikidata example of Detailed Provenance in Cooperative Union Cataloguing .

I don't think opacity is necessary though? That the triple is unasserted in the graph doesn't necessarily preclude that its suggested sense is invisible to entailment? But I presume the question here is what would the common intersection of semantic expectations by Wikidata editors and consumers be?

(Note: I've experimented a bit with making Wikidata RDF "more readable" for these kinds of illustrative examples. See: https://github.com/Kungbib/wikidatalab/ )

@Tpt
Author

Tpt commented Sep 12, 2023

@pfps Thank you!

One question is how much of Wikidata to include. As you say, there are ranks but there are also no-values and some-values.

I believe some/no values do not affect the expected semantics with respect to RDF-star, unlike ranks. Indeed, some-value can be written << wd:s wd:p _:my_blank_node >> and no-value << wd:s wd:p wdno:p >> with wdno:p defined elsewhere as:

 wdno:p a owl:Class ;
    owl:complementOf [ a owl:Restriction ; owl:onProperty wdt:p ; owl:someValuesFrom owl:Thing ] .
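As a sketch of the mapping just described, the object term of the quoted triple could be chosen like this in Python. The function name `object_term` and the blank node label scheme are hypothetical; the kind names "value", "somevalue" and "novalue" are used here only as labels for the three cases.

```python
import itertools

_blank_ids = itertools.count()


def object_term(prop, kind, value=None):
    """Pick the object term for a main snak of the given kind."""
    if kind == "value":
        return value
    if kind == "somevalue":
        # Unknown value: a fresh blank node per statement.
        return "_:b{}".format(next(_blank_ids))
    if kind == "novalue":
        # No value: the per-property wdno: class.
        return "wdno:" + prop
    raise ValueError("unknown snak kind: " + kind)


print(object_term("p", "novalue"))
```

Either way the result is an ordinary RDF term in object position, so the quoted triple itself needs no special treatment.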

My initial use case was missing the multiple statements on the same base triple problem. I have updated the "How you want quoted triples to behave in your use case" section to reflect it.

@Tpt
Author

Tpt commented Sep 12, 2023

@niklasl

This is a good case, similarly considered in the Wikidata example of Detailed Provenance in Cooperative Union Cataloguing .

Thank you! I wanted to cover Wikidata/Wikibase as a standalone use case to be able to fully describe its needs.

I don't think opacity is necessary though? That the triple is unasserted in the graph doesn't necessarily preclude that its suggested sense is invisible to entailment? But I presume the question here is what would the common intersection of semantic expectations by Wikidata editors and consumers be?

Sorry, I got confused by the naming and thought that "referential transparency" was about the standalone assertion of the quoted triple. Indeed, referential transparency seems to work well with Wikidata (and probably better than opacity). I have updated the use case description.

(Note: I've experimented a bit with making Wikidata RDF "more readable" for these kinds of illustrative examples. See: https://github.com/Kungbib/wikidatalab/ )

Nice!

@pfps
Contributor

pfps commented Sep 12, 2023

I created a wiki page to hold a clean version of this use case. See https://github.com/w3c/rdf-ucr/wiki/RDF%E2%80%90star-for-Wikidata


3 participants