Provide sufficient information so that a member of the working group's Use Case Task Force can contact you and enhance your description so that it can be used by the working group to guide their activities. You do not have to fill out all the information requested.
As an aggregator of RDF information, I want to have a predictable number of triples when parsing triples where literals may vary only in the case of the language tag element. I would also like the serialized (possibly canonicalized) form to use the BCP47 formatting recommendations, so that the language tag en-us might canonically be represented as en-US.
[ISO639-1] recommends that language codes be written in lowercase ('mn' Mongolian).
[ISO15924] recommends that script codes use lowercase with the initial letter capitalized ('Cyrl' Cyrillic).
[ISO3166-1] recommends that country codes be capitalized ('MN' Mongolia).
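The three conventions above can be applied without consulting the subtag registry. The following is a minimal Python sketch, not a validating BCP47 implementation: the function name `format_language_tag` is hypothetical, and the sketch assumes that every subtag following a singleton (such as `x`) stays lowercase, which is one reading of the BCP47 formatting rule.

```python
def format_language_tag(tag: str) -> str:
    """Apply BCP47-style casing conventions to a language tag.

    Assumption: the tag is already well-formed; no validation is done.
    """
    subtags = tag.lower().split("-")
    formatted = []
    after_singleton = False
    for index, subtag in enumerate(subtags):
        # Only subtags that are not first and not after a singleton change case.
        if index > 0 and not after_singleton:
            if len(subtag) == 2:
                # Two-letter subtag: treated as a region code, e.g. "US".
                subtag = subtag.upper()
            elif len(subtag) == 4:
                # Four-letter subtag: treated as a script code, e.g. "Cyrl".
                subtag = subtag.capitalize()
        if len(subtag) == 1:
            # Singleton (extension or private use) seen; the rest stays lowercase.
            after_singleton = True
        formatted.append(subtag)
    return "-".join(formatted)


print(format_language_tag("en-us"))
print(format_language_tag("MN-cYRL-mn"))
```

Under these assumptions, both `en-us` and `en-US` map to the same string, so the function could serve as the normalization step of an aggregator.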
When aggregating data, input can be combined from different documents that use different conventions for formatting language tags, leading to potential duplication of data.
*** What you want to be able to do:
When parsing a document that may be composed of several overlapping triples, I would like the resulting graph to have a unique abstract representation for otherwise equal language tags. As it is, the following Turtle can generate either one or two triples in the abstract representation, depending on whether the implementation chooses to normalize language tags, e.g., to lower case.
_:a rdf:value "foo"@en-us, "foo"@en-US .
Implementations that normalize language tags will produce a single triple; those that do not will produce two triples.
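The difference can be illustrated with a small Python sketch that models a graph as a set of tuples. This is not any particular RDF library's behavior: `build_graph` and the tuple layout are hypothetical, chosen only to show how the triple count depends on whether tags are normalized before they enter the set.

```python
from typing import Callable, Iterable, Optional, Set, Tuple

# Hypothetical model: a triple is (subject, predicate, (lexical form, language tag)).
Literal = Tuple[str, str]
Triple = Tuple[str, str, Literal]


def build_graph(
    triples: Iterable[Triple],
    normalize: Optional[Callable[[str], str]] = None,
) -> Set[Triple]:
    """Collect triples into a set, optionally normalizing language tags first."""
    graph: Set[Triple] = set()
    for subject, predicate, (lexical, lang) in triples:
        if normalize is not None:
            lang = normalize(lang)
        graph.add((subject, predicate, (lexical, lang)))
    return graph


# The two triples from the Turtle example above.
parsed = [
    ("_:a", "rdf:value", ("foo", "en-us")),
    ("_:a", "rdf:value", ("foo", "en-US")),
]

print(len(build_graph(parsed)))             # no normalization
print(len(build_graph(parsed, str.lower)))  # tags lowercased on input
```

Without a normalizer the set holds two distinct triples; with `str.lower` the two literals compare equal and only one triple remains.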
*** What is the role of RDF-star quoted triples in your use case:
Not related to quoted triples.
*** Why it is hard or impossible to do what you want to do without quoted triples:
Not related to quoted triples.
*** How you want quoted triples to behave in your use case:
(For example, do you want the precise syntax of subjects, predicates, and objects in quoted triples to be important?)
From the start, RDF should have mandated a normalized form for language tags in literals, ideally based on BCP47 formatting. It would also be acceptable if all parsers normalized language tags to lower case for the abstract representation. Concrete syntaxes which can perform canonicalization could then require a particular form for language tags without danger of potentially serializing different graphs, depending on how they were parsed on input.
*** An example RDF graph that shows part of your use case:
_:a rdf:value "foo"@en-us, "foo"@en-US .
If RDF were changed to require normalizing language tags to lower case, this would be the same as the following:
_:a rdf:value "foo"@en-us .
N-Triples/N-Quads canonicalization could then either represent using that lower case form, or use BCP47 formatting.
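As a rough illustration of that last point, the Python sketch below serializes language-tagged literals as N-Triples lines with a pluggable tag formatter. The names `canonical_lines` and `format_tag` are hypothetical, and the sketch deliberately omits string escaping and blank node relabeling, both of which a real canonicalization algorithm would need.

```python
from typing import Callable, Iterable, List, Tuple

RDF_VALUE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#value"

# Hypothetical input shape: (subject, predicate IRI, lexical form, language tag).
TaggedTriple = Tuple[str, str, str, str]


def canonical_lines(
    triples: Iterable[TaggedTriple],
    format_tag: Callable[[str], str] = str.lower,
) -> List[str]:
    """Serialize triples as sorted, de-duplicated N-Triples lines.

    Assumption: lexical forms need no escaping and subjects are already
    in their final form (blank node labels are not canonicalized here).
    """
    lines = set()
    for subject, predicate, lexical, lang in triples:
        lines.add(f'{subject} <{predicate}> "{lexical}"@{format_tag(lang)} .')
    return sorted(lines)


triples = [
    ("_:a", RDF_VALUE, "foo", "en-us"),
    ("_:a", RDF_VALUE, "foo", "en-US"),
]

for line in canonical_lines(triples):
    print(line)
```

Passing a BCP47-style formatter as `format_tag` instead of `str.lower` would change only the spelling of the tag in the output, not the number of lines.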
This use case for RDF 1.2 places constraints on how language tags are handled. As it doesn't have implications for the RDF-star semantics it can be just tracked here, without creating a wiki page for it.