@hydrofoil/shape-to-query Explanations

Living Document

This version:
https://shape-to-query.hypermedia.app/explanations/
Issue Tracking:
GitHub
Editor:
Tomasz Pluskiewicz

Abstract

In-depth discussion of the shape-to-query library

1. Explanations

1.1. Constraints vs filters

1.1.1. Semantics of constraints

SHACL constraints have an important, and possibly counter-intuitive, quality: when applied, they eliminate entire focus nodes from the result.

For example, consider the shape and query below which use sh:languageIn/FILTER LANG to constrain a property.

Does this find organizations and their products with English names?
PREFIX ex: <http://example.org/>
PREFIX schema: <http://schema.org/>
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX sparql: <http://datashapes.org/sparql#>

[
  a sh:NodeShape ;
  sh:targetClass schema:Organization ;
  sh:property
    [
      sh:path schema:produces ;
      sh:node
        [
          sh:property
            [
              sh:path schema:name ;
              sh:languageIn ( "en" ) ;
            ]
        ] ;
    ] ;
] .

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX schema: <http://schema.org/>
CONSTRUCT {
  ?resource1 rdf:type schema:Organization.
  ?resource1 schema:produces ?resource2.
  ?resource2 schema:name ?resource3.
}
WHERE {
  {
    SELECT ?resource1 ?resource2 WHERE {
      ?resource1 rdf:type schema:Organization;
        schema:produces ?resource2.
      {
        ?resource2 schema:name ?resource3.
        FILTER((LANG(?resource3)) = "en")
      }
    }
  }
  UNION
  {
    SELECT ?resource2 ?resource3 WHERE {
      ?resource1 rdf:type schema:Organization;
        schema:produces ?resource2.
      ?resource2 schema:name ?resource3.
      {
        ?resource2 schema:name ?resource3.
        FILTER((LANG(?resource3)) = "en")
      }
    }
  }
}

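You will note that the sh:languageIn constraint is translated to a FILTER which appears in every branch of the UNION, including the branch which selects the organisations and their products. Thus, it is applied to the entire result and not only to the schema:name values. In practice, this means that if an organisation does not produce any products, or its products do not have English names, it will not be included in the result. This is likely not the desired effect.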

This limitation is a consequence of how @hydrofoil/shape-to-query chooses to interpret the SHACL semantics. A schema:Organization resource will not satisfy the constraints if any of its products lack an English name. You can see that in this SHACL Playground link.

Note: The library deviates from the spec by being too strict, in that it also eliminates organisations whose products do not have any names at all, but that is a different subject.
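
The effect can be traced on a tiny data set. The following is a minimal Python sketch, not the library's implementation: it models triples as plain tuples and evaluates by hand what the CONSTRUCT query above does. The resources ex:acme, ex:globex and ex:initech, their products, and the function names are hypothetical and exist only for illustration.

```python
# Language-tagged literals are modelled as (text, language_tag) pairs.
# All ex: resources below are hypothetical sample data.
ORG, TYPE = "schema:Organization", "rdf:type"
PRODUCES, NAME = "schema:produces", "schema:name"

DATA = {
    ("ex:acme", TYPE, ORG),
    ("ex:acme", PRODUCES, "ex:widget"),
    ("ex:widget", NAME, ("Widget", "en")),
    ("ex:widget", NAME, ("Apparat", "de")),
    ("ex:globex", TYPE, ORG),
    ("ex:globex", PRODUCES, "ex:gadget"),
    ("ex:gadget", NAME, ("Spielerei", "de")),  # no English name
    ("ex:initech", TYPE, ORG),                 # no products at all
}


def objects(graph, subject, predicate):
    """Return all objects of triples matching subject and predicate."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def construct_with_constraint(graph):
    """Mimic the query above: the language FILTER sits in both UNION branches."""
    result = set()
    organizations = {s for s, p, o in graph if p == TYPE and o == ORG}
    for org in organizations:
        for product in objects(graph, org, PRODUCES):
            english = {
                n for n in objects(graph, product, NAME)
                if isinstance(n, tuple) and n[1] == "en"
            }
            if not english:
                # The FILTER fails, so neither branch yields a solution
                # for this organisation/product pair.
                continue
            result.add((org, TYPE, ORG))
            result.add((org, PRODUCES, product))
            result |= {(product, NAME, n) for n in english}
    return result


for triple in sorted(construct_with_constraint(DATA), key=str):
    print(triple)
```

Under these assumptions only ex:acme, its product and the English name survive. ex:globex disappears together with its product, even though only the name was meant to be restricted, and ex:initech is never matched because it has no schema:produces value.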

1.1.2. Filter to the rescue

To filter only the values of the schema:name property of individual products, without eliminating the products or organisations themselves, a Property Value Rule must be combined with a Filter Expression.

Filtering only the values of products' names
PREFIX ex: <http://example.org/>
PREFIX schema: <http://schema.org/>
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX sparql: <http://datashapes.org/sparql#>

[
  a sh:NodeShape ;
  sh:targetClass schema:Organization ;
  sh:property
    [
      sh:path schema:produces ;
      sh:node
        [
          sh:property
            [
              # Populate the schema:name property
              sh:path schema:name ;
              sh:values
                [
                  # by selecting values of schema:name
                  sh:nodes [ sh:path schema:name ] ;
                  # but only those which are tagged as English
                  sh:filterShape
                    [
                      sh:languageIn ( "en" ) ;
                    ] ;
                ]
            ]
        ] ;
    ] ;
] .

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX schema: <http://schema.org/>
CONSTRUCT {
  ?resource1 rdf:type schema:Organization.
  ?resource1 schema:produces ?resource2.
  ?resource2 schema:name ?resource3.
}
WHERE {
  {
    SELECT ?resource1 ?resource2 WHERE {
      ?resource1 rdf:type schema:Organization;
        schema:produces ?resource2.
    }
  }
  UNION
  {
    SELECT ?resource2 ?resource3 WHERE {
      ?resource1 rdf:type schema:Organization;
        schema:produces ?resource2.
      ?resource2 schema:name ?resource3.
      FILTER((LANG(?resource3)) = "en")
    }
  }
}

Note: Property Value Rules are typically used for projections, where a new property is computed from the inner expression. Here, however, you will see that the property schema:name is used twice. The computation filters the schema:name objects, which are then "projected" onto the same property.
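
To make the difference concrete, here is a minimal Python sketch which evaluates the filter-based query above by hand over a small in-memory graph. It is an illustration under stated assumptions, not the library's implementation: triples are plain tuples, and the ex: resources and function names are hypothetical.

```python
# Language-tagged literals are modelled as (text, language_tag) pairs.
# All ex: resources below are hypothetical sample data.
ORG, TYPE = "schema:Organization", "rdf:type"
PRODUCES, NAME = "schema:produces", "schema:name"

DATA = {
    ("ex:acme", TYPE, ORG),
    ("ex:acme", PRODUCES, "ex:widget"),
    ("ex:widget", NAME, ("Widget", "en")),
    ("ex:widget", NAME, ("Apparat", "de")),
    ("ex:globex", TYPE, ORG),
    ("ex:globex", PRODUCES, "ex:gadget"),
    ("ex:gadget", NAME, ("Spielerei", "de")),  # no English name
    ("ex:initech", TYPE, ORG),                 # no products at all
}


def objects(graph, subject, predicate):
    """Return all objects of triples matching subject and predicate."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def construct_with_filter(graph):
    """Mimic the query above: the FILTER sits only in the schema:name branch."""
    result = set()
    organizations = {s for s, p, o in graph if p == TYPE and o == ORG}
    for org in organizations:
        for product in objects(graph, org, PRODUCES):
            # First UNION branch: no language test, the pair is always kept.
            result.add((org, TYPE, ORG))
            result.add((org, PRODUCES, product))
            # Second UNION branch: only English names are projected.
            for name in objects(graph, product, NAME):
                if isinstance(name, tuple) and name[1] == "en":
                    result.add((product, NAME, name))
    return result


for triple in sorted(construct_with_filter(DATA), key=str):
    print(triple)
```

With this data, ex:globex and ex:gadget are now part of the result, only without a schema:name value, while the German name of ex:widget is still left out. Reading the query as shown, ex:initech remains absent, because both branches still require a schema:produces value.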

Conformance

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

References

Normative References

[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://datatracker.ietf.org/doc/html/rfc2119
[S2Q-DOCS]
@hydrofoil/shape-to-query. URL: ../docs