Sensemaking AI · Sensemaking Semantic Web
Submodule 3.3 · Reasoning ★

SHACL and vocabulary alignment.

Closed-world validation in an open-world graph. owl:sameAs vs skos:exactMatch. Formal skill inference applied to resume data — and where it beats (and loses to) LLM reasoning.

Module 3 · Weeks 7–9 | Primary project B: Exercise 3.4 | Requires Fuseki + resume-001.ttl | SHACL · alignment · skill inference

Going deeper than the Module 2 introduction.

Submodule 2.3 introduced SHACL as the closed-world answer to OWL's open-world assumption. This submodule goes deeper: the full constraint component vocabulary, SPARQL-based custom constraints, severity levels, and how SHACL fits into a production data pipeline.

The SHACL data model

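SHACL defines two kinds of shapes: node shapes (sh:NodeShape), which select focus nodes and constrain them directly, and property shapes (sh:PropertyShape), which constrain the values reached from a focus node along sh:path. The example below is a node shape that contains two property shapes.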

# A NodeShape targeting all naruto:Ninja nodes
naruto:NinjaShape
    a sh:NodeShape ;
    sh:targetClass naruto:Ninja ;
    sh:property [
        sh:path     naruto:canonicalName ;
        sh:datatype xsd:string ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
        sh:message  "A Ninja must have exactly one canonicalName."@en ;
        sh:severity sh:Violation
    ] ;
    sh:property [
        sh:path     naruto:memberOfVillage ;
        sh:class    naruto:Village ;
        sh:minCount 1 ;
        sh:message  "A Ninja must belong to at least one Village."@en
    ] .

Key constraint components

Component                 | Constrains                | Example
sh:minCount / sh:maxCount | Number of values          | Exactly one canonicalName
sh:datatype               | Literal type              | canonicalName must be xsd:string
sh:class                  | Object type               | memberOfVillage value must be a naruto:Village
sh:pattern                | Regex on literals         | villageCode matches "[A-Z]{2}"
sh:in                     | Allowed values            | confidence in {canonical, disputed, fan-theory}
sh:hasValue               | Required specific value   | a node must be of rdf:type naruto:Ninja
sh:node                   | Reference another shape   | employer must conform to OrganizationShape
sh:sparql                 | Custom SPARQL constraint  | No circular senseiOf chains
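One row of the table deserves a hands-on check: sh:pattern follows SPARQL REGEX semantics, so the pattern may match anywhere in the value unless it is anchored. The sketch below uses Python's re module as a stand-in for a SHACL engine (an assumption; the two regex dialects agree on this simple pattern), and the villageCode values are made up for illustration.

```python
import re

def shacl_pattern_conforms(value: str, pattern: str) -> bool:
    """Mimic sh:pattern: the regex may match anywhere in the value (like SPARQL REGEX)."""
    return re.search(pattern, value) is not None

# Hypothetical villageCode values, for illustration only.
codes = ["KO", "SU", "konoha-KO-01", "ko"]

unanchored = "[A-Z]{2}"    # the pattern as written in the table
anchored = "^[A-Z]{2}$"    # anchored variant: the whole value must be two capitals

loose = [c for c in codes if shacl_pattern_conforms(c, unanchored)]
strict = [c for c in codes if shacl_pattern_conforms(c, anchored)]

print(loose)   # values accepted by the unanchored pattern
print(strict)  # values accepted by the anchored pattern
```

If the intent is "exactly two capital letters", the anchored form is probably the one to put in the shape.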

Severity levels

SHACL has three built-in severity levels, attached with sh:severity: sh:Info, sh:Warning, and sh:Violation. When sh:severity is omitted, the default is sh:Violation.

Use severity to express policy rather than binary pass/fail. A missing schema:url on an organization might be a Warning; a missing canonicalName on a Ninja is a Violation. Both constraints live in the same shape file.
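In a pipeline, the severity policy usually has to be applied by your own code: per the SHACL spec, a validation report conforms only when it contains no results at all, regardless of their severity. The sketch below shows one way to gate a pipeline on severity. The result tuples and the blocks_pipeline helper are hypothetical; a real engine returns sh:ValidationResult nodes that you would map into this form.

```python
from collections import Counter

# Hypothetical validation results as (focus node, severity, message) tuples.
results = [
    ("naruto:Naruto", "sh:Violation", "A Ninja must have exactly one canonicalName."),
    ("org:Acme", "sh:Warning", "Organization has no schema:url."),
    ("org:Initech", "sh:Info", "Consider adding a description."),
]

SEVERITY_RANK = {"sh:Info": 0, "sh:Warning": 1, "sh:Violation": 2}

def blocks_pipeline(results, threshold="sh:Violation"):
    """Return True if any result is at or above the blocking threshold."""
    limit = SEVERITY_RANK[threshold]
    return any(SEVERITY_RANK[severity] >= limit for _, severity, _ in results)

counts = Counter(severity for _, severity, _ in results)
print(dict(counts))                  # how many results per severity level
print(blocks_pipeline(results))      # the Violation blocks the load
print(blocks_pipeline(results[1:]))  # Warning and Info alone do not
```

Lowering the threshold to "sh:Warning" turns the same shape file into a stricter gate without editing any shapes.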

SPARQL-based constraints

For constraints that basic SHACL components cannot express, sh:sparql allows embedding a SPARQL query that returns violation results.

# Custom constraint: no ninja should be their own sensei
naruto:NinjaShape sh:sparql [
    a sh:SPARQLConstraint ;
    sh:message "A Ninja cannot be their own sensei."@en ;
    sh:severity sh:Violation ;
    sh:select """
        PREFIX naruto: <https://sensemaking-ai.com/ns/naruto#>
        SELECT $this WHERE {
            $this naruto:senseiOf $this .
        }
    """ ;
] .
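The query above only catches a ninja listed as their own sensei. Longer cycles (A teaches B, B teaches A) need a transitive check; in SPARQL that would likely be a property path such as naruto:senseiOf+ in place of the single step. The Python sketch below illustrates the difference on made-up edges. It is a plain graph search, not a SHACL engine.

```python
def nodes_on_cycles(edges):
    """Return the set of nodes that can reach themselves via one or more edges."""
    graph = {}
    for teacher, student in edges:
        graph.setdefault(teacher, set()).add(student)

    def reaches_itself(start):
        seen, stack = set(), list(graph.get(start, ()))
        while stack:
            node = stack.pop()
            if node == start:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(graph.get(node, ()))
        return False

    return {node for node in graph if reaches_itself(node)}

# Hypothetical senseiOf triples as (teacher, student) pairs.
sensei_of = [
    ("A", "A"),   # self-loop: caught by the single-step check
    ("B", "C"),
    ("C", "B"),   # two-step cycle: missed by the single-step check
    ("D", "E"),   # acyclic
]

self_loops = {teacher for teacher, student in sensei_of if teacher == student}
print(sorted(self_loops))
print(sorted(nodes_on_cycles(sensei_of)))
```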

SHACL vs OWL restrictions: choosing between them

Both SHACL and OWL can say "a Ninja must have a Village." They mean different things. OWL's existential restriction says "if this is a Ninja, there must exist some Village it belongs to — possibly unasserted." SHACL's minCount says "if this is a Ninja and there is no asserted Village triple, this is a validation error." Use OWL for reasoning (what can be inferred). Use SHACL for data quality (what must be present). Both can coexist in the same system serving different purposes.
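The closed-world side of that contrast can be sketched in a few lines. The function below counts only asserted triples, which is how sh:minCount behaves; an OWL reasoner given the same data would not report an error but would conclude that an unnamed Village exists. The triples and the min_count_violations helper are illustrative assumptions, not part of any library.

```python
# Toy graph of asserted triples; names are illustrative.
triples = {
    ("naruto:Naruto", "rdf:type", "naruto:Ninja"),
    ("naruto:Naruto", "naruto:memberOfVillage", "naruto:Konoha"),
    ("naruto:Sasuke", "rdf:type", "naruto:Ninja"),   # no village asserted
}

def min_count_violations(triples, target_class, path, min_count=1):
    """Closed-world check: count only asserted values, as sh:minCount does."""
    focus_nodes = {s for s, p, o in triples if p == "rdf:type" and o == target_class}
    violations = []
    for node in sorted(focus_nodes):
        values = {o for s, p, o in triples if s == node and p == path}
        if len(values) < min_count:
            violations.append(node)
    return violations

print(min_count_violations(triples, "naruto:Ninja", "naruto:memberOfVillage"))
```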

Not all equivalences are equal.

When connecting your graph to an external vocabulary or dataset, the equivalence predicate you choose has consequences that compound through reasoning. The wrong choice is usually owl:sameAs applied where skos:exactMatch should go.

Predicate | Asserts | Reasoner consequence | Use when
owl:sameAs | These two IRIs refer to the same individual (full identity) | All properties from both nodes merge. Every triple about A is also about B and vice versa. | You actually intend full property merging. Rare in practice; usually too strong.
skos:exactMatch | These two concepts are semantically equivalent in their respective vocabularies | No property merging. Linked for vocabulary alignment purposes only. | Linking a local skill concept to an ESCO concept. Linking a Wikidata entity to your ontology individual.
skos:closeMatch | These two concepts are similar but not identical | No property merging. Weaker alignment signal. | A local "data engineering" skill is close to but not exactly ESCO's "apply data engineering methods."
skos:broadMatch | The local concept is narrower than the external concept | No property merging. Hierarchical signal for cross-vocabulary navigation. | A specific "PyTorch proficiency" skill broadly matches ESCO's "machine learning" concept.
owl:equivalentClass | These two classes have exactly the same extension | Instances of one class are inferred to be instances of the other. | Your naruto:Ninja class is declared equivalent to an external ontology's Fighter class (intentional class merging).

The owl:sameAs hammer in practice

Asserting :BarbaraHidalgo owl:sameAs wikidata:Q12345678 tells a reasoner that the two IRIs denote one individual, so every Wikidata statement loaded into the graph (birth date, nationality, Wikipedia categories, external identifiers, everything) becomes a statement about your local node. The merge runs both ways: if your graph has a sensemaking:personalBlog property, it now applies to the Wikidata entity too. This is almost never what you want. Use skos:exactMatch instead: it records the alignment without triggering property merging. Reserve owl:sameAs for cases where you genuinely need two IRIs to behave identically under all reasoning.
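To make the merging concrete, here is a minimal sketch that copies statements across owl:sameAs links in an in-memory set of triples. It is a simplification under stated assumptions: it treats owl:sameAs as symmetric but skips transitivity, the same_as_closure helper is hypothetical, and the birth date value is made up.

```python
def same_as_closure(triples):
    """Copy statements across owl:sameAs links (subject and object positions)."""
    # Treat owl:sameAs as symmetric; this sketch skips transitivity for brevity.
    aliases = {}
    for s, p, o in triples:
        if p == "owl:sameAs":
            aliases.setdefault(s, set()).add(o)
            aliases.setdefault(o, set()).add(s)
    inferred = set(triples)
    for s, p, o in triples:
        if p == "owl:sameAs":
            continue
        for s2 in aliases.get(s, set()) | {s}:
            for o2 in aliases.get(o, set()) | {o}:
                inferred.add((s2, p, o2))
    return inferred

base = {
    (":BarbaraHidalgo", "sensemaking:personalBlog", "https://example.org/blog"),
    ("wikidata:Q12345678", "wdt:P569", "1980-01-01"),  # made-up birth date
}

with_same_as = same_as_closure(
    base | {(":BarbaraHidalgo", "owl:sameAs", "wikidata:Q12345678")})
with_exact_match = same_as_closure(
    base | {(":BarbaraHidalgo", "skos:exactMatch", "wikidata:Q12345678")})

print(len(with_same_as) - len(base) - 1)      # statements added by the merge
print(len(with_exact_match) - len(base) - 1)  # statements added by exactMatch
```

With owl:sameAs the blog lands on the Wikidata IRI and the birth date lands on the local node; with skos:exactMatch the graph gains only the link itself.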

What formal rules can surface that a resume doesn't state.

The Resume Graph Explorer contains explicitly asserted skills — what someone listed on their resume. Formal inference rules can surface skills that are implied by the combination of asserted skills and ESCO's vocabulary structure, or by the duration and nature of work experience.

The inference rules (from the Module 3 README)

# Rule 1: ESCO-related exposure
# If a person has skill X and X is skos:related to Y in ESCO,
# infer sensemaking:hasInferredSkill ?person ?y at "exposure" level.

CONSTRUCT {
  ?person sensemaking:hasInferredSkill ?relatedConcept .
  _:basis a sensemaking:InferenceBasis ;
          sensemaking:inferenceRule "esco-related-exposure" ;
          sensemaking:exposureLevel "exposure" .
}
WHERE {
  ?person sensemaking:hasSkill ?skill .
  ?skill skos:exactMatch/skos:related ?relatedConcept .
  FILTER NOT EXISTS {
    ?person sensemaking:hasSkill ?s2 .
    ?s2 skos:exactMatch ?relatedConcept .
  }
}

# Rule 2: Long-tenure promotion
# If a person held a role for 2+ years AND has a skill at "advanced",
# escalate skos:related concepts from "exposure" to "demonstrated".

CONSTRUCT {
  ?person sensemaking:hasInferredSkill ?relatedConcept .
  _:basis a sensemaking:InferenceBasis ;
          sensemaking:inferenceRule "long-tenure-promotion" ;
          sensemaking:exposureLevel "demonstrated" .
}
WHERE {
  ?person sensemaking:hasEmployment ?emp ;
          sensemaking:hasSkill ?skill .
  ?skill skos:exactMatch/skos:related ?relatedConcept ;
         sensemaking:skillLevel "advanced" .
  ?emp sensemaking:startDate ?start ;
       sensemaking:endDate   ?end .
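  # Date subtraction is an engine extension, not core SPARQL 1.1 (Jena ARQ, which
  # powers Fuseki, supports it). It yields a day-based duration, so compare in days:
  # 730 days is roughly two years.
  FILTER ((?end - ?start) >= "P730D"^^xsd:dayTimeDuration)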
  FILTER NOT EXISTS {
    ?person sensemaking:hasSkill ?s2 .
    ?s2 skos:exactMatch ?relatedConcept .
  }
}
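
Rule 2 depends on comparing a date difference with a two-year threshold, and "two years" is not a fixed number of days. The sketch below, using hypothetical employment dates, shows a case where a 730-day threshold and a calendar-anniversary check disagree. Which one to encode is a design decision for your rule set; neither is presented here as the README's intent.

```python
from datetime import date

def tenure_days(start: date, end: date) -> int:
    """Length of a role in days, the quantity a date subtraction produces."""
    return (end - start).days

def held_two_calendar_years(start: date, end: date) -> bool:
    """Calendar-aware check: has the second anniversary of the start date passed?"""
    try:
        anniversary = start.replace(year=start.year + 2)
    except ValueError:  # start was Feb 29 and the target year is not a leap year
        anniversary = date(start.year + 2, 3, 1)
    return end >= anniversary

# Hypothetical employment dates spanning the 2020 leap day.
start, end = date(2019, 3, 1), date(2021, 2, 28)

print(tenure_days(start, end))              # day count between the two dates
print(tenure_days(start, end) >= 730)       # day-threshold verdict
print(held_two_calendar_years(start, end))  # calendar-anniversary verdict
```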

What to load for the query lab

Load both modules/01-foundations/artifacts/resume-graph/ttl/resume-001.ttl (the Alex Rivera resume) and modules/03-reasoning/artifacts/skill-inference/skill-inference-starter.ttl (the ESCO-related stubs and inference property declarations) into the same Fuseki dataset before running the query lab. The queries below apply the inference rules to Alex's resume data.
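
If you prefer to script the load from a notebook, something like the following should work against Fuseki's Graph Store Protocol endpoint. This is a sketch under assumptions: the dataset name "resumes" is hypothetical, the dataset must already exist, and the endpoint path assumes Fuseki's default service layout. The upload call is commented out because it needs a running server.

```python
import urllib.request

FUSEKI = "http://localhost:3030"
DATASET = "resumes"  # hypothetical dataset name; use whatever you created in Fuseki

def build_upload_request(turtle_bytes: bytes, dataset: str = DATASET) -> urllib.request.Request:
    """Build (but do not send) a Graph Store Protocol POST into the default graph."""
    return urllib.request.Request(
        url=f"{FUSEKI}/{dataset}/data?default",
        data=turtle_bytes,
        headers={"Content-Type": "text/turtle; charset=utf-8"},
        method="POST",
    )

def load_files(paths, dataset: str = DATASET):
    """POST each Turtle file; POST appends, so both files end up in the same graph."""
    for path in paths:
        with open(path, "rb") as handle:
            request = build_upload_request(handle.read(), dataset)
        with urllib.request.urlopen(request) as response:
            print(path, response.status)

# Example call, commented out because it needs a running Fuseki:
# load_files([
#     "modules/01-foundations/artifacts/resume-graph/ttl/resume-001.ttl",
#     "modules/03-reasoning/artifacts/skill-inference/skill-inference-starter.ttl",
# ])
```

POST is used rather than PUT because PUT would replace the graph, and the second file would overwrite the first.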

Five queries — validation, alignment, and inference.

q01
What SHACL violations would Alex's resume have against a PersonShape, the resume-graph equivalent of NinjaShape?
Pattern: SHACL validation simulation via SPARQL · checking minCount constraints manually
PREFIX foaf:        <http://xmlns.com/foaf/0.1/>
PREFIX dcterms:     <http://purl.org/dc/terms/>
PREFIX sensemaking: <https://sensemaking-ai.com/ns/>

# Simulate a PersonShape validation:
# - Every foaf:Person must have foaf:name (minCount 1)
# - Every sensemaking:Resume must have dcterms:created (minCount 1)
# Find nodes that violate either constraint.
# Both UNION branches sit inside the WHERE group and bind the same ?node
# variable, so persons and resumes land in one result column.

SELECT ?node ?violation WHERE {
  {
    ?node a foaf:Person .
    FILTER NOT EXISTS { ?node foaf:name ?name . }
    BIND("missing foaf:name" AS ?violation)
  }
  UNION
  {
    ?node a sensemaking:Resume .
    FILTER NOT EXISTS { ?node dcterms:created ?date . }
    BIND("missing dcterms:created" AS ?violation)
  }
}
ORDER BY ?node

q02
Which resume skills have exactMatch to ESCO, and which have skos:related links available?
Pattern: vocabulary alignment audit · exactMatch vs related · inference readiness check
PREFIX foaf:        <http://xmlns.com/foaf/0.1/>
PREFIX skos:        <http://www.w3.org/2004/02/skos/core#>
PREFIX sensemaking: <https://sensemaking-ai.com/ns/>

SELECT ?skillLabel ?hasExactMatch ?relatedCount WHERE {
  ?person a foaf:Person ;
          sensemaking:hasSkill ?skill .
  ?skill skos:prefLabel ?skillLabel .
  FILTER (LANG(?skillLabel) = "en")
  BIND(EXISTS { ?skill skos:exactMatch ?esco . } AS ?hasExactMatch)
  {
    # Anchor ?skill inside the subquery: subqueries are evaluated on their own,
    # so an OPTIONAL by itself would drop skills that have no related links.
    SELECT ?skill (COUNT(DISTINCT ?related) AS ?relatedCount) WHERE {
      ?holder sensemaking:hasSkill ?skill .
      OPTIONAL { ?skill skos:exactMatch/skos:related ?related . }
    } GROUP BY ?skill
  }
}
ORDER BY DESC(?relatedCount)

q03
CONSTRUCT: apply Rule 1 — infer skills from ESCO related concepts.
Pattern: CONSTRUCT as inference rule · skos:exactMatch/skos:related path · FILTER NOT EXISTS for new-only
PREFIX foaf:        <http://xmlns.com/foaf/0.1/>
PREFIX skos:        <http://www.w3.org/2004/02/skos/core#>
PREFIX sensemaking: <https://sensemaking-ai.com/ns/>

CONSTRUCT {
  ?person sensemaking:hasInferredSkill ?relatedConcept .
}
WHERE {
  ?person a foaf:Person ;
          sensemaking:hasSkill ?skill .
  ?skill skos:exactMatch/skos:related ?relatedConcept .
  FILTER NOT EXISTS {
    ?person sensemaking:hasSkill ?s2 .
    ?s2 skos:exactMatch ?relatedConcept .
  }
}
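
For the notebook in Exercise 3.4 it can help to mirror Rule 1 in plain Python so each inferred skill carries its evidence. The sketch below is one possible shape, not the README's implementation: it assumes each local skill has at most one exact match, stores skos:related as directed lookups, and uses placeholder esco: identifiers that are not real ESCO IRIs.

```python
def infer_exposure(has_skill, exact_match, related):
    """Rule 1 over plain dicts: asserted skill -> ESCO concept -> related concepts."""
    inferred = {}
    for person, skills in has_skill.items():
        # ESCO concepts the person already covers through an asserted skill.
        covered = {exact_match[s] for s in skills if s in exact_match}
        for skill in sorted(skills):
            concept = exact_match.get(skill)
            for candidate in sorted(related.get(concept, ())):
                if candidate not in covered:
                    # Keep the evidence so every inference stays auditable.
                    inferred.setdefault((person, candidate), []).append((skill, concept))
    return inferred

# Hypothetical data; the esco: identifiers are placeholders, not real ESCO IRIs.
has_skill = {":alex": {":python", ":sql"}}
exact_match = {":python": "esco:python", ":sql": "esco:sql"}
related = {
    "esco:python": {"esco:scripting", "esco:sql"},
    "esco:sql": {"esco:data-modelling"},
}

result = infer_exposure(has_skill, exact_match, related)
for (person, concept), evidence in sorted(result.items()):
    print(person, concept, evidence)
```

Here esco:sql is skipped because Alex already asserts a skill that maps to it, which is the role FILTER NOT EXISTS plays in the query.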

q04
CONSTRUCT: apply Rule 2 — escalate to "demonstrated" for long-tenure roles.
Pattern: date arithmetic in SPARQL · tenure-based inference escalation
PREFIX foaf:        <http://xmlns.com/foaf/0.1/>
PREFIX xsd:         <http://www.w3.org/2001/XMLSchema#>
PREFIX skos:        <http://www.w3.org/2004/02/skos/core#>
PREFIX sensemaking: <https://sensemaking-ai.com/ns/>

# Skills inferred at "demonstrated" level:
# person held a role for 2+ years AND the asserted skill is "advanced"
CONSTRUCT {
  ?person sensemaking:hasDemonstratedSkill ?relatedConcept .
}
WHERE {
  ?person a foaf:Person ;
          sensemaking:hasEmployment ?emp ;
          sensemaking:hasSkill ?skill .
  ?skill skos:exactMatch/skos:related ?relatedConcept ;
         sensemaking:skillLevel "advanced" .
  ?emp sensemaking:startDate ?start .
  OPTIONAL { ?emp sensemaking:endDate ?end . }
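  # "2026-06-03" is a fixed reference date for roles that are still open (no endDate).
  BIND(COALESCE(?end, "2026-06-03"^^xsd:date) AS ?effectiveEnd)
  # Date subtraction (an ARQ extension) yields a day-based duration; 730 days is roughly 2 years.
  FILTER ((?effectiveEnd - ?start) >= "P730D"^^xsd:dayTimeDuration)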
  FILTER NOT EXISTS {
    ?person sensemaking:hasSkill ?s2 .
    ?s2 skos:exactMatch ?relatedConcept .
  }
}

q05
What skills do the formal rules infer — and which of those would an LLM also identify?
Pattern: inference result summary · the comparison query that sets up Exercise 3.4
PREFIX foaf:        <http://xmlns.com/foaf/0.1/>
PREFIX skos:        <http://www.w3.org/2004/02/skos/core#>
PREFIX sensemaking: <https://sensemaking-ai.com/ns/>

# After running q03 and q04 CONSTRUCTs and loading the results:
# Summary of all inferred skills with their labels
SELECT DISTINCT ?skillLabel ?inferenceType WHERE {
  ?person a foaf:Person .
  {
    ?person sensemaking:hasInferredSkill ?concept .
    BIND("exposure" AS ?inferenceType)
  }
  UNION
  {
    ?person sensemaking:hasDemonstratedSkill ?concept .
    BIND("demonstrated" AS ?inferenceType)
  }
  ?concept skos:prefLabel ?skillLabel .
  FILTER (LANG(?skillLabel) = "en")
}
ORDER BY ?inferenceType ?skillLabel

Where each one wins — and where it doesn't.

This comparison is the consulting-positioning artifact the Module 3 README describes as "highest-leverage." The honest version of the comparison — not a promotional piece for either approach — is what makes it valuable.

Property | Formal reasoning (OWL + SPARQL) | LLM inference
Reproducibility | Deterministic: same data + same rules = same result, always. | Non-deterministic: results vary across runs and model versions.
Coverage | Bounded by vocabulary links and rule coverage. Can miss obvious skills not in ESCO. | Broad: draws on training data. Can surface skills the vocabulary doesn't cover.
Explainability | Every inferred skill traces to a specific rule and evidence triple. Fully auditable. | Opaque: the reasoning is distributed across billions of parameters. Not auditable.
Hallucination risk | None: inferences are strictly derived from asserted data. Cannot produce a skill not in the graph. | High: LLMs can confidently assert skills that have no basis in the resume text.
Maintenance cost | High: rules must be written, tested, and updated as the vocabulary evolves. | Low: the model updates handle most improvement without manual rule changes.
Nuance | Low: rules are binary; the condition is met or it is not. Cannot handle "probably" without custom confidence scoring. | High: LLMs handle hedged, contextual, and domain-specific inferences naturally.

The honest consulting position

Neither approach is uniformly better. Formal reasoning wins where reproducibility, auditability, and hallucination risk matter — compliance, legal, medical, regulatory. LLM inference wins where coverage, nuance, and low maintenance matter — exploratory discovery, consumer-facing features, rapid iteration. The most honest answer — and the most valuable consulting position — is knowing which context you are in and which risks your client is willing to accept. Building both and comparing them on real data (Exercise 3.4) is how you earn the right to say that confidently.

The primary project: Resume skill inference.

Exercise 3.4 (primary project B) builds the Jupyter notebook that demonstrates skill inference end-to-end on 5–10 resume profiles and compares formal rules to LLM inference. The query lab above provides the formal rules side. For the LLM side:

The LLM prompt template

For each resume: "Given the following resume data [paste plain-text resume], what skills would you infer that are NOT explicitly listed? List only skills implied by the work experience, not mentioned directly." Run this prompt with the same resume text used in the formal inference. Record both outputs.
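
Keeping the prompt in code makes it easier to guarantee that every profile gets the same instruction. A minimal sketch follows; the build_prompt helper and the sample resume text are hypothetical, and the call to whichever LLM client you use is deliberately left out.

```python
PROMPT_TEMPLATE = (
    "Given the following resume data:\n\n{resume}\n\n"
    "What skills would you infer that are NOT explicitly listed? "
    "List only skills implied by the work experience, not mentioned directly."
)

def build_prompt(resume_text: str) -> str:
    """Fill the fixed template so every profile gets an identical instruction."""
    return PROMPT_TEMPLATE.format(resume=resume_text.strip())

# Hypothetical resume snippet, for illustration only.
sample = "Data Engineer, 2019-2023. Built ETL pipelines in Python and SQL."
print(build_prompt(sample))
```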

The comparison structure

For each profile, create a table with four columns: Skill name · Formal rules found it (yes/no) · LLM found it (yes/no) · Is it actually a reasonable inference? (your judgment). The fourth column is the ground truth — and your judgment IS the data. After 5-10 profiles, patterns emerge: formal rules miss X, LLM hallucinates Y, both agree on Z. That pattern is the blog post.
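
The four-column table is straightforward to build once both outputs are normalized to sets of skill names. The sketch below assumes that normalization has already happened; the skill names and judgments are invented for illustration, and the reasonable mapping is where your own judgment goes.

```python
def comparison_rows(formal, llm, reasonable):
    """One row per skill: (skill, formal found it, LLM found it, judged reasonable)."""
    rows = []
    for skill in sorted(formal | llm):
        rows.append((skill, skill in formal, skill in llm, reasonable.get(skill)))
    return rows

def summarize(rows):
    """Count the agreement patterns across one profile's rows."""
    return {
        "both": sum(1 for _, f, m, _ in rows if f and m),
        "formal_only": sum(1 for _, f, m, _ in rows if f and not m),
        "llm_only": sum(1 for _, f, m, _ in rows if m and not f),
        "llm_only_unreasonable": sum(
            1 for _, f, m, ok in rows if m and not f and ok is False),
    }

# Hypothetical outputs for one profile; the judgments are yours to fill in.
formal = {"data modelling", "scripting"}
llm = {"scripting", "stakeholder management", "kubernetes"}
reasonable = {"data modelling": True, "scripting": True,
              "stakeholder management": True, "kubernetes": False}

rows = comparison_rows(formal, llm, reasonable)
for row in rows:
    print(row)
summary = summarize(rows)
print(summary)
```

Skills with no entry in reasonable show up with None in the fourth column, which makes unjudged rows easy to spot.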

The artifact lives at modules/03-reasoning/artifacts/skill-inference/ as a Jupyter notebook. The Module 3 README says this connects directly to the existing narrative synthesizer work — the comparison result is positioning gold for the consulting practice.

Reading, tools, and next module.

SHACL spec

W3C SHACL

Section 2 (Shapes and Constraints) and Section 4 (Core Constraint Components) are the most relevant. The SPARQL-based constraints (§5) are the advanced feature used in the NinjaShape self-reference example.

SHACL book

Validating RDF Data

Labra Gayo et al. — open access. The most complete treatment of SHACL and ShEx available. Chapters 3–5 cover the constraint model and validation semantics. Keep open as a reference during Exercise 3.4.

Primary reading

Allemang et al. — Ch 13

Chapter 13 covers vocabulary alignment and the sameAs / exactMatch distinction in production contexts. The examples are from life sciences and finance — different domains, same design decisions.

Starter data

skill-inference-starter.ttl

ESCO-related concept stubs, inference property declarations, and the three inference rule templates as comments. Load alongside resume-001.ttl before running the query lab.

Vocabulary alignment

ESCO API

The ESCO portal's Linked Data download and API. For Exercise 3.4, download the ESCO skills taxonomy as RDF and load it into Fuseki to replace the stubs with real concept IRIs and real skos:related links.

Next module

Module 4 — Shipping

SPARQL UPDATE, federation, deployment, and the TwinKit Semantic v2.0 capstone. Module 4 depends on having the Naruto ontology (Module 2) and either reification artifact (3.3) or skill inference (3.4) ready as demo data.