Submodule 3.2 — Reification · Sensemaking Semantic Web

01 · Why reification exists

What do you do when a triple needs a source?

RDF's triple model is elegant: subject, predicate, object. But what happens when you need to say who claims this, in what source, at what confidence? A triple cannot have properties. You cannot write (Itachi, killed, UchihaClan) dcterms:source "manga chapter 401" — that would require the triple itself to be a subject, which plain RDF does not allow.

Reification is the set of patterns for working around this constraint. Each approach represents a different tradeoff between expressiveness, verbosity, query ergonomics, and tooling support. All four exist because none is optimal for all use cases. Choosing the right approach is a design decision — not a fact lookup.

Why the Naruto domain is perfect for this

Most knowledge graph teaching examples use clean, uncontested facts (Obama was born in Hawaii; Paris is the capital of France). The Naruto canon is deliberately messy: Itachi's motivation was presented one way from chapter 131 onward, then retconned in chapter 401, 270 chapters later. Different sources (manga, anime, databooks, movies, fan wikis) assert different and conflicting facts. This is exactly what real provenance-heavy systems deal with: medical literature that contradicts earlier studies, legal records that supersede earlier ones, intelligence assessments that are revised. Naruto is just more fun to work with than clinical trial metadata.

02 · The Itachi problem

The same fact, two sources, one contradiction.

Itachi Uchiha killed his entire clan. That is stated in the Naruto manga. But why he did it differs radically between two sources:

Manga chapter 131 (vol. 15, 2002): Itachi killed the clan out of personal ambition — to test his own power. This is the early narrative. Sasuke believes this for most of the series.
Manga chapter 401 (vol. 43, 2007): Itachi acted under orders from Danzo Shimura and the Konoha leadership, who feared a coup. He killed the clan to protect the village. This is the retcon — revealed 5 years and 270 chapters later.

A naive graph has: naruto:ItachiUchiha naruto:killedClan naruto:UchihaClan. One triple. No source. No motivation. Both narratives would annotate this same triple differently. Without reification, the graph cannot represent the fact that this claim is disputed — or that its meaning changed between 2002 and 2007.

This is not a Naruto-specific problem. It is the general problem of provenance-aware knowledge graphs: the same triple may be asserted by multiple sources with different confidence, dates, or interpretations. Reification is how RDF represents that.

Data files for this submodule

Load the reification exercise files from modules/03-reasoning/exercises/3-2-reification/: approach-1-classical.ttl · approach-2-nary.ttl · approach-3-named-graphs.trig · approach-4-rdf-star.ttl. The query lab below uses approach-1-classical.ttl as the primary example (classical reification is universally supported in Fuseki). Load all four into separate Fuseki datasets to compare query ergonomics.

03 · The four approaches

Same fact, four styles.

Approach 1

Classical RDF reification

Uses rdf:Statement with rdf:subject, rdf:predicate, rdf:object. Verbose — 4+ triples per annotated fact. Querying requires knowing the subject/predicate/object to locate the statement node. Universal support.

Approach 2

N-ary relation

Promotes the relationship to a first-class event/situation class. The killing becomes a naruto:NarrativeEvent individual with actor, action, motivation, and source properties. Clean IRI, easy to query, no awkward triple replication. Good for complex relationships.

Approach 3

Named graphs

Uses TriG format. Assertions go inside a named graph; the graph IRI carries metadata. Annotates groups of assertions (whole chapters) rather than individual triples. Natural for source-level provenance. Good tooling support; SPARQL GRAPH pattern is well-understood.

Approach 4

RDF-star

Quoted triples: <<S P O>> becomes a subject. Cleanest syntax — annotation attaches directly to the triple. Requires RDF 1.2 / RDF-star support (Jena 4.x, Oxigraph, GraphDB). Not yet universal (2026).

Approach	Annotates	Verbosity	Query ergonomics	Support
Classical	Individual triples	High (4+ triples/fact)	Requires S/P/O match to find statement node	Universal
N-ary	Relationships as events	Medium (new class + properties)	Query by event IRI directly	Universal
Named graphs	Groups of assertions	Low (TriG syntax)	GRAPH keyword; metadata in default graph	Good (TriG widely supported)
RDF-star	Individual triples	Very low (<<>> annotation)	SPARQL-star << >> in patterns	Partial (2026)

Approach 1 in depth — Classical RDF reification

# The base triple (still needed — reification does not replace it)
naruto:ItachiUchiha naruto:killedClan naruto:UchihaClan .

# The reification: turn the triple into a named node
naruto:ItachiKilledStatement
    a rdf:Statement ;
    rdf:subject   naruto:ItachiUchiha ;
    rdf:predicate naruto:killedClan ;
    rdf:object    naruto:UchihaClan ;
    dcterms:source     "Naruto manga, vol. 43, ch. 401"^^xsd:string ;
    sensemaking:confidence sensemaking:CanonicalManga .

Pain points: the triple is repeated (subject/predicate/object appear both in the base triple AND in the rdf:Statement). SPARQL must match all three of rdf:subject, rdf:predicate, rdf:object to locate the statement node. If you have 1,000 annotated facts, that is 4,000+ triples of overhead. This verbosity is one of the most-cited reasons practitioners choose LPG over RDF — Neo4j lets you put the metadata directly on the edge.
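To put a number on that overhead for a loaded dataset, a small counting query can help. The following is a minimal sketch, assuming the data follows the rdf:Statement pattern shown above; the variable names are illustrative and not taken from the exercise files.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

# Count reified statements and the four structural triples each one needs
# (rdf:type, rdf:subject, rdf:predicate, rdf:object).
# Annotation triples such as dcterms:source come on top of this figure.
SELECT (COUNT(DISTINCT ?stmt) AS ?statements)
       (COUNT(DISTINCT ?stmt) * 4 AS ?structuralTriples)
WHERE {
  ?stmt a rdf:Statement .
}
```

Running it before and after adding annotations shows how the structural cost grows linearly with the number of annotated facts.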

Approach 2 in depth — N-ary relation

# The relationship becomes a named event individual
naruto:ItachiClanMassacre
    a naruto:NarrativeEvent ;
    naruto:actor        naruto:ItachiUchiha ;
    naruto:action       naruto:KilledClan ;
    naruto:target       naruto:UchihaClan ;
    naruto:orderGivenBy naruto:DanzoShimura ;
    dcterms:source      "Naruto manga, vol. 43, ch. 401"^^xsd:string ;
    sensemaking:confidence sensemaking:CanonicalManga .

The event node has a stable IRI — query it directly without knowing subject/predicate/object. Adding new properties (witnesses, aftermath, duration) is trivial. This is the same pattern as the sensemaking:Employment event class in the resume graph from Module 1. N-ary is the natural OWL pattern when a relationship needs its own attributes.
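As a rough illustration of querying the event node directly, the sketch below retrieves the massacre event with its source and confidence. It assumes the property names from the Turtle above (naruto:actor, naruto:action, naruto:target, naruto:orderGivenBy) and the namespace IRIs used in the query lab; the starter file may name things differently.

```sparql
PREFIX dcterms:     <http://purl.org/dc/terms/>
PREFIX naruto:      <https://sensemaking-ai.com/ns/naruto#>
PREFIX sensemaking: <https://sensemaking-ai.com/ns/>

# Find narrative events where Itachi is the actor.
# No rdf:subject / rdf:predicate / rdf:object reconstruction is needed:
# the event is an ordinary resource with ordinary properties.
SELECT ?event ?action ?target ?orderedBy ?source ?confidence WHERE {
  ?event a naruto:NarrativeEvent ;
         naruto:actor           naruto:ItachiUchiha ;
         naruto:action          ?action ;
         naruto:target          ?target ;
         dcterms:source         ?source ;
         sensemaking:confidence ?confidence .
  # Only the ch. 401 narrative records who gave the order, so keep it optional.
  OPTIONAL { ?event naruto:orderGivenBy ?orderedBy . }
}
```

Because each narrative is its own event individual, a second event for the ch. 131 version would appear as a second row with no change to the query.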

Approach 3 in depth — Named graphs

# TriG format: assertions inside the named graph
naruto:MangaSource401 {
    naruto:ItachiUchiha naruto:killedClan       naruto:UchihaClan .
    naruto:ItachiUchiha naruto:actedUnderOrders naruto:DanzoShimura .
}

# Default graph: metadata about the named graph
{
    naruto:MangaSource401
        dcterms:source "Naruto manga, vol. 43, ch. 401"^^xsd:string ;
        dcterms:date   "2007-05-14"^^xsd:date ;
        sensemaking:confidence sensemaking:CanonicalManga .
}

# SPARQL GRAPH pattern: match the fact inside any named graph,
# then read that graph's metadata from the default graph
SELECT ?source ?date ?confidence WHERE {
  GRAPH ?g {
    naruto:ItachiUchiha naruto:killedClan naruto:UchihaClan .
  }
  ?g dcterms:source         ?source ;
     dcterms:date           ?date ;
     sensemaking:confidence ?confidence .
}

Approach 4 in depth — RDF-star

# RDF-star (requires Jena 4.x, Oxigraph, or GraphDB)
<<naruto:ItachiUchiha naruto:killedClan naruto:UchihaClan>>
    dcterms:source         "Naruto manga, vol. 43, ch. 401"^^xsd:string ;
    sensemaking:confidence sensemaking:CanonicalManga .

# SPARQL-star query
SELECT ?source ?confidence WHERE {
  <<naruto:ItachiUchiha naruto:killedClan naruto:UchihaClan>>
      dcterms:source ?source ;
      sensemaking:confidence ?confidence .
}

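The cleanest syntax: no intermediate node, no S/P/O repetition. One caveat: under the RDF-star Community Group semantics, quoting a triple does not assert it. The base triple must still be stated separately, or asserted and annotated in one step with Turtle-star's annotation syntax: naruto:ItachiUchiha naruto:killedClan naruto:UchihaClan {| dcterms:source "Naruto manga, vol. 43, ch. 401" |} . The main limitation is tooling support, so verify your stack before committing to RDF-star in production. Fuseki 4.x (included in Jena 4.x) supports it; earlier versions may not, so check the release notes for the version you run.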
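Quoted triple patterns may also contain variables, which makes the subject-only lookup short. The sketch below assumes a store that accepts SPARQL-star syntax as defined in the RDF-star Community Group report, and reuses the namespace IRIs from the query lab.

```sparql
PREFIX dcterms:     <http://purl.org/dc/terms/>
PREFIX naruto:      <https://sensemaking-ai.com/ns/naruto#>
PREFIX sensemaking: <https://sensemaking-ai.com/ns/>

# All annotated triples with Itachi as subject, whatever the predicate.
# The quoted pattern binds ?predicate and ?object from the annotated triple itself.
SELECT ?predicate ?object ?source ?confidence WHERE {
  << naruto:ItachiUchiha ?predicate ?object >>
      dcterms:source         ?source ;
      sensemaking:confidence ?confidence .
}
ORDER BY ?predicate
```

This matches annotated triples whether or not the base triple is also asserted, so add the plain pattern naruto:ItachiUchiha ?predicate ?object to the WHERE clause if only asserted facts should be returned.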

04 · PROV-O

The W3C standard for provenance.

PROV-O (Provenance Ontology) is the W3C standard vocabulary for describing provenance — who created what, when, from what sources. It is designed to work with any reification approach. Three core classes:

prov:Entity — anything that has provenance: a document, a dataset, a claim, a named graph.
prov:Agent — who is responsible: a person, an organization, a software process.
prov:Activity — what happened: the act of asserting, creating, deriving, or modifying an entity.

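@prefix prov:    <http://www.w3.org/ns/prov#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf:    <http://xmlns.com/foaf/0.1/> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
@prefix naruto:  <https://sensemaking-ai.com/ns/naruto#> .

# The manga chapter is a prov:Entity
naruto:MangaSource401
    a prov:Entity ;
    prov:wasAttributedTo naruto:MasashiKishimoto ;  # the author
    # PROV-O declares xsd:dateTime as the range here; xsd:date keeps the example short
    prov:generatedAtTime "2007-05-14"^^xsd:date ;
    dcterms:isPartOf     naruto:NarutoMangaSeries .

# The author is a prov:Agent
naruto:MasashiKishimoto
    a prov:Agent , foaf:Person ;
    foaf:name "Masashi Kishimoto" .

# The named graph is derived from the source chapter and attributed to an agent
naruto:ItachiMassacreGraph
    a prov:Entity ;
    prov:wasDerivedFrom naruto:MangaSource401 ;
    prov:wasAttributedTo naruto:SensemakingAI .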

PROV-O integrates naturally with named graphs: the named graph IRI is a prov:Entity, and the chapter or source document is what it prov:wasDerivedFrom. This is the recommended approach in the Module 3 README for Exercise 3.3: named graphs + PROV-O, with each graph's metadata including provenance chain back to the original source.
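Walking the provenance chain back from a graph to its author is a plain join over the PROV-O properties. The sketch below assumes the PROV-O description of naruto:ItachiMassacreGraph and naruto:MangaSource401 is loaded where the query can see it (for example, the default graph); the variable names are illustrative.

```sparql
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# For each derived entity, follow prov:wasDerivedFrom to the source document,
# then prov:wasAttributedTo to the agent responsible for that source.
SELECT ?graph ?sourceDoc ?authorName ?published WHERE {
  ?graph     a prov:Entity ;
             prov:wasDerivedFrom  ?sourceDoc .
  ?sourceDoc prov:wasAttributedTo ?author ;
             prov:generatedAtTime ?published .
  ?author    foaf:name            ?authorName .
}
ORDER BY ?published
```

For longer chains (a fan wiki derived from a databook derived from the manga), the property path prov:wasDerivedFrom+ in place of the single step would follow every hop.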

PROV-O and the anime-only distinction

The Module 3 README specifies a custom sensemaking:confidence property with values like canonical-strong, anime-only, and fan-theory. PROV-O does not have a "confidence" concept out of the box — it describes provenance chains, not confidence levels. The custom property fills this gap. PROV-O handles who said what; the custom confidence vocabulary handles how much to trust it. Both are needed together for the Naruto use case.

05 · Exercise 3.2 setup

Before writing the project — do the comparison exercise.

Exercise 3.2 guided steps — load all four approaches into Fuseki

1 Load the classical approach and run the first query

Load approach-1-classical.ttl into Fuseki. In the SPARQL editor, run query q01 from the query lab below — retrieve the Itachi killing fact with its source. Confirm you get two rows (the chapter 401 version and the chapter 131 version).

Write in your notes: how many lines of SPARQL did the query require? How readable is the WHERE clause? How would you find all assertions about Itachi without knowing the predicate in advance?

2 Load the n-ary approach and run the equivalent query

Clear the dataset and load approach-2-nary.ttl. Write a SELECT query that returns the same information: actor, action, source, confidence. Compare to the classical query — is the pattern simpler? Is the IRI for the event node more convenient than the classical reification statement node?

Write in your notes: which approach felt more natural? Which produced a more readable Turtle file?

3 Load the named graphs approach (TriG format)

Create a new Fuseki dataset or clear the existing one. Upload approach-3-named-graphs.trig. Fuseki handles TriG natively — no format conversion needed.

Run a SPARQL query using the GRAPH keyword pattern shown in Section 3 above. Compare: how does querying named graphs for metadata feel compared to querying the statement node in classical reification?

Note: named graphs annotate all triples in the graph collectively, not individual triples. The two assertions in naruto:MangaSource401 share the same source metadata. This is an advantage when entire chapters need the same attribution, and a limitation when individual triples within a chapter have different confidence levels.

4 Try RDF-star (if your Fuseki version supports it)

Load approach-4-rdf-star.ttl. In Fuseki 4.x (Jena 4), RDF-star is supported — attempt to load and run a SPARQL-star query using the <<>> syntax in the WHERE clause.

If your Fuseki version rejects the file, note the error and skip this step — the approach is not supported in your current setup. Document which version you have in notes/week-07-reasoner.md: this is a real production compatibility check.

If it works, compare the query ergonomics to classical reification. RDF-star's <<S P O>> pattern in SPARQL is significantly more readable than classical reification's three-way match on rdf:subject/predicate/object.

5 Write your comparison notes (Exercise 3.2 deliverable)

The Exercise 3.2 deliverable is four .ttl/.trig files — one per approach — plus SPARQL queries for each that retrieve the fact with its source. The starter files in the exercises folder give you the data; you write the queries.

Commit to exercises/3-2-reification/ alongside the starter files. Your synthesis notes for Week 7 should cover: which approach you would choose for Exercise 3.3 (the full Itachi project) and why. The Module 3 README recommends named graphs + PROV-O, but the choice is yours to make and defend.

06 · Query lab

Five queries — one fact, multiple lenses.

Queries q01–q04 run against the classical reification file (approach-1-classical.ttl). Query q05 shows the named graphs GRAPH pattern against approach-3-named-graphs.trig. Together they pursue the same retrieval goal, a fact plus its provenance, through two of the four reification patterns.

q01

Retrieve Itachi's killing of the clan with all source metadata — classical approach.

Pattern: classical reification query · rdf:Statement pattern · ORDER BY date

PREFIX rdf:         <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs:        <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dcterms:     <http://purl.org/dc/terms/>
PREFIX naruto:      <https://sensemaking-ai.com/ns/naruto#>
PREFIX sensemaking: <https://sensemaking-ai.com/ns/>

SELECT ?source ?date ?confidence ?comment WHERE {
  ?stmt a rdf:Statement ;
        rdf:subject   naruto:ItachiUchiha ;
        rdf:predicate naruto:killedClan ;
        rdf:object    naruto:UchihaClan ;
        dcterms:source     ?source ;
        dcterms:date       ?date ;
        sensemaking:confidence ?confidence .
  OPTIONAL { ?stmt rdfs:comment ?comment . }
}
ORDER BY ?date

Expected output

2 rows · "Naruto manga, vol. 15, ch. 131" / 2002-06-03 / CanonicalDisputed · "Naruto manga, vol. 43, ch. 401" / 2007-05-14 / CanonicalManga

The query requires knowing all three of rdf:subject, rdf:predicate, and rdf:object to locate the statement nodes. This is classical reification's primary query burden — you must reconstruct the original triple pattern to find its annotations. If you only know the subject (Itachi) and want all annotated triples about him, the query becomes more complex.

Think about this

1. Write a query that finds all rdf:Statement nodes where rdf:subject = naruto:ItachiUchiha, regardless of predicate. How many results? Is the query pattern more or less readable than q01?

2. What would the equivalent query look like against the n-ary approach (approach-2-nary.ttl)? Hint: you query the event IRI directly without needing to reconstruct the triple.

3. Which confidence level should a downstream application trust: CanonicalManga or CanonicalDisputed? Write an ASK query that returns true only if there is a CanonicalManga-confidence source for the killing fact.

q02

Find all annotations about Itachi, by any predicate, from manga-canonical sources only.

Pattern: classical reification · filtering by confidence · subject-only lookup

PREFIX rdf:         <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dcterms:     <http://purl.org/dc/terms/>
PREFIX naruto:      <https://sensemaking-ai.com/ns/naruto#>
PREFIX sensemaking: <https://sensemaking-ai.com/ns/>

SELECT ?predicate ?object ?source WHERE {
  ?stmt a rdf:Statement ;
        rdf:subject   naruto:ItachiUchiha ;
        rdf:predicate ?predicate ;
        rdf:object    ?object ;
        dcterms:source ?source ;
        sensemaking:confidence sensemaking:CanonicalManga .
}
ORDER BY ?predicate

Expected output

2 rows (from approach-1-classical.ttl) · naruto:actedUnderOrders / naruto:DanzoShimura / "vol. 43, ch. 401" · naruto:killedClan / naruto:UchihaClan / "vol. 43, ch. 401"

The CanonicalDisputed statement (ch. 131) is filtered out. This pattern — retrieve all manga-canonical facts about a character — is a common provenance filtering operation. It is the query the Module 3 README specifies for Exercise 3.3 step 4.

Think about this

1. Change the confidence filter to sensemaking:CanonicalDisputed. What appears? This represents the early-series narrative that was later retconned.

2. Remove the confidence filter entirely. How many rows? Which source appears for each row? This is the "all claims" view — useful for identifying contradictions between sources.

3. What SPARQL would you write to find all characters whose backstory has both a CanonicalManga and a CanonicalDisputed assertion with the same predicate? This is the "retroactive retcon detection" query from Exercise 3.3 step 4.

q03

Build a "reliability timeline" — all claims about Itachi ordered by reveal date.

Pattern: classical reification · ORDER BY date · the provenance audit trail

PREFIX rdf:         <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs:        <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dcterms:     <http://purl.org/dc/terms/>
PREFIX naruto:      <https://sensemaking-ai.com/ns/naruto#>
PREFIX sensemaking: <https://sensemaking-ai.com/ns/>

SELECT ?date ?predLabel ?confidence ?source WHERE {
  ?stmt a rdf:Statement ;
        rdf:subject   naruto:ItachiUchiha ;
        rdf:predicate ?predicate ;
        rdf:object    ?object ;
        dcterms:source ?source ;
        dcterms:date   ?date ;
        sensemaking:confidence ?confidence .
  OPTIONAL { ?predicate rdfs:label ?predLabel . }
}
ORDER BY ?date

Expected output

3 rows ordered by date · 2002 ch.131 (CanonicalDisputed) · 2007 ch.401 killedClan (CanonicalManga) · 2007 ch.401 actedUnderOrders (CanonicalManga)

This is the "reliability timeline" query from the Module 3 README Exercise 3.3 step 4. Reading the results chronologically shows the narrative evolution: the early disputed claim, then the two canonical facts introduced by the retcon. For a richer dataset with more characters and more sources, this query produces the complete provenance audit trail.

Think about this

1. This query is specific to Itachi. Generalize it: remove the fixed subject and add a GROUP BY to count how many assertions exist per character, ordered by the number of contested/disputed assertions. This identifies which characters have the most narrative ambiguity.

2. In Exercise 3.3, you will pick ~20-30 assertions. Which characters in the Naruto universe besides Itachi have the most contested or retconned backstories? These are your candidates for the richer dataset.

q04

Find the predicate where Itachi has conflicting claims across sources.

Pattern: self-join on rdf:Statement · detecting contradiction between sources

PREFIX rdf:         <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dcterms:     <http://purl.org/dc/terms/>
PREFIX naruto:      <https://sensemaking-ai.com/ns/naruto#>
PREFIX sensemaking: <https://sensemaking-ai.com/ns/>

SELECT DISTINCT ?predicate ?source1 ?source2 WHERE {
  ?s1 a rdf:Statement ;
      rdf:subject   naruto:ItachiUchiha ;
      rdf:predicate ?predicate ;
      rdf:object    naruto:UchihaClan ;
      dcterms:source ?source1 ;
      sensemaking:confidence ?conf1 .
  ?s2 a rdf:Statement ;
      rdf:subject   naruto:ItachiUchiha ;
      rdf:predicate ?predicate ;
      rdf:object    naruto:UchihaClan ;
      dcterms:source ?source2 ;
      sensemaking:confidence ?conf2 .
  FILTER (?s1 != ?s2 && ?conf1 != ?conf2)
  FILTER (STR(?s1) < STR(?s2))
}
ORDER BY ?predicate

Expected output

1 row · naruto:killedClan / "vol. 15, ch. 131" / "vol. 43, ch. 401" — one predicate with two statements at different confidence levels

This is the "retroactive retcon detection" query. For the killedClan predicate, two statements exist with different confidence levels (CanonicalDisputed vs CanonicalManga). The FILTER on confidence ensures we only find genuine conflicts — not duplicate statements from the same source.

Think about this

1. Remove the confidence filter (?conf1 != ?conf2). How many rows? What does the increase tell you about the data structure?

2. Generalize this query to find ANY character in the dataset with conflicting claims — not just Itachi. Remove the fixed subject and add it as a projected variable. When you build the Exercise 3.3 dataset, run this query to validate that your provenance structure correctly captures the contradictions.

q05

Named graphs approach: retrieve fact + metadata using GRAPH keyword.

Pattern: named graph query · GRAPH pattern · default graph for metadata

PREFIX dcterms:     <http://purl.org/dc/terms/>
PREFIX schema:      <https://schema.org/>
PREFIX naruto:      <https://sensemaking-ai.com/ns/naruto#>
PREFIX sensemaking: <https://sensemaking-ai.com/ns/>

# Load approach-3-named-graphs.trig into Fuseki before running this query.
SELECT ?predicate ?object ?graphName ?date ?confidence WHERE {
  GRAPH ?g {
    naruto:ItachiUchiha ?predicate ?object .
  }
  ?g schema:name ?graphName ;
     dcterms:date ?date ;
     sensemaking:confidence ?confidence .
}
ORDER BY ?date ?predicate

Expected output (against approach-3-named-graphs.trig)

4 rows · ch.131 graph: killedClan, motivation (CanonicalDisputed) · ch.401 graph: killedClan, actedUnderOrders (CanonicalManga)

The GRAPH keyword finds all named graphs containing assertions about Itachi, then joins with the default graph for metadata. This is the named graphs ergonomic advantage: a single GRAPH pattern retrieves all assertions in a named graph without needing to know each triple's subject/predicate/object upfront. The default graph holds the provenance metadata for each named graph.
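The same GRAPH mechanism works with no fixed terms at all, which is handy for a quick inventory of what each source asserts. This is a minimal sketch that assumes only that the TriG file has been loaded with its named graphs intact.

```sparql
# Count the assertions held in each named graph.
# GRAPH ?g ranges over named graphs only, so the metadata triples
# in the default graph are not counted.
SELECT ?g (COUNT(*) AS ?assertions) WHERE {
  GRAPH ?g { ?s ?p ?o }
}
GROUP BY ?g
ORDER BY ?g
```

A graph that appears here but has no metadata in the default graph is a sign that provenance was never recorded for that source.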

Think about this

1. The named graphs approach cannot annotate individual triples within a named graph differently. Both assertions in naruto:MangaSource401 share the same confidence. How would you model a situation where one triple in a chapter is definitive and another is ambiguous? Would you split them into separate named graphs?

2. For Exercise 3.3, which approach would you choose: classical, n-ary, named graphs, or RDF-star? Write a one-paragraph justification in your Week 7 synthesis notes before proceeding.

07 · Bridge to Exercise 3.3

The primary project: Naruto reification with Itachi's backstory.

Exercise 3.3 (primary project A) extends the Itachi example into a full 20-30 assertion provenance graph. The Module 3 README has the full steps. Two things to decide before starting:

Choose your reification approach before writing the first triple

Named graphs + PROV-O is the Module 3 README recommendation — it is well-supported, scales to many assertions, and integrates cleanly with PROV-O's provenance chain. If your Fuseki supports RDF-star, that is also defensible. Classical reification is the fallback if you need universal compatibility. Document your choice in the Exercise 3.3 artifact's README.md under "Design decisions."

Characters with the richest contested canon

Itachi is the canonical example (revelation in ch. 401). Other rich candidates for the 20-30 assertion dataset: Obito Uchiha (true identity revealed 300 chapters after introduction), Minato Namikaze (Naruto's father, revealed mid-series), Nagato/Pain (backstory retold from a different perspective in the Pain arc). Each has clearly identifiable source chapters and clear before/after narratives.

The artifact lives at modules/03-reasoning/artifacts/naruto-reification/ with a data.ttl (or data.trig), queries/ folder, README.md, and REFLECTION.md. The reflection — would Neo4j have been easier? — is the blog post seed.

08 · Resources

Reading and tools.

Primary reading

Allemang et al. — Ch 11–12

Chapter 11 covers n-ary relations and the reification patterns in depth. Chapter 12 covers named graphs and provenance. Read both before starting Exercise 3.3.

Provenance spec

W3C PROV-O

The W3C provenance ontology. The primer (prov-primer) is more readable than the spec itself. Read the primer, then refer to the spec when implementing. Focus on prov:Entity, prov:Agent, prov:Activity, and prov:wasDerivedFrom.

RDF-star spec

W3C RDF-star

The RDF-star Community Group report, whose quoted-triple syntax is being standardized as part of RDF 1.2. Check your triplestore's release notes for SPARQL-star support before using <<>> in production queries.

Exercise files

Reification exercise folder

The four starter files: approach-1-classical.ttl, approach-2-nary.ttl, approach-3-named-graphs.trig, approach-4-rdf-star.ttl. Load into Fuseki and run the Exercise 3.2 comparison steps.

Next submodule

Submodule 3.3 — SHACL and alignment

SHACL in depth, vocabulary alignment (owl:sameAs vs skos:exactMatch), and the Resume skill inference project — primary project B.

Cheatsheet

Reification comparison one-pager (coming soon)

The four approaches side by side: sample Turtle, sample SPARQL query, verbosity rating, support matrix. Available when Module 3 materials are complete.