A relational database stores rows and columns. SQL joins them. An RDF store keeps triples — subject, predicate, object. SPARQL matches patterns against those triples the way a regular expression matches text: flexible, composable, and ignorant of storage layout.
The conceptual move is straightforward. A SPARQL graph pattern is a set of triple templates where any position can be a variable or a fixed term. The engine finds every assignment of values to variables that makes all templates simultaneously true against the data. That is the entire core idea. Everything else — filtering, aggregating, sorting, constructing new graphs — is built on top of that matching step.
Where SQL asks "give me all rows from the Orders table where customer_id = 42," SPARQL asks "give me every ?subject that has a foaf:knows predicate to at least one ?object whose name matches 'Alice'." The difference is not just syntax. RDF has no schema that predetermines which predicates a node might have, so there is no fixed list of columns to select from. Pattern matching is how SPARQL navigates that open-ended structure.
A triple pattern is a template with one or more variables. Wrap a set of them in a WHERE clause and the engine finds every combination of values that matches all of them at once:
SELECT ?name ?jutsu
WHERE {
  ?ninja a sensemaking:Ninja ;
         schema:name ?name ;
         sensemaking:hasJutsu ?jutsu .
}
Three triple patterns. The engine binds ?ninja, ?name, and ?jutsu to every combination in the graph that satisfies all three simultaneously. A ninja with two jutsu appears twice — one row per jutsu. This is not a flaw; it is how triple-pattern matching works, and it matters when you use OPTIONAL.
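The matching step described above can be sketched in a few lines of Python. This is a toy illustration of the semantics only, not how Fuseki or any real engine is implemented; the `ex:` terms and the sample triples are invented for the example.

```python
# Toy basic-graph-pattern matcher: illustrates the semantics, not a real engine.
# All triples and ex: names below are invented sample data.

def match_pattern(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` equals `triple`; None on conflict."""
    new_binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):                  # variable position
            if new_binding.get(term, value) != value:
                return None                       # already bound to something else
            new_binding[term] = value
        elif term != value:                       # fixed term must match exactly
            return None
    return new_binding


def solve(patterns, triples):
    """Return every binding that satisfies all patterns at once."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for binding in solutions
            for triple in triples
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return solutions


triples = [
    ("ex:naruto", "rdf:type", "ex:Ninja"),
    ("ex:naruto", "schema:name", "Naruto"),
    ("ex:naruto", "ex:hasJutsu", "ex:Rasengan"),
    ("ex:naruto", "ex:hasJutsu", "ex:ShadowClone"),
    ("ex:sakura", "rdf:type", "ex:Ninja"),
    ("ex:sakura", "schema:name", "Sakura"),
]

patterns = [
    ("?ninja", "rdf:type", "ex:Ninja"),
    ("?ninja", "schema:name", "?name"),
    ("?ninja", "ex:hasJutsu", "?jutsu"),
]

rows = [(s["?name"], s["?jutsu"]) for s in solve(patterns, triples)]
print(rows)
# [('Naruto', 'ex:Rasengan'), ('Naruto', 'ex:ShadowClone')]
```

Naruto appears twice, once per jutsu. Sakura does not appear at all, because the third pattern has nothing to match for her.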
SPARQL triple patterns map roughly to Cypher's MATCH clause. The key difference: Cypher's property-graph model lets you put properties on edges (-[:KNOWS {since: 2020}]->). In RDF, "since 2020" requires additional triples, either through a new intermediate node (the event-class pattern) or through an RDF-star annotation. That is the same trade-off discussed in Sub-module 1.1.
You write a query in a .rq file (or a text box). You send it to an endpoint URL, usually with an HTTP POST. The endpoint returns results: a table of variable bindings for SELECT, a boolean for ASK, a new RDF graph for CONSTRUCT or DESCRIBE. The protocol is standard; the results format is chosen through HTTP content negotiation, where the request's Accept header names the formats you want and the response's Content-Type header reports the one you got.
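As a rough sketch of what the request looks like on the wire, the following builds (but does not send) a SPARQL Protocol POST using only the Python standard library. The dataset name `ds` in the endpoint URL is a placeholder assumption; substitute whatever your Fuseki instance actually exposes.

```python
import urllib.parse
import urllib.request

# Hypothetical local endpoint: "ds" is a placeholder dataset name.
ENDPOINT = "http://localhost:3030/ds/sparql"

QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"


def build_request(endpoint, query, accept="application/sparql-results+json"):
    """Build (but do not send) a SPARQL Protocol POST with a form-encoded query."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": accept,          # asks for SELECT/ASK results as JSON
        },
    )


request = build_request(ENDPOINT, QUERY)
print(request.get_method(), request.full_url)
print(request.get_header("Accept"))

# To run it against a live endpoint:
#   with urllib.request.urlopen(request) as response:
#       print(response.read().decode("utf-8"))
```

For CONSTRUCT or DESCRIBE you would pass an RDF media type such as `text/turtle` as the `accept` argument instead.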
Local Fuseki runs on your machine at localhost:3030. Fast, private, no rate limits. Data you load is yours to query: the Naruto and mythology datasets from the starter kit, your own Turtle files. This is the workbook environment.
Wikidata's query service at query.wikidata.org is a public, read-only SPARQL endpoint over one of the largest structured knowledge graphs on the web. No account needed. Usage is rate-limited, and each query must complete within 60 seconds. Wikidata uses the standard SPARQL protocol with a few proprietary extensions, most importantly SERVICE wikibase:label, which resolves human-readable labels for any entity. Section 5 has three runnable queries.
Wikidata identifies entities by opaque IRIs like wd:Q49108 (MIT). Human-readable labels are separate rdfs:label values — often dozens of them in different languages. The SERVICE wikibase:label extension picks the right one automatically. It is a convenience wrapper, not part of the SPARQL standard.
A SELECT query broken into its named components:
# ① PREFIX declarations: short aliases for full IRIs
PREFIX schema: <https://schema.org/>
PREFIX sensemaking: <https://sensemaking-ai.com/ns/example#>

# ② SELECT clause: which variables to project into the result table
SELECT ?name ?teamName

# ③ WHERE clause: graph pattern that must match
WHERE {
  ?ninja a sensemaking:Ninja ;
         schema:name ?name ;
         sensemaking:memberOfTeam ?team .
  ?team schema:name ?teamName .
}

# ④ Solution modifiers: post-processing on the matched rows
ORDER BY ?teamName ?name
LIMIT 50
PREFIX lines define short aliases so you can write schema:name instead of <https://schema.org/name>. They are purely syntactic: not stored, not inferred, not shared between queries. Every query that uses a prefixed name must declare that prefix itself, or fall back to writing the full IRI in angle brackets. The curriculum's standard prefix block is the starting point for every file.
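To make "purely syntactic" concrete, here is a minimal sketch of prefix expansion as plain string substitution. The `expand` helper is hypothetical and written only for this illustration; the two namespace IRIs are the ones used in this page's examples.

```python
# Prefix table as a query would declare it; expansion is plain string concatenation.
PREFIXES = {
    "schema": "https://schema.org/",
    "sensemaking": "https://sensemaking-ai.com/ns/example#",
}


def expand(prefixed_name, prefixes):
    """Expand a prefixed name such as schema:name into its full IRI form."""
    prefix, _, local = prefixed_name.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"undeclared prefix: {prefix!r}")
    return f"<{prefixes[prefix]}{local}>"


print(expand("schema:name", PREFIXES))        # <https://schema.org/name>
print(expand("sensemaking:Ninja", PREFIXES))  # <https://sensemaking-ai.com/ns/example#Ninja>
```

An undeclared prefix raises an error here, which mirrors the parse error an endpoint reports when a query uses a prefix it never declared.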
The SELECT clause lists the variables to include in the result table. Use SELECT * to project all bound variables. Use SELECT DISTINCT to remove duplicate rows, which is useful when the same combination appears through multiple matching paths. You can also compute expressions: SELECT (COUNT(?ninja) AS ?total).
The WHERE clause holds the graph pattern. Each statement inside WHERE { } is a triple pattern (subject, predicate, object) where any position can be a variable (?x) or a fixed term (IRI or literal). All triple patterns in the same block must match simultaneously for a row to appear in results. OPTIONAL patterns relax this: OPTIONAL { ?ninja sensemaking:hasJutsu ?j } keeps the ninja in results even if they have no jutsu, and ?j is simply unbound for those rows.
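OPTIONAL behaves like a left join over solutions. The sketch below shows that behavior on invented sample data; the `left_join` helper is hypothetical and hard-codes the single join variable to keep the example short.

```python
# Invented sample data: solutions from the required part, plus (ninja, jutsu) facts.
required = [{"ninja": "ex:naruto"}, {"ninja": "ex:sakura"}]
has_jutsu = [("ex:naruto", "ex:Rasengan"), ("ex:naruto", "ex:ShadowClone")]


def left_join(solutions, pairs):
    """Extend each solution with every matching jutsu; keep it as-is if none match."""
    result = []
    for solution in solutions:
        matches = [jutsu for ninja, jutsu in pairs if ninja == solution["ninja"]]
        if matches:
            result.extend({**solution, "j": jutsu} for jutsu in matches)
        else:
            result.append(dict(solution))   # ?j stays unbound: the key is simply absent
    return result


rows = left_join(required, has_jutsu)
for row in rows:
    print(row["ninja"], row.get("j"))
# ex:naruto ex:Rasengan
# ex:naruto ex:ShadowClone
# ex:sakura None
```

The ninja without jutsu survives with `?j` unbound, while the ninja with two jutsu still produces two rows.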
Solution modifiers are applied after the pattern matching, in this order: GROUP BY (aggregate into groups) → HAVING (filter on aggregated values) → ORDER BY (sort) → OFFSET / LIMIT (paginate). You can use all, some, or none.
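The ordering above can be mimicked as a small pipeline over already-matched solutions. This is an illustrative sketch with invented rows; the comments name the SPARQL clause each step stands in for.

```python
from collections import Counter

# Invented (name, jutsu) solutions produced by the pattern-matching step.
solutions = [
    ("Naruto", "Rasengan"), ("Naruto", "Shadow Clone"), ("Naruto", "Sage Mode"),
    ("Kakashi", "Chidori"), ("Kakashi", "Sharingan"),
    ("Sakura", "Healing"),
]

# GROUP BY ?name, with COUNT(?jutsu) AS ?total
counts = Counter(name for name, _ in solutions)

# HAVING (COUNT(?jutsu) >= 2)
groups = [(name, total) for name, total in counts.items() if total >= 2]

# ORDER BY DESC(?total) ?name
groups.sort(key=lambda row: (-row[1], row[0]))

# OFFSET 0 LIMIT 1
offset, limit = 0, 1
page = groups[offset:offset + limit]

print(groups)  # [('Naruto', 3), ('Kakashi', 2)]
print(page)    # [('Naruto', 3)]
```

Swapping the steps changes the answer: applying LIMIT before grouping, for example, would count only the first raw row.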
Multiple predicates for the same subject can be chained with semicolons: ?ninja schema:name ?name ; sensemaking:hasJutsu ?jutsu . is equivalent to two separate triple patterns with the same subject. This is Turtle syntax carried into SPARQL WHERE clauses — same rule, same effect.
The WHERE clause works identically across all four forms. What changes is what the query does with the matched solutions.
SELECT is the everyday form. Each matched solution becomes one row; each projected variable becomes one column. Most SPARQL tooling defaults to showing SELECT results as a table or JSON object.
ASK tests whether any solution to the WHERE clause exists. The engine can stop at the first match, so it is typically at least as fast as SELECT ... LIMIT 1, and it is more readable when existence is the only question. Use it for validation, guards, and assertions.
CONSTRUCT maps matched solutions onto a triple template to produce new triples. Use it to derive inferred relationships, transform data between vocabularies, or materialize a view of the graph for external consumption.
DESCRIBE returns an RDF graph containing everything the endpoint knows about one or more resources. What "everything" means is implementation-defined: Fuseki returns all triples where the URI appears as subject. Good for exploration; less useful when you need a precise result shape.
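For SELECT and ASK, the JSON that comes back follows the SPARQL 1.1 Query Results JSON Format. The sketch below parses two hand-written responses in that shape; the values and the `to_rows` helper are invented for illustration.

```python
import json

# Hand-written responses in the SPARQL 1.1 Query Results JSON shape; values are invented.
select_response = json.loads("""
{
  "head": {"vars": ["name", "jutsu"]},
  "results": {"bindings": [
    {"name": {"type": "literal", "value": "Naruto"},
     "jutsu": {"type": "uri", "value": "https://example.org/Rasengan"}},
    {"name": {"type": "literal", "value": "Sakura"}}
  ]}
}
""")
ask_response = json.loads('{"head": {}, "boolean": true}')


def to_rows(response):
    """Turn SELECT bindings into tuples; an unbound variable becomes None."""
    variables = response["head"]["vars"]
    return [
        tuple(binding[v]["value"] if v in binding else None for v in variables)
        for binding in response["results"]["bindings"]
    ]


print(to_rows(select_response))
# [('Naruto', 'https://example.org/Rasengan'), ('Sakura', None)]
print(ask_response["boolean"])
# True
```

An unbound variable is omitted from its binding object rather than set to null, which is why the helper checks for the key.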
The most important thing to understand about SELECT is how solution cardinality works. Each row in the result represents one way of satisfying all the triple patterns simultaneously. If a ninja has two jutsu, the same ninja appears twice — once per jutsu. If you want one row per ninja, you have two options: aggregate with COUNT/GROUP_CONCAT, or remove the jutsu triple pattern and accept that jutsu data isn't in this query.
# Returns N rows, one per (ninja, jutsu) combination
SELECT ?name ?jutsuLabel
WHERE {
  ?ninja schema:name ?name ;
         sensemaking:hasJutsu ?j .
  ?j skos:prefLabel ?jutsuLabel .
}

# Returns one row per ninja, jutsu concatenated
SELECT ?name (GROUP_CONCAT(?jutsuLabel; separator=", ") AS ?jutsu)
WHERE {
  ?ninja schema:name ?name .
  OPTIONAL {
    ?ninja sensemaking:hasJutsu ?j .
    ?j skos:prefLabel ?jutsuLabel .
  }
}
GROUP BY ?name
ASK is an assertion, not a lookup. The most important thing to know about it: a result of false does not mean the thing does not exist. Under the open-world assumption, it means the pattern was not satisfied by data in this endpoint at this moment. "No data asserting X" is not the same as "X is false."
# true if any ninja has the Sharingan in this dataset
ASK WHERE { ?ninja sensemaking:hasJutsu sensemaking:Sharingan . }

# false, but that means only "not asserted here"
ASK WHERE { ?ninja sensemaking:hasJutsu sensemaking:FireStyleJutsu . }
CONSTRUCT takes a template — triples with variable positions — and fills it in for each matched solution. The result is a new RDF graph, not a table. In Fuseki, switch the result format dropdown to Turtle to read it.
# Infer "colleague" from shared team membership
CONSTRUCT {
  ?a sensemaking:colleagueOf ?b .
}
WHERE {
  ?a sensemaking:memberOfTeam ?team .
  ?b sensemaking:memberOfTeam ?team .
  FILTER (STR(?a) < STR(?b))
}
CONSTRUCT is how you materialize inferred triples that a reasoner would derive — useful when you want to precompute results for performance, or when you're transforming data from one vocabulary to another without running a full reasoner.
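The template-filling step can be pictured as a loop over matched solutions that emits one triple per solution. The following is a toy sketch over invented membership data, not a description of how any endpoint implements CONSTRUCT.

```python
from itertools import product

# Invented membership facts: (ninja, team).
member_of = [
    ("ex:naruto", "ex:team7"),
    ("ex:sakura", "ex:team7"),
    ("ex:sasuke", "ex:team7"),
    ("ex:shikamaru", "ex:team10"),
]


def construct_colleagues(memberships):
    """For each matched (?a, ?b) pair on the same team, emit one colleagueOf triple."""
    graph = set()                                  # an RDF graph is a set: duplicates collapse
    for (a, team_a), (b, team_b) in product(memberships, repeat=2):
        if team_a == team_b and a < b:             # plays the role of FILTER (STR(?a) < STR(?b))
            graph.add((a, "ex:colleagueOf", b))
    return graph


for triple in sorted(construct_colleagues(member_of)):
    print(triple)
# ('ex:naruto', 'ex:colleagueOf', 'ex:sakura')
# ('ex:naruto', 'ex:colleagueOf', 'ex:sasuke')
# ('ex:sakura', 'ex:colleagueOf', 'ex:sasuke')
```

The string comparison drops self-pairs and keeps only one direction per pair, so three teammates yield three triples rather than nine.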
DESCRIBE is the most exploratory form. Give it a URI and it returns whatever the endpoint considers "relevant" about that resource. Fuseki's default is all triples where the URI is the subject. Some endpoints include incoming triples too (triples where the URI is the object).
# Everything the endpoint knows about Kakashi
DESCRIBE sensemaking:KakashiHatake

# DESCRIBE also accepts WHERE clauses
DESCRIBE ?ninja WHERE { ?ninja sensemaking:senseiOf sensemaking:NarutoUzumaki . }
DESCRIBE is most useful when you do not yet know what predicates a resource has — it is the "tell me everything" query. Once you know the shape of the data, SELECT is almost always clearer and more precise.
Default to SELECT. Use ASK when you need a boolean check. Use CONSTRUCT when you need to produce new triples or transform data. Use DESCRIBE when exploring unknown data — find out what predicates exist, then write a targeted SELECT.
These three queries run against Wikidata's public SPARQL endpoint at query.wikidata.org. Copy each one, paste it into the query editor, click Run. Each one also matches an exercise from Exercise 1.1.
Wikidata uses wd: for entity IRIs (wd:Q49108 = MIT) and wdt: for property IRIs (wdt:P69 = "educated at"). The SERVICE wikibase:label block is a Wikidata extension that resolves human-readable labels — without it you see raw IRIs. These conventions are Wikidata-specific; standard SPARQL endpoints do not have them.
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>

SELECT ?book ?bookLabel ?pubYear
WHERE {
  ?book wdt:P50 wd:Q21050585 .   # P50 = author; Q21050585 = Tchaikovsky
  OPTIONAL {
    ?book wdt:P577 ?pubDate .
    BIND(YEAR(?pubDate) AS ?pubYear)
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
ORDER BY DESC(?pubYear)
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>
# Wikidata's sitelink triples use the http:// form of the schema.org namespace
PREFIX schema: <http://schema.org/>

SELECT ?person ?personLabel
WHERE {
  ?person wdt:P69 wd:Q49108 .   # P69 = educated at; Q49108 = MIT
  ?article schema:about ?person ;
           schema:inLanguage "en" ;
           schema:isPartOf <https://en.wikipedia.org/> .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
ORDER BY ?personLabel
LIMIT 30
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?occ ?label
WHERE {
  ?occ wdt:P31 wd:Q28640 ;   # P31 = instance of; Q28640 = occupation
       rdfs:label ?label .
  FILTER (LANG(?label) = "en")
  FILTER (CONTAINS(LCASE(?label), "data"))
}
ORDER BY ?label
Wikidata queries must complete in 60 seconds. Complex patterns without LIMIT, or patterns that hit very large sets (all people, all items), will time out. Always add a LIMIT clause when exploring. The helper UI at query.wikidata.org shows example queries for common domains — modify those rather than starting from scratch.
If a resource has schema:name "Olympians"@en, "Caelestes"@la, a query that binds ?name to that predicate without a LANG filter returns two rows per resource. Aggregations double. Counts are wrong. Add FILTER (LANG(?name) = "en") whenever the data has language-tagged literals and you only want one language. The workbook drills this repeatedly.
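The doubling is easy to see on a toy dataset. In this sketch the second resource and its labels are invented; the list comprehension with the language test stands in for the LANG filter.

```python
# Sample names: the first resource's labels come from the text above, the second is invented.
names = [
    ("ex:olympians", "Olympians", "en"),
    ("ex:olympians", "Caelestes", "la"),
    ("ex:titans", "Titans", "en"),
    ("ex:titans", "Titanes", "la"),
]

# No filter: every language-tagged literal produces its own row.
unfiltered = [(subject, label) for subject, label, lang in names]

# Equivalent of FILTER (LANG(?name) = "en")
english = [(subject, label) for subject, label, lang in names if lang == "en"]

print(len(unfiltered))  # 4 rows for 2 resources
print(len(english))     # 2 rows, one per resource
```

Any COUNT computed over the unfiltered rows would report four where the data describes only two resources.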
OPTIONAL { ?ninja sensemaking:hasJutsu ?j } keeps the ninja in the result with ?j unbound if they have no jutsu. But if they have two jutsu, they appear twice — once per jutsu. Row multiplication is the most common OPTIONAL surprise. Use GROUP_CONCAT or restructure to avoid it.
Blank node labels (such as _:b0) in query results cannot be referenced in subsequent queries. If your data uses blank nodes for intermediate structure (anonymous Employment events, reification nodes), you will see them in DESCRIBE output but cannot construct further patterns from them without going through their connecting predicates first.
Even if the ontology declares sensemaking:rivalOf as owl:SymmetricProperty, a plain SPARQL query against the raw data sees only explicitly asserted triples. The symmetric inverse is not there unless you materialize it with CONSTRUCT, enable an OWL-aware reasoner, or add both directions to the data. Module 3 covers the reasoning layer. For now: what SPARQL sees is what is in the store, nothing more.
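A small sketch of the difference between asserted and materialized data, using an invented two-triple graph. The helper names are hypothetical; `materialize_symmetric` stands in for what a CONSTRUCT query or a reasoner would add.

```python
# Invented sample graph with only one direction of the rivalry asserted.
asserted = {
    ("ex:naruto", "ex:rivalOf", "ex:sasuke"),
    ("ex:naruto", "schema:name", "Naruto"),
}


def materialize_symmetric(graph, predicate):
    """Return the graph plus the reversed copy of every triple that uses `predicate`."""
    inverses = {(o, p, s) for s, p, o in graph if p == predicate}
    return graph | inverses


def rivals_of(graph, who):
    """What a plain pattern { who ex:rivalOf ?x } would find in the given graph."""
    return sorted(o for s, p, o in graph if s == who and p == "ex:rivalOf")


print(rivals_of(asserted, "ex:sasuke"))
# []
print(rivals_of(materialize_symmetric(asserted, "ex:rivalOf"), "ex:sasuke"))
# ['ex:naruto']
```

Until the inverse triple is actually in the store, the query from Sasuke's side returns nothing.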
The Sub-module 1.2 workbook (1-2-workbook-naruto.html or the mythology variant) has five queries that go beyond the 1.1 workbook: DESCRIBE, UNION, property paths, FILTER NOT EXISTS, and VALUES. Start Fuseki first, then work through each query card before opening the "Think about this" prompts. Prediction before execution is the habit that makes SPARQL stick.
DuCharme's Learning SPARQL chapters 1–2 cover the foundations with more examples than this page has room for. The W3C SPARQL 1.1 spec's overview section is denser but authoritative on edge cases. And the Wikidata query service helper UI has dozens of working example queries to study and modify.