SPARQL: the W3C Query Language for RDF

Select all subjects (x) such that x <http://www.w3.org/2001/vcard-rdf/3.0#FN> "John Smith"

SELECT ?x
WHERE { ?x  <http://www.w3.org/2001/vcard-rdf/3.0#FN>  "John Smith" }

Here ?x represents a variable called x. The ? is not part of the name, which is why it does not appear in the tabular result output.

Here is another:

SELECT ?x ?fname
WHERE {?x  <http://www.w3.org/2001/vcard-rdf/3.0#FN>  ?fname}

Yet another:

SELECT ?givenName
WHERE
  { ?y  <http://www.w3.org/2001/vcard-rdf/3.0#Family>  "Smith" .
    ?y  <http://www.w3.org/2001/vcard-rdf/3.0#Given>  ?givenName .
  }

Solutions

Query solutions are a set of pairs of a variable name with a value. A SELECT query directly exposes the solutions (after order/limit/offset are applied) as the result set - other query forms use the solutions to make a graph. The solution is the way the pattern matched - which values the variables must take for a pattern to match.
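To make the idea of a solution concrete, here is a minimal Python sketch that matches one triple pattern against a small in-memory list of triples. It is a toy model, not how any SPARQL engine is implemented: the sample data, the http://somewhere/JohnSmith/ resource, and the helper names is_var and match_pattern are all assumptions made for illustration.

```python
# Illustrative only: a toy matcher, not a real SPARQL engine.
FN = "http://www.w3.org/2001/vcard-rdf/3.0#FN"

# Hypothetical sample data as (subject, predicate, object) tuples.
TRIPLES = [
    ("http://somewhere/JohnSmith/", FN, "John Smith"),
    ("http://somewhere/RebeccaSmith/", FN, "Becky Smith"),
]


def is_var(term):
    """A term is a variable if it is written with a leading '?'."""
    return isinstance(term, str) and term.startswith("?")


def match_pattern(pattern, triples):
    """Return one solution (variable name -> value) per matching triple."""
    solutions = []
    for triple in triples:
        binding = {}
        for p_term, t_term in zip(pattern, triple):
            if is_var(p_term):
                name = p_term[1:]  # the '?' is not part of the name
                if binding.setdefault(name, t_term) != t_term:
                    break  # same variable would need two different values
            elif p_term != t_term:
                break  # a constant term does not match
        else:
            solutions.append(binding)
    return solutions


solutions = match_pattern(("?x", FN, "John Smith"), TRIPLES)
print(solutions)  # [{'x': 'http://somewhere/JohnSmith/'}]
```

Each dictionary is one solution: the values the variables must take for the pattern to match. Replacing the literal with a second variable, as in ("?x", FN, "?fname"), yields one solution per triple with that property.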

Basic Patterns

QNames

Consider:

SELECT ?givenName
WHERE
  { ?y  <http://www.w3.org/2001/vcard-rdf/3.0#Family>  "Smith" .
    ?y  <http://www.w3.org/2001/vcard-rdf/3.0#Given>  ?givenName .
  }

There is a shorthand mechanism for writing long URIs using prefixes. The query above is more clearly written as

PREFIX vcard:      <http://www.w3.org/2001/vcard-rdf/3.0#>

SELECT ?givenName
WHERE
 { ?y vcard:Family "Smith" .
   ?y vcard:Given  ?givenName .
 }

This is a prefixing mechanism - the two parts of the URIs, from the prefix declaration and from the part after the “:” in the qname, are concatenated together. This is strictly not what an XML qname is but uses the RDF rule for turning a qname into a URI by concatenating the parts.
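The concatenation rule can be sketched in a few lines of Python. The PREFIXES table and the expand helper are hypothetical names used only for this illustration; a real SPARQL parser also handles details such as the empty prefix and escaping, which are omitted here.

```python
# Illustrative only: prefixed-name expansion by plain string concatenation.
PREFIXES = {"vcard": "http://www.w3.org/2001/vcard-rdf/3.0#"}


def expand(prefixed_name, prefixes):
    """Split at the first ':' and concatenate namespace and local part."""
    prefix, local = prefixed_name.split(":", 1)
    return prefixes[prefix] + local


family = expand("vcard:Family", PREFIXES)
given = expand("vcard:Given", PREFIXES)
print(family)  # http://www.w3.org/2001/vcard-rdf/3.0#Family
```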

Blank Nodes

Change the query just a little to return y as well

PREFIX vcard:      <http://www.w3.org/2001/vcard-rdf/3.0#>

SELECT ?y ?givenName
WHERE
 { ?y vcard:Family "Smith" .
   ?y vcard:Given  ?givenName .
 }

and the blank nodes appear. ?y is bound to blank nodes, shown as _:b0 and _:b1 in a given implementation, or generally as an odd-looking, qname-like label starting with _:

SPARQL Filters

String Matching

SPARQL provides an operation to test strings, based on regular expressions. This includes the ability to perform SQL “LIKE”-style tests, although the syntax of the regular expression is different from SQL.

The syntax is:

FILTER regex(?x, "pattern" [, "flags"])

The flags argument is optional. The flag “i” means a case-insensitive pattern match is done.

The example query finds given names with an “r” or “R” in them.

PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>

SELECT ?g
WHERE
{ ?y vcard:Given ?g .
  FILTER regex(?g, "r", "i") }

The regular expression language is the same as the XQuery regular expression language, which is a codified version of that found in Perl.
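For readers who want to try the matching behaviour outside a SPARQL engine, the following Python sketch applies the same test to a list of given names. Python's re module is not the XQuery dialect, so this is only an approximation that agrees for a simple pattern like "r"; the list of names is hypothetical sample data.

```python
import re

# Hypothetical given names standing in for the ?g bindings.
given_names = ["John", "Rebecca", "Sarah", "Matthew"]

# re.IGNORECASE plays the role of the "i" flag. re.search looks for the
# pattern anywhere in the string, as regex() does for an unanchored pattern.
pattern = re.compile("r", re.IGNORECASE)
matches = [g for g in given_names if pattern.search(g)]
print(matches)  # ['Rebecca', 'Sarah']
```

Without the case-insensitive flag, only names containing a lowercase "r" would match, so "Rebecca" would be dropped.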

Testing Values

There are times when the application wants to filter on the value of a variable. In the data file vc-db-2.rdf, we have added an extra field for age. Age is not defined by the vCard schema, so we have created a new property for the purpose of this tutorial. RDF allows such mixing of different definitions of information because URIs are unique. Note also that the info:age property value is typed.

In this extract of the data, we show the typed value. It can also be written as the plain number 23.

<http://somewhere/RebeccaSmith/>
    info:age "23"^^xsd:integer ;
    vCard:FN "Becky Smith" ;
    vCard:N [ vCard:Family "Smith" ;
              vCard:Given  "Rebecca" ] .

So, a query (q-f2.rq) to find the resources for people who are aged 24 or older is:

PREFIX info: <http://somewhere/peopleInfo#>

SELECT ?resource
WHERE
  {
    ?resource info:age ?age .
    FILTER (?age >= 24)
  }
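The role of the datatype in such a comparison can be sketched in Python: the lexical form "23" is converted to a number before the >= 24 test is applied. This is a simplified model under stated assumptions: the rows other than Rebecca Smith's are invented sample data, and numeric_value is a hypothetical helper that handles only xsd:integer.

```python
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

# (resource, lexical form, datatype) rows for info:age.
# Only the first row comes from the extract above; the second is hypothetical.
ages = [
    ("http://somewhere/RebeccaSmith/", "23", XSD_INTEGER),
    ("http://somewhere/JohnSmith/", "25", XSD_INTEGER),
]


def numeric_value(lexical, datatype):
    """Convert a typed literal to a Python int (xsd:integer only)."""
    if datatype != XSD_INTEGER:
        raise ValueError("not an integer literal")
    return int(lexical)


# Counterpart of FILTER (?age >= 24)
selected = [res for res, lex, dt in ages if numeric_value(lex, dt) >= 24]
print(selected)  # ['http://somewhere/JohnSmith/']
```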

Optional Information

OPTIONAL is a binary operator that combines two graph patterns. The optional pattern is any group pattern and may involve any SPARQL pattern types. If the group matches, the solution is extended; if not, the original solution is kept unchanged.

The following query gets the name of a person and also the age if that piece of information is available.

PREFIX info:    <http://somewhere/peopleInfo#>
PREFIX vcard:   <http://www.w3.org/2001/vcard-rdf/3.0#>

SELECT ?name ?age
WHERE
{
    ?person vcard:FN  ?name .
    OPTIONAL { ?person info:age ?age }
}

If only some of the people in the data have age properties, then their matching query solutions will have that information. However, because the triple pattern for the age is optional, there is a pattern solution for the remaining people who don’t have age information.


If the optional clause had not been there, no age information would have been retrieved. If the triple pattern had been included but had not been made optional, as in the query:

PREFIX info:   <http://somewhere/peopleInfo#>
PREFIX vcard:  <http://www.w3.org/2001/vcard-rdf/3.0#>

SELECT ?name ?age
WHERE
{
    ?person vcard:FN  ?name .
    ?person info:age ?age .
}

then we would only get solutions for people with a stated age.
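The difference between the two queries can be modelled in Python as a left join versus an ordinary join. This is a sketch with invented data (only Becky Smith's age of 23 appears in the extract earlier); the dictionaries and the two helper functions are hypothetical stand-ins for the triple patterns.

```python
# Hypothetical data: a name for every person, an age only for some.
names = {"person1": "Becky Smith", "person2": "John Smith"}
ages = {"person1": 23}


def with_optional_age(names, ages):
    """Keep every name solution; add 'age' only when one is available."""
    solutions = []
    for person, name in names.items():
        solution = {"name": name}
        if person in ages:
            solution["age"] = ages[person]  # optional part matched: extend
        solutions.append(solution)  # kept either way
    return solutions


def with_required_age(names, ages):
    """Both patterns are required: people without an age drop out."""
    return [
        {"name": name, "age": ages[person]}
        for person, name in names.items()
        if person in ages
    ]


optional_result = with_optional_age(names, ages)
required_result = with_required_age(names, ages)
print(optional_result)  # [{'name': 'Becky Smith', 'age': 23}, {'name': 'John Smith'}]
print(required_result)  # [{'name': 'Becky Smith', 'age': 23}]
```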

Optionals with FILTERs

The following query will get a person's name and age. The age will be included if available and if greater than 24.

PREFIX info:        <http://somewhere/peopleInfo#>
PREFIX vcard:      <http://www.w3.org/2001/vcard-rdf/3.0#>

SELECT ?name ?age
WHERE
{
    ?person vcard:FN  ?name .
    OPTIONAL { ?person info:age ?age . FILTER ( ?age > 24 ) }
}

If the filter condition is moved out of the optional part, then it can influence the number of solutions, but it may be necessary to make the filter more complicated to allow for the variable ?age being unbound:

PREFIX info:        <http://somewhere/peopleInfo#>
PREFIX vcard:      <http://www.w3.org/2001/vcard-rdf/3.0#>

SELECT ?name ?age
WHERE
{
    ?person vcard:FN  ?name .
    OPTIONAL { ?person info:age ?age . }
    FILTER ( !bound(?age) || ?age > 24 )
}

Evaluating an expression that has an unbound variable where a bound one was expected causes an evaluation error, and the whole expression fails.
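The effect of an unbound variable on a filter can be sketched in Python. The EvalError class and the value_of and keep helpers are hypothetical names, and the solutions are invented sample data. Python's short-circuiting `or` is used here as a rough stand-in for SPARQL's `||`, which is adequate only because the bound check comes first.

```python
class EvalError(Exception):
    """Stands in for a SPARQL expression evaluation error."""


def value_of(solution, var):
    """Look up a variable; an unbound variable is an evaluation error."""
    if var not in solution:
        raise EvalError(f"?{var} is unbound")
    return solution[var]


def keep(solution, expression):
    """A filter keeps a solution only if the expression evaluates to true."""
    try:
        return bool(expression(solution))
    except EvalError:
        return False  # an error rejects the solution


# Hypothetical solutions after the OPTIONAL part has been applied.
solutions = [
    {"name": "Becky Smith", "age": 23},
    {"name": "John Smith", "age": 25},
    {"name": "Sarah Jones"},  # no age: ?age is unbound
]

# Counterpart of FILTER ( ?age > 24 )
naive = [s["name"] for s in solutions
         if keep(s, lambda s: value_of(s, "age") > 24)]

# Counterpart of FILTER ( !bound(?age) || ?age > 24 )
guarded = [s["name"] for s in solutions
           if keep(s, lambda s: "age" not in s or value_of(s, "age") > 24)]

print(naive)    # ['John Smith']
print(guarded)  # ['John Smith', 'Sarah Jones']
```

The unguarded filter silently drops the person with no age, because the comparison raises an error; the guarded form keeps that person and still removes anyone whose stated age is 24 or less.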

Optionals and Other Dependency Queries

One thing to be careful of is using the same variable in two or more optional clauses (and not in some basic pattern as well):

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX vCard: <http://www.w3.org/2001/vcard-rdf/3.0#>

SELECT ?name
WHERE
{
  ?x a foaf:Person .
  OPTIONAL { ?x foaf:name ?name }
  OPTIONAL { ?x vCard:FN  ?name }
}

If the first optional binds ?name and ?x to some values, the second OPTIONAL is an attempt to match a ground triple (?x and ?name both have values). If the first optional part did not match, then the second one is an attempt to match its triple pattern with two variables.
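The order dependence can be illustrated with a small Python model in which each OPTIONAL is applied in turn. The data and the optional helper are hypothetical, and the sketch assumes at most one value per property per resource, which real data need not satisfy.

```python
FOAF_NAME = {"x1": "Alice"}                   # hypothetical foaf:name values
VCARD_FN = {"x1": "A. Example", "x2": "Bob"}  # hypothetical vCard:FN values


def optional(solutions, prop):
    """Left-join each solution with the pattern { ?x prop ?name }."""
    result = []
    for sol in solutions:
        value = prop.get(sol["x"])
        if "name" in sol:
            # ?name is already bound, so the pattern is a ground triple:
            # it can only confirm the existing value, never replace it.
            result.append(sol)
        elif value is not None:
            result.append({**sol, "name": value})  # bind ?name now
        else:
            result.append(sol)  # no match: keep the solution unchanged
    return result


people = [{"x": "x1"}, {"x": "x2"}]
step1 = optional(people, FOAF_NAME)  # OPTIONAL { ?x foaf:name ?name }
step2 = optional(step1, VCARD_FN)    # OPTIONAL { ?x vCard:FN  ?name }
names = {sol["x"]: sol.get("name") for sol in step2}
print(names)  # {'x1': 'Alice', 'x2': 'Bob'}
```

In this model x1 keeps the foaf:name value even though a different vCard:FN value exists, while x2 falls through to the second optional. Swapping the order of the two steps would change the name reported for x1.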

Union Queries