Anzo 3.0 includes the Glitter engine for executing SPARQL queries against the Anzo RDF repository or other underlying data sources.
Glitter supports nearly the full SPARQL specification, lacking only some of the mandated datatype cast functions. In addition, in Anzo 3.0 Glitter provides the following SPARQL extensions:
Functional predicates
A functional predicate (also known in some circles as a property function or "magic predicate") gives a predicate within a triple pattern a special meaning. Glitter supports two functional predicates:
textlike
The predicate http://openanzo.org/predicates/textlike can be used within a SPARQL query to find literals that match a wildcard pattern. It is similar to SQL's LIKE operator and uses the same syntax, with % as the wildcard. For example, the following query finds articles whose title contains the string Anzo:
PREFIX fp: <http://openanzo.org/predicates/>
...
SELECT ?title {
?article dc:title ?title .
?title fp:textlike "%Anzo%" .
}
The same query could be written using SPARQL's standard regex filter function:
SELECT ?title {
?article dc:title ?title .
FILTER(regex(str(?title), "Anzo")) .
}
In Anzo 3.0, however, regex acts on an in-memory result set whereas the textlike functional predicate is executed directly against the underlying relational database. As such, most simple text-search queries are more efficient when using textlike.
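To make the relationship between the two queries concrete, the following Python sketch translates a LIKE-style wildcard pattern into an equivalent regular expression. It illustrates the pattern semantics only and is not how Glitter evaluates textlike. The helper name like_to_regex is hypothetical, and the handling of the single-character wildcard _ and of case sensitivity are assumptions borrowed from SQL LIKE; the actual behavior depends on the underlying database.

```python
import re

def like_to_regex(pattern):
    """Translate a LIKE-style pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")           # % matches any run of characters, including none
        elif ch == "_":
            parts.append(".")            # assumption: _ matches exactly one character
        else:
            parts.append(re.escape(ch))  # every other character matches itself
    return re.compile("".join(parts), re.DOTALL)

# Hypothetical titles standing in for dc:title literals.
titles = ["Getting started with Anzo", "Anzo", "SPARQL basics"]

# fullmatch anchors the pattern at both ends, as LIKE does.
matches = [t for t in titles if like_to_regex("%Anzo%").fullmatch(t)]
print(matches)
```

Because LIKE patterns are anchored at both ends, "%Anzo%" corresponds to the unanchored regex "Anzo" used in the FILTER above.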
textmatch
The predicate http://openanzo.org/predicates/textmatch allows a SPARQL query access to Anzo's text-indexing capabilities. (@@ link to text-indexing docs needed) textmatch is similar to textlike in that it relates a literal value to a search string; however, textmatch uses specialized text indexes that allow a broader range of search options. Further, because textmatch is executed against a specialized text index that also contains 'nearby' subject and property data, queries using textmatch can often be executed without going to the underlying database at all.
@@ example of more complex search using textmatch
SELECT expressions
In addition to selecting the values of variables bound in a SPARQL query, Glitter supports selecting calculated values. Glitter extends SPARQL's SELECT clause to take parenthesized expressions that specify how to compute a projected value. An expression is followed by the AS keyword and then a variable that will be used to store the computed values in the query's result set.
For example, SELECT expressions can be used to calculate the cost of the line items of a particular order:
PREFIX ex: <...>
SELECT ?item ?quantity (?quantity * ?unitprice AS ?cost)
{
?order a ex:Order ;
ex:item ?item ;
ex:qty ?quantity .
?item ex:price ?unitprice .
}
Each result from this query would contain bindings for three variables: ?item (the URI of the item being ordered), ?quantity (the number of units of the item being ordered), and ?cost (the total cost of this item in this order, calculated as the quantity multiplied by the unit cost of the item).
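The projection step can be pictured with a small Python sketch. It assumes the query pattern has already produced a list of solutions; the sample items, quantities and prices are invented for illustration, and the function name project is hypothetical rather than part of Glitter.

```python
from decimal import Decimal

# Hypothetical solutions produced by the query pattern, before projection.
solutions = [
    {"item": "ex:widget", "quantity": 3, "unitprice": Decimal("2.50")},
    {"item": "ex:gadget", "quantity": 1, "unitprice": Decimal("10.00")},
]

def project(solution):
    """Mimic SELECT ?item ?quantity (?quantity * ?unitprice AS ?cost)."""
    return {
        "item": solution["item"],
        "quantity": solution["quantity"],
        # The computed value is stored under the variable named after AS.
        "cost": solution["quantity"] * solution["unitprice"],
    }

results = [project(s) for s in solutions]
for row in results:
    print(row)
```

Note that ?unitprice is used in the expression but does not appear in the results, because it is not itself listed in the SELECT clause.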
SPARQL functions (including extension functions) can also be used. The following query finds all literals in the dataset and breaks them into their lexical form, datatype and language:
SELECT DISTINCT (str(?literal) AS ?lexical) (datatype(?literal) AS ?datatype) (lang(?literal) AS ?lang)
{
?s ?p ?literal .
FILTER(isLiteral(?literal))
}
Aggregates and GROUP BY
Glitter supports aggregate queries: queries that group sets of results together and return one row for each group of results. Such queries can also project the results of aggregate functions, which act on an entire group of results to produce a single value. Glitter treats a query as an aggregate query if either:
- It contains an explicit GROUP BY clause
- Its SELECT clause contains at least one expression that involves an aggregate function
If an aggregate query does not contain a GROUP BY clause then the entire (pre-aggregated) result set is treated as a single group of results. In this case, the aggregate query's result set will have a single row representing the aggregated results of the entire query. When there is no GROUP BY clause, each element in the SELECT clause must involve an aggregate function.
Alternatively, the GROUP BY clause can list one or more variables that are used to break a (pre-aggregated) result set into multiple groups of results. There will be one group of results for each distinct combination of bindings for the variables in the GROUP BY clause. The GROUP BY clause is added after the query pattern (the WHERE clause) and before any ORDER BY clause. Apart from aggregate expressions, a query with a GROUP BY clause may project only variables (and expressions involving variables) that are mentioned in the GROUP BY clause.
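The grouping rules above can be sketched in Python as follows. This is an illustration of the semantics described here, not Glitter's implementation; the function name group_solutions and the sample data are hypothetical.

```python
from collections import defaultdict

def group_solutions(solutions, group_vars=()):
    """Partition solutions by their bindings for group_vars.

    An empty group_vars models a query with no GROUP BY clause:
    the whole result set is treated as a single group.
    """
    if not group_vars:
        return {(): list(solutions)}
    groups = defaultdict(list)
    for solution in solutions:
        # One group per distinct combination of bindings.
        key = tuple(solution.get(var) for var in group_vars)
        groups[key].append(solution)
    return dict(groups)

# Hypothetical pre-aggregated result set.
solutions = [
    {"dept": "engineering", "title": "manager"},
    {"dept": "engineering", "title": "engineer"},
    {"dept": "sales", "title": "manager"},
    {"dept": "engineering", "title": "engineer"},
]

by_dept = group_solutions(solutions, ("dept",))   # like GROUP BY ?dept
whole_set = group_solutions(solutions)            # no GROUP BY clause

for key, members in by_dept.items():
    print(key, len(members))
```

Each key in the returned dictionary becomes one row of the aggregate query's result set, and aggregate functions are evaluated over the list of solutions stored under that key.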
Glitter supports the following aggregate functions:
COUNT
COUNT returns the number of solutions in a group.
For example, if:
SELECT ?dept ?title
WHERE {
?dept foaf:member ?person .
?person foaf:title ?title .
}
returns
{
{?dept='engineering', ?title="manager"},
{?dept='engineering', ?title="manager"},
{?dept='engineering', ?title="engineer"},
{?dept='engineering', ?title="engineer"}
}
then
SELECT ?dept ?title (COUNT(*) AS ?count)
WHERE {
?dept foaf:member ?person .
?person foaf:title ?title .
}
GROUP BY ?dept ?title
would return
{
{?dept='engineering', ?title="manager", ?count=2},
{?dept='engineering', ?title="engineer", ?count=2}
}
There are a few different versions of COUNT:
- COUNT(*) (as above) returns the number of solutions in each group
- COUNT(?var) returns the number of solutions in the group in which ?var is bound
- COUNT(DISTINCT *) returns the number of distinct solutions in each group
- COUNT(DISTINCT ?var1 ?var2 ?var3) returns the number of distinct combinations of the three variables in each group, excluding any solutions in which all three variables are unbound
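The differences between the four forms are easiest to see side by side. The Python sketch below applies each one to a single group of solutions, following the descriptions in the list above. The function names and sample data are hypothetical, and an unbound variable is modeled either as a missing key or as None.

```python
def count_star(group):
    """COUNT(*): number of solutions in the group."""
    return len(group)

def count_var(group, var):
    """COUNT(?var): number of solutions in which var is bound."""
    return sum(1 for s in group if s.get(var) is not None)

def count_distinct_star(group):
    """COUNT(DISTINCT *): number of distinct solutions."""
    return len({
        frozenset((k, v) for k, v in s.items() if v is not None)
        for s in group
    })

def count_distinct_vars(group, *variables):
    """COUNT(DISTINCT ?a ?b ...): distinct combinations, skipping all-unbound."""
    combos = {tuple(s.get(v) for v in variables) for s in group}
    combos.discard(tuple(None for _ in variables))
    return len(combos)

# Hypothetical group of four solutions; ?fax is never bound.
group = [
    {"person": "ex:alice", "title": "manager", "phone": "555-0100"},
    {"person": "ex:bob", "title": "manager"},
    {"person": "ex:bob", "title": "manager"},            # duplicate solution
    {"person": "ex:carol", "title": "engineer", "phone": None},
]

print(count_star(group))                             # 4
print(count_var(group, "phone"))                     # 1
print(count_distinct_star(group))                    # 3
print(count_distinct_vars(group, "title", "phone"))  # 3
print(count_distinct_vars(group, "phone", "fax"))    # 1
```

The last call shows the exclusion rule: three of the four solutions bind neither ?phone nor ?fax, so they contribute nothing to the count.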
Query validity
Glitter rejects some queries that are legal according to the SPARQL specification but are likely to be incorrectly written or would have unexpected results.
The following query is rejected because the variable ?x does not appear anywhere in the query pattern:
SELECT ?x ?y ?z
{
?y a ?z .
}
The following query is rejected because it is an aggregate query that attempts to project a variable that is not part of the result grouping:
PREFIX ex: <...>
SELECT ?customer ?order (COUNT(*) AS ?count)
{
?customer ex:order ?order .
} GROUP BY ?customer
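The two checks illustrated above can be sketched as a small Python function. This is a simplified model under stated assumptions, not Glitter's actual validation code: the function name validate and its parameters are hypothetical, the query is assumed to be already parsed into sets of variable names, and only plainly projected variables are examined (not variables nested inside expressions).

```python
def validate(projected_vars, pattern_vars, group_by=None, has_aggregates=False):
    """Return a list of problems found in a parsed query (empty if none).

    projected_vars: variables listed directly in the SELECT clause
    pattern_vars:   variables that appear in the query pattern
    group_by:       variables in the GROUP BY clause, or None if absent
    has_aggregates: True if the SELECT clause uses an aggregate function
    """
    problems = []

    # Check 1: every projected variable must appear in the query pattern.
    for var in projected_vars:
        if var not in pattern_vars:
            problems.append(f"?{var} does not appear in the query pattern")

    # Check 2: in an aggregate query, plain projected variables must be grouped.
    if group_by is not None or has_aggregates:
        grouped = set(group_by or ())
        for var in projected_vars:
            if var in pattern_vars and var not in grouped:
                problems.append(
                    f"?{var} is projected but is not part of the result grouping"
                )
    return problems

# The two rejected queries from this section, in parsed form.
first = validate(["x", "y", "z"], {"y", "z"})
second = validate(["customer", "order"], {"customer", "order"},
                  group_by=["customer"], has_aggregates=True)
print(first)
print(second)
```

Under this model, the second query becomes acceptable if ?order is either removed from the SELECT clause or added to the GROUP BY clause.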


