[Mulgara-dev] Re: Blank nodes in queries (was: Recent commits to Mulgara.)

Andrae Muys andrae at netymon.com
Mon Oct 9 12:35:15 UTC 2006


On 09/10/2006, at 8:56 PM, Seaborne, Andy wrote:
>> From: Andrae Muys <mailto:andrae at netymon.com>
>> On 07/10/2006, at 3:27 AM, Seaborne, Andy wrote:
>>
> <snip/>
>>
>> Considering the query you included below:
>>
>> SELECT ?x WHERE { ?x :p ?y }
>>
>> The query pattern is:                ?x :p ?y
>> The distinguished variables are:     ?x
>> The non-distinguished variables are: ?y
>
> SPARQL also makes it a little bit more difficult - there's three  
> species
> of variable.  SPARQL is an algebra over basic graph pattern  
> matching so
> a variable can be non-distinguished and used in only one basic graph
> pattern or non-distinguished but used in two or more basic graph
> patterns, or a filter.  In the latter case, it will need a value,  
> in the
> former, it does not need a binding.

I don't see how the one-graph pattern case differs from the multi- 
graph pattern case.  However I do recognise your difficulty with  
filter.  The variables in a graph pattern are existential  
placeholders subject to unification with the targeted rdf graph - the  
unification generating a set of bindings that constitute the result.   
FILTER is defined as a predicate *function*, consequently it's  
variables are not existentials, but rather formal parameters.  I feel  
the decision to introduce this distinction was both unnecessary and  
unfortunate, however it has been made, and we therefore must  
recognise the distinction when dealing with SPARQL query semantics.

At this stage Mulgara does not support FILTER in this manner.  The  
subset we have implemented to date is also expressed in terms of  
unification, and consequently in all my email I use the term  
variables as existentials.  If I wish to discuss 'filter variables',  
I will use the terms 'parameters', or 'arguments'.

>> As far as I am concerned a blank node in a query is to be treated no
>> differently from any other node-type.  Given blank-node semantics,
> they
>> naturally carry identify only with respect to their containing graph.
>> Therefore any blank-node that is going to be included within
>> a query must have been obtained from that graph by a prior query.
>> Any blank-node that was not obtained from the graph being queried, by
>> definition cannot match a blank-node, and therefore any pattern
>> containing such a bnode will fail.
>
> A blank node is existentially quanitified just outside the graph -
>
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006JulSep/ 
> 0153.html
> and "What counts as an RDF graph":
> http://www.ihmc.us/users/phayes/RDFGraphSyntax.html
>
> My underatnding is that it has identity but it's not uniquely
> quanitified over the domain of discourse.

Thank-you for the links.  They have helped me clarify my  
understanding of blank-nodes.  Still while I see I should have  
expressed myself differently, the conclusion remains valid.  An  
anonymous blank-node in a graph-pattern is a unique resource that  
cannot unify with any other blank-node in the target graph (or in the  
query for that matter).  To include a blank-node from a graph in a  
query, this must have been assigned a 'name' (a bnodeid), however  
temporarily.  That name will have a scope, in the case of Mulgara  
that scope will be the current Transaction as we have no concept of a  
stable bnodeid across transactions.  However other databases may well  
choose to define alternative 'scopes' for their names.

I understand the terminology the DAWG is using for this concept is  
'told bnodes'.  Certainly that is the term used in the two links you  
provided.  'told bnodes' are the "single exception noted above" I  
mentioned.  Blank nodes within the target graph, having been provided  
with names whose scope includes the query in question, being  
referenced explicitly from within the query by use of those names.

>>> If a blank node is treated as purely and only existential, then it
>>> might be that block nodes need different treatment to named
> variables.
>>> On the
>>> other hand, and certainly for access RDF graph with simple
> entailment,
>>> it might be that systems are not forbidden in treating blank nodes
> in
>>> queries as implicit anonymous variables with the same semantics as
>>> other variables.

However this does not appear to be describing 'told bnodes'.  Rather  
it appears to be conflating blank-nodes with anonymous variables.   
The problem with this is that blank-nodes are already defined to have  
a global scope, and to be distinct.  It appears you are wanting  
something along the following lines:

Given

_:1 :p :a
_:2 :p :b

and the query pattern:

{ _:5 :p ?x }

I'm interpreting the above paragraph to intend the result to be:

{ { ?x -> :a }, { ?x -> :b } }

which would imply that _:1 === _:5 === _:2 or in other words _:1 ===  
_:2 - which is clearly invalid.

On the other hand I could see the following query pattern:

{ ?_ :p ?x }

returning the above result.  If what you want is anonymous variables  
then you might as well use the same syntactic conventions as every  
other language that supports pattern-matching and/or unification, and  
simply use underscore.  I have every intent of providing $_ for  
Mulgara at some stage for precisely this purpose (and since we  
already support [ ... ] patterns, it's probably only about a days work).

> Ah - I see - the blank node need not be identified as a blank node  
> but,
> as an existenial, may be matched to a term in the graph or indeed
> (OWL/DL) with the existence of something with no materialised term.
>
> An OWL/DL system could create a term for that thing - it just that  
> they
> don't.  It can be tricky in the case of co-reference but as the
> existentials are not distinguihed that information is lost anyway.

I definately don't understand you here.  My understanding of a blank  
node is as an existential, so how should it be identified '[not] as a  
blank node but as an existential'?  I'd also like to check that by  
"indeed (OWL/DL) ... no materialised term", you are refering to  
virtual and/or inferred graphs?  Certainly Mulgara makes extensive  
use of what I believe you refer to as 'non-materialised terms', I  
refer here to the XSD resolver in particular, as well as resolvers in  
general.  I really must get around to finishing the arithmetic  
resolver, and I think I know how to implement a regular-expression  
resolver as well.

Could you elaborate on the co-reference case?  I'm curious what this  
is referring to.

Andrae

-- 
Andrae Muys
andrae at netymon.com
Principal Mulgara Consultant
Netymon Pty Ltd





More information about the Mulgara-dev mailing list