[Mulgara-dev] Re: Blank nodes in queries (was: Recent commits to Mulgara.)
Andrae Muys
andrae at netymon.com
Mon Oct 9 12:35:15 UTC 2006
On 09/10/2006, at 8:56 PM, Seaborne, Andy wrote:
>> From: Andrae Muys <mailto:andrae at netymon.com>
>> On 07/10/2006, at 3:27 AM, Seaborne, Andy wrote:
>>
> <snip/>
>>
>> Considering the query you included below:
>>
>> SELECT ?x WHERE { ?x :p ?y }
>>
>> The query pattern is: ?x :p ?y
>> The distinguished variables are: ?x
>> The non-distinguished variables are: ?y
>
> SPARQL also makes it a little bit more difficult - there's three species
> of variable. SPARQL is an algebra over basic graph pattern matching so
> a variable can be non-distinguished and used in only one basic graph
> pattern or non-distinguished but used in two or more basic graph
> patterns, or a filter. In the latter case, it will need a value, in the
> former, it does not need a binding.
I don't see how the one-graph pattern case differs from the multi-
graph pattern case. However I do recognise your difficulty with
filter. The variables in a graph pattern are existential
placeholders subject to unification with the targeted rdf graph - the
unification generating a set of bindings that constitute the result.
FILTER is defined as a predicate *function*, consequently its
variables are not existentials, but rather formal parameters. I feel
the decision to introduce this distinction was both unnecessary and
unfortunate, however it has been made, and we therefore must
recognise the distinction when dealing with SPARQL query semantics.
At this stage Mulgara does not support FILTER in this manner. The
subset we have implemented to date is also expressed in terms of
unification, and consequently in all my email I use the term
variables as existentials. If I wish to discuss 'filter variables',
I will use the terms 'parameters', or 'arguments'.
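To make the distinction concrete, here is a minimal Python sketch. It is not Mulgara code and not the SPARQL definition; the names project and apply_filter and the sample bindings are hypothetical. Variables in a graph pattern are bound by unification, giving a set of bindings; a filter is then an ordinary predicate function whose formal parameters are supplied from those bindings.

```python
from typing import Callable, Dict, List

Binding = Dict[str, str]


def project(solutions: List[Binding], distinguished: List[str]) -> List[Binding]:
    """Keep only the distinguished variables of each solution."""
    return [{v: s[v] for v in distinguished} for s in solutions]


def apply_filter(
    solutions: List[Binding],
    predicate: Callable[..., bool],
    params: List[str],
) -> List[Binding]:
    """Call the predicate with the bound values as plain arguments.

    The predicate never sees variables, only values, so every parameter
    must already carry a binding.
    """
    return [s for s in solutions if predicate(*(s[p] for p in params))]


# Hypothetical bindings, as if produced by unifying { ?x :p ?y } with a graph.
solutions = [{"?x": ":s1", "?y": "3"}, {"?x": ":s2", "?y": "7"}]

# ?y as a filter parameter: it needs a value.
filtered = apply_filter(solutions, lambda y: int(y) > 5, ["?y"])

# ?y as a non-distinguished existential: it is simply projected away.
projected = project(solutions, ["?x"])
```

In the projection case only the existence of some ?y matters, whereas the filter case cannot be evaluated without an actual value for it.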
>> As far as I am concerned a blank node in a query is to be treated no
>> differently from any other node-type. Given blank-node semantics, they
>> naturally carry identity only with respect to their containing graph.
>> Therefore any blank-node that is going to be included within
>> a query must have been obtained from that graph by a prior query.
>> Any blank-node that was not obtained from the graph being queried, by
>> definition cannot match a blank-node, and therefore any pattern
>> containing such a bnode will fail.
>
> A blank node is existentially quantified just outside the graph -
>
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006JulSep/0153.html
> and "What counts as an RDF graph":
> http://www.ihmc.us/users/phayes/RDFGraphSyntax.html
>
> My understanding is that it has identity but it's not uniquely
> quantified over the domain of discourse.
Thank you for the links. They have helped me clarify my
understanding of blank-nodes. Still, while I see I should have
expressed myself differently, the conclusion remains valid. An
anonymous blank-node in a graph-pattern is a unique resource that
cannot unify with any other blank-node in the target graph (or in the
query for that matter). To include a blank-node from a graph in a
query, this must have been assigned a 'name' (a bnodeid), however
temporarily. That name will have a scope, in the case of Mulgara
that scope will be the current Transaction as we have no concept of a
stable bnodeid across transactions. However other databases may well
choose to define alternative 'scopes' for their names.
I understand the terminology the DAWG is using for this concept is
'told bnodes'. Certainly that is the term used in the two links you
provided. 'Told bnodes' are the "single exception noted above" I
mentioned: blank nodes within the target graph that have been provided
with names whose scope includes the query in question, and that are
referenced explicitly from within the query by use of those names.
>>> If a blank node is treated as purely and only existential, then it
>>> might be that blank nodes need different treatment to named variables.
>>> On the other hand, and certainly for access RDF graph with simple
>>> entailment, it might be that systems are not forbidden in treating
>>> blank nodes in queries as implicit anonymous variables with the same
>>> semantics as other variables.
However this does not appear to be describing 'told bnodes'. Rather
it appears to be conflating blank-nodes with anonymous variables.
The problem with this is that blank-nodes are already defined to have
a global scope, and to be distinct. It appears you are wanting
something along the following lines:
Given
_:1 :p :a
_:2 :p :b
and the query pattern:
{ _:5 :p ?x }
I'm interpreting the above paragraph to intend the result to be:
{ { ?x -> :a }, { ?x -> :b } }
which would imply that _:1 === _:5 === _:2 or in other words _:1 ===
_:2 - which is clearly invalid.
On the other hand I could see the following query pattern:
{ ?_ :p ?x }
returning the above result. If what you want is anonymous variables
then you might as well use the same syntactic conventions as every
other language that supports pattern-matching and/or unification, and
simply use underscore. I have every intent of providing $_ for
Mulgara at some stage for precisely this purpose (and since we
already support [ ... ] patterns, it's probably only about a day's work).
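The two readings above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Mulgara's implementation: the match function is hypothetical, terms are plain strings, any term starting with "?" is a variable, and "?_" stands in for the proposed anonymous variable.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def is_var(term: str) -> bool:
    """Assumption for this sketch: variables are terms starting with '?'."""
    return term.startswith("?")


def match(pattern: Triple, graph: List[Triple]) -> List[Dict[str, str]]:
    """Unify a single triple pattern against every triple in the graph."""
    results = []
    for triple in graph:
        bindings: Dict[str, str] = {}
        ok = True
        for p, t in zip(pattern, triple):
            if p == "?_":
                # Anonymous variable: matches anything and is never bound.
                continue
            if is_var(p):
                # Named variable: bind it, or check an earlier binding.
                if bindings.setdefault(p, t) != t:
                    ok = False
                    break
            elif p != t:
                # Any other term, blank nodes included, must match exactly.
                ok = False
                break
        if ok:
            results.append(bindings)
    return results


graph = [("_:1", ":p", ":a"), ("_:2", ":p", ":b")]

# _:5 is not a node of this graph, so the pattern fails.
constant_result = match(("_:5", ":p", "?x"), graph)

# The anonymous variable returns both rows without equating _:1 and _:2.
anonymous_result = match(("?_", ":p", "?x"), graph)

# A told bnode: a name obtained from this graph matches only itself.
told_result = match(("_:1", ":p", "?x"), graph)
```

Under these assumptions only the anonymous-variable pattern yields { { ?x -> :a }, { ?x -> :b } }; the told bnode _:1 yields just { ?x -> :a }.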
> Ah - I see - the blank node need not be identified as a blank node but,
> as an existential, may be matched to a term in the graph or indeed
> (OWL/DL) with the existence of something with no materialised term.
>
> An OWL/DL system could create a term for that thing - it's just that they
> don't. It can be tricky in the case of co-reference but as the
> existentials are not distinguished that information is lost anyway.
I definitely don't understand you here. My understanding of a blank
node is as an existential, so how should it be identified '[not] as a
blank node but as an existential'? I'd also like to check that by
"indeed (OWL/DL) ... no materialised term", you are referring to
virtual and/or inferred graphs? Certainly Mulgara makes extensive
use of what I believe you refer to as 'non-materialised terms'; I
refer here to the XSD resolver in particular, as well as resolvers in
general. I really must get around to finishing the arithmetic
resolver, and I think I know how to implement a regular-expression
resolver as well.
Could you elaborate on the co-reference case? I'm curious what this
is referring to.
Andrae
--
Andrae Muys
andrae at netymon.com
Principal Mulgara Consultant
Netymon Pty Ltd