[Mulgara-dev] No such variable

Paul Gearon gearon at ieee.org
Thu May 1 16:41:33 UTC 2008


Moving this back to the mailing list, as per Amit's request (don't
worry people, we didn't discuss much that wasn't already in public).

OK, I've had some sleep. Let's see if I can write something more
comprehensible.  :-)

On Thu, May 1, 2008 at 2:06 AM, Andrae Muys <andrae at netymon.com> wrote:
>  The answer of course is, of course it 'works' - but it might not be
> sufficient to let you get a full sparql implementation out.

Actually, SPARQL is all about just making it "work".  :-)  It's not
like there's a strong theoretical underpinning to it. So if it works,
then I'm all for it.

I'm more concerned about this disjunction issue. It's the same bug
that affects OPTIONAL, but disjunction is a more fundamental operation. I know
that Simon agonized about the correct behaviour of incompatible
unions, and I think others (like yourself?) also had discussions on
it. I can't say that allowing incompatible joins to occur was
necessarily a great idea, but I was certainly grateful for this
decision when it came to SPARQL. :-)

>  If that is the case then we need to think about how we handle such things -
> I'm pretty sure there is a work around, but I'm going to need a better idea
> of what the precise problem is first.  I suspect if necessary we can get
> around the problem by preallocating a special value to represent
> SPARQL-UNBOUND and manage it explicitly, but to be sure I'll need to double
> check.  From my understanding the root of the problem is that SPARQL decided
> to define a new 'value' called UNBOUND whereas Mulgara took the sensible
> approach and defined a 'marker' called UNBOUND.  Hence my suspicion that we
> need to create a value to handle this.
>
>  What you're not going to be able to convince me to support is converting
> Mulgara to having UNBOUND-value semantics.

That's OK. After reading your post I've been thinking it over, and I
agree with you now.  :-)  I suppose I should have been thinking more
along these lines, as I was very grateful recently when I realized
that the distributive properties of our grammar let me do a lot of
fancy things to make named graphs work.

Now, you may disagree with me here, but I believe we're going to be
safe if we handle this issue in the projection operation. My reasoning
is that the projection is completely independent of the algebra.

Now a projection is currently a subset of the variables in the result.
From the perspective of a projection, we may be able to consider the
Tuples as an infinite set of variables with a limited number of known
bindings (since it's RDF, there are infinite bindings, but we only
know about a finite number of them). For any particular binding
context, some of the variables are set to values, and others are set
to UNBOUND (returned from Answer as "null"). Because the unknown
bindings are infinite, I believe it is valid to consider all
variables, since they could be bound at some point - we just don't
know about it in our answer. A concrete example is in James's query
where knowing about some data binds certain variables, but a smaller
set of data fails to bind those variables.

At the moment, we only permit projections to select from those
variables that get bound at some point, and we throw an exception if
we try to select variables that are never bound. We do it this way
because we ask the Tuples that we are projecting for its variables,
and it only knows about the variables that get bound at some point.
However, we can consider this Tuples to be just a subset of the entire
space. The projection can then "virtually" fill in the rest of this
space for us, i.e. when an unknown column is requested, the projection
returns UNBOUND.
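
To make that concrete, here is a minimal sketch in Java. It is not
Mulgara's actual code: the class name VirtualProjection, the project()
method, and the use of one Map per row as a stand-in for a row of a
Tuples are all assumptions made up for illustration. The only thing it
takes from the description above is that a column the Tuples doesn't
know about comes back as UNBOUND (null) rather than raising an
exception.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not part of the Mulgara code base. */
final class VirtualProjection {

  /** UNBOUND is a marker, not a value. It surfaces as null, as Answer does. */
  static final Object UNBOUND = null;

  /**
   * Projects every row onto the selected variables. Each row is a map from
   * variable name to value (an assumed stand-in for a row of a Tuples).
   * A variable that a row has no entry for is reported as UNBOUND instead
   * of causing an exception.
   */
  static List<List<Object>> project(List<Map<String, Object>> rows,
                                    List<String> select) {
    List<List<Object>> result = new ArrayList<>();
    for (Map<String, Object> row : rows) {
      List<Object> projected = new ArrayList<>();
      for (String var : select) {
        // "Virtually" fill in the rest of the variable space:
        // a missing column simply reads as UNBOUND.
        projected.add(row.containsKey(var) ? row.get(var) : UNBOUND);
      }
      result.add(projected);
    }
    return result;
  }
}
```

With a single row that binds only x to "a", selecting x and y gives
one row holding "a" and null.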

We can also give ourselves a little more security in projecting these
unbound variables by making sure that the projection is only done on
variables in the original constraint expression (since this gets all
the variables from sub-expressions). If we're considering an infinite
set of variables, then this check isn't needed, but from a pragmatic
perspective it is a little safer.
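
A rough sketch of that check, again in Java and again with invented
names: Expr, leaf(), op(), variables() and check() are hypothetical
and do not correspond to Mulgara's constraint classes. It assumes a
constraint expression is a tree whose leaves mention variables, gathers
the variables from every sub-expression, and rejects a select variable
that appears nowhere in that tree.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration only; not part of the Mulgara code base. */
final class SelectCheck {

  /** Assumed constraint node: a leaf pattern, or an operation over operands. */
  static final class Expr {
    final List<String> vars;     // variables mentioned directly by a leaf
    final List<Expr> children;   // operands of a join, union, optional, ...

    Expr(List<String> vars, List<Expr> children) {
      this.vars = vars;
      this.children = children;
    }

    static Expr leaf(String... vars) {
      return new Expr(Arrays.asList(vars), Collections.<Expr>emptyList());
    }

    static Expr op(Expr... children) {
      return new Expr(Collections.<String>emptyList(), Arrays.asList(children));
    }
  }

  /** Collects the variables of an expression and all of its sub-expressions. */
  static Set<String> variables(Expr e) {
    Set<String> all = new LinkedHashSet<>(e.vars);
    for (Expr child : e.children) {
      all.addAll(variables(child));
    }
    return all;
  }

  /** Rejects any selected variable that the constraint expression never mentions. */
  static void check(Expr where, List<String> select) {
    Set<String> known = variables(where);
    for (String var : select) {
      if (!known.contains(var)) {
        throw new IllegalArgumentException(
            "Variable not in constraint expression: " + var);
      }
    }
  }
}
```

The point of the design is that a variable which is mentioned but
never bound still passes the check and can be projected as UNBOUND,
while a variable that is mentioned nowhere (most likely a typo) is
still caught.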

So does this make sense Andrae? (and anyone else reading)

Regards,
Paul

P.S. Sorry if you got spam from me yesterday. A friend I haven't seen
in years asked me to catch up through a social networking site, and I
stupidly let it "check" for friends from an address book. It
immediately sent spam to every mailing list, tech support channel and
business contact I've ever had. Very embarrassing, and I offer my most
humble apologies. (I also scrapped the account with the site).


