[Mulgara-dev] No such variable

Fri May 2 10:36:10 UTC 2008

On 02/05/2008, at 2:41 AM, Paul Gearon wrote:
> Moving this back to the mailing list, as per Amit's request (don't
> worry people, we didn't discuss much that wasn't already in public).
>
> OK, I've had some sleep. Let's see if I can write something more
> comprehensible.  :-)
>
> On Thu, May 1, 2008 at 2:06 AM, Andrae Muys <andrae at netymon.com> wrote:
>>  The answer of course is, of course it 'works' - but it might not be
>> sufficient to let you get a full sparql implementation out.
>
> Actually, SPARQL is all about just making it "work".  :-)  It's not
> like there's a strong theoretical underpinning to it. So if it works,
> then I'm all for it.
>
> I'm more concerned about this disjunction issue. It's the same bug as
> what affects OPTIONAL, but it is a more fundamental operation. I know
> that Simon agonized about the correct behaviour of incompatible
> unions, and I think others (like yourself?) also had discussions on
> it. I can't say that allowing incompatible joins to occur was
> necessarily a great idea, but I was certainly grateful for this
> decision when it came to SPARQL. :-)

Actually we only agonized over disjunctions until Simon managed to
identify the isomorphism with first-order logic (FoL).  At that point it
ceased to be a problem, because the correct behaviour became obvious and
indisputable.  The isomorphism makes two things clear:

1. There is no such thing as 'union-incompatible' tuples, so the term
became purely a way of describing the phenomenon.

2. UNBOUND is not a value, which immediately solved the type-theoretic
problems that NUC-Disjunction poses.
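
To make the two points concrete, here is a minimal sketch of a
disjunction over operands with different variables.  It is only an
illustration: UNBOUND, disjunction() and the dict-based rows are
hypothetical names chosen for this example, not Mulgara's API.  The
point is that the operands need not be 'union-compatible'; each row is
simply padded with a marker for the variables it does not constrain.

```python
# Hypothetical sketch; none of these names come from Mulgara itself.

class _Unbound:
    """Singleton marker meaning 'this variable is unconstrained in this row'."""
    def __repr__(self):
        return "UNBOUND"

UNBOUND = _Unbound()

def disjunction(left_vars, left_rows, right_vars, right_rows):
    """Union two row sets whose variable lists may differ.

    Each row is a dict from variable name to binding.  Variables that an
    operand does not mention are padded with the UNBOUND marker.
    """
    out_vars = list(left_vars) + [v for v in right_vars if v not in left_vars]
    out_rows = [
        {v: row.get(v, UNBOUND) for v in out_vars}
        for row in list(left_rows) + list(right_rows)
    ]
    return out_vars, out_rows

# P(x) OR Q(y): the operands share no variables at all.
out_vars, out_rows = disjunction(
    ["x"], [{"x": "alice"}],
    ["y"], [{"y": "bob"}],
)
```

In the first result row x is bound and y carries the marker; in the
second it is the other way round.  Nothing in the sketch ever compares
the marker to a value, which is the sense in which it is not one.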

Of course SPARQL came along and screwed up big time by defining
UNBOUND as a value.  I'm pretty certain this cannot be made to work
cleanly, if it can be made to work at all.  It is one of the biggest
screw-ups in a standard that is rapidly becoming renowned as screwed up.

>>  If that is the case then we need to think about how we handle such
>> things - I'm pretty sure there is a work around, but I'm going to need
>> a better idea of what the precise problem is first.  I suspect if
>> necessary we can get around the problem by preallocating a special
>> value to represent SPARQL-UNBOUND and manage it explicitly, but to be
>> sure I'll need to double check.  From my understanding the root of the
>> problem is that SPARQL decided to define a new 'value' called UNBOUND
>> whereas Mulgara took the sensible approach and defined a 'marker'
>> called UNBOUND.  Hence my suspicion that we need to create a value to
>> handle this.
>>
>>  What you're not going to be able to convince me to support is
>> converting Mulgara to having UNBOUND-value semantics.
>
> That's OK, after reading your post, I've been thinking it over, and I
> agree with you now.  :-)  I suppose I should have been thinking more
> along these lines, as I was very grateful recently when I realized
> that the distributive properties permitted in our grammar permitted me
> to do a lot of fancy things to make named graphs work.
>
> Now, you may disagree with me here, but I believe we're going to be
> safe if we handle this issue in the projection operation. My reasoning
> here is because the projection is completely independent of the
> algebra.
>
> Now a projection is currently a subset of the variables in the result.
> From the perspective of a projection, we may be able to consider the
> Tuples as an infinite set of variables with a limited number of known
> bindings (since it's RDF, there are infinite bindings, but we only
> know about a finite number of them). For any particular binding
> context, some of the variables are set to values, and others are set
> to UNBOUND (returned from Answer as "null"). Because the unknown
> bindings are infinite, then I believe it is valid to consider all
> variables, since they could be bound at some point - we just don't
> know about it in our answer. A concrete example is in James's query
> where knowing about some data binds certain variables, but a smaller
> set of data fails to bind those variables.

As we discussed offline, if you are talking about Tuples this is a
problem, because infinite sets cause trouble for the algebra.  You will
have difficulty unifying the concept of an UNBOUND marker with the range
of these infinite sets of bindings.  Implementing infinite binding sets
like this is also likely to be difficult.

> At the moment, we only permit projections to select from those
> variables that get bound at some point, and we throw an exception if
> we try to select variables that are never bound. We do it this way,
> because we ask the tuples that we are projecting for its variables,
> and it only knows about the variables that get bound at some point.
> However, we can consider this tuples to just be a subset of the entire
> space. The projection can then "virtually" fill in the rest of this
> space for us, i.e. when an unknown column is requested the projection
> returns UNBOUND.
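
For concreteness, the two projection behaviours described in the quoted
paragraph can be sketched roughly as follows.  This is an illustration
only: NoSuchVariable, project_strict() and project_lenient() are
hypothetical names, rows are plain dicts, and None stands in for the
"null" that an Answer reports for UNBOUND.  The sketch works on final
result rows and says nothing about how the algebra itself should behave.

```python
# Hypothetical sketch; these names are not Mulgara's API.

UNBOUND = None  # stand-in for the marker, reported to callers as null

class NoSuchVariable(Exception):
    """Raised when a projection selects a variable that is never bound."""

def project_strict(variables, rows, selected):
    """Described current behaviour: refuse variables the rows do not know."""
    missing = [v for v in selected if v not in variables]
    if missing:
        raise NoSuchVariable(", ".join(missing))
    return [{v: row[v] for v in selected} for row in rows]

def project_lenient(variables, rows, selected):
    """Proposed behaviour: an unknown column is filled in as UNBOUND."""
    return [{v: row.get(v, UNBOUND) for v in selected} for row in rows]

variables = ["s", "name"]
rows = [{"s": "ex:a", "name": "Alice"}]

# Selecting a column that was never bound.
lenient = project_lenient(variables, rows, ["name", "email"])

try:
    project_strict(variables, rows, ["name", "email"])
    strict_raised = False
except NoSuchVariable:
    strict_raised = True
```

The lenient variant never fails on an unknown column; whether that
padding belongs in the projection or only at the API boundary is the
question under discussion.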

We already draw a distinction between Answers and Tuples.  Tuples are
objects within an algebra.  Answers are globalized Tuples, but are
primarily an API concern.  Allowing Answers to have DONT_CARE columns
is all right; introducing them into Tuples, though, will cause problems.
Be aware that the semantics of UNBOUND in Mulgara are DONT_CARE.  It is
not a value, it is not a NULL, it is simply a marker indicating that,
for this product, this variable binding can take on any value without
restriction and without affecting the truth value of the Tuples'
predicate.  Hence the use of the term 'unconstrained' for a Tuples
with no variables and one row.
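
A rough sketch of what DONT_CARE means for a join, under simplifying
assumptions: rows are plain dicts, and UNBOUND, compatible(), join() and
UNCONSTRAINED are hypothetical names for this example rather than
Mulgara's classes.  The marker never takes part in an equality test; it
only says that the row places no restriction on that variable.

```python
# Hypothetical sketch; none of these names come from Mulgara itself.

class _Unbound:
    """Singleton marker: the row does not constrain this variable."""
    def __repr__(self):
        return "UNBOUND"

UNBOUND = _Unbound()

def compatible(a, b):
    """Two bindings agree if either side is unconstrained or they are equal."""
    return a is UNBOUND or b is UNBOUND or a == b

def join(left_rows, right_rows):
    """Conjunction of two row sets; each row maps variable name to binding."""
    result = []
    for l in left_rows:
        for r in right_rows:
            shared = l.keys() & r.keys()
            if all(compatible(l[v], r[v]) for v in shared):
                row = {**l, **r}
                for v in shared:
                    # Keep whichever side actually constrains the variable.
                    row[v] = r[v] if l[v] is UNBOUND else l[v]
                result.append(row)
    return result

UNCONSTRAINED = [{}]  # no variables, one row: restricts nothing
EMPTY = []            # no rows: nothing satisfies it

rows = [{"x": 1, "y": UNBOUND}]
same = join(UNCONSTRAINED, rows)    # joining with 'unconstrained' changes nothing
narrowed = join(rows, [{"y": 2}])   # the marker gives way to a real binding
conflict = join([{"x": 1}], [{"x": 2}])
```

Here the one-row, no-variable Tuples acts as the identity for the join,
which is why 'unconstrained' is a fitting name for it.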

Andrae

-- 
Andrae Muys
andrae at netymon.com
Senior RDF/Semantic-Web Consultant
Netymon Pty Ltd