[Mulgara-dev] Blank Node Assignment in Inserts

Fri Feb 29 04:52:58 UTC 2008

On 29/02/2008, at 2:11 PM, Life is hard, and then you die wrote:

> On Thu, Feb 28, 2008 at 02:54:41PM -0500, Alex Hall wrote:
>> Paul Gearon wrote:
>>> <snip description="Bug #81" href="http://mulgara.org/trac/ticket/81"/>
> [snip]
>>> I'm inclined to fix it, but I'm interested in opinions here.
>>
>> I agree.  I will have potentially long-running transactions with
>> multiple insertions, and would prefer not to have to guarantee the
>> uniqueness of my variable names across the entire transaction.
>
> Agreed. We've currently got a hack where we have a counter associated
> with each transaction to ensure this uniqueness, and I wouldn't mind
> getting rid of it.

Ok, then it gets fixed.  Anyone who finds the current inadvertent  
behaviour useful should submit a feature request for us to find an  
alternative way of achieving the effect. Certainly any ability to  
unify blank-nodes across inserts in a single transaction should also  
allow us to use them in queries - so the current behaviour is only  
half a solution in any regard.

>> More than anything else, I think it is important to come up with a
>> consistent approach to handling the mapping of blank node labels to
>> internal node ID's throughout the software.  It just seems kind of silly
>> to maintain these blank node mappings in half a dozen different places.
>> What this approach should be, I'm not sure, but I'm interested in more
>> opinions as well.
>
> Yes, a comprehensive solution would be nice, though I think it would
> basically involve assigning id's of some sort to the blank-nodes. We
> try to avoid blank-nodes as far as possible for this reason... (we
> only use them for the list nodes in rdf:List and rdf:Seq).
>
> So, I think going back to the old behaviour would be good, and at some
> point figure out how to do identification of blank-nodes across
> multiple operations including inserts, deletes, and queries.

svn annotate identifies this code as very recent; it was added in
revision 278 as part of the distributed resolver. The code in 277
simply allocated a new nodeid and left any caching or blank-node
equivalence to the caller (i.e. the blank-node map in the
ContentHandler that Alex noted).
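
To illustrate the division of labour in 277, here is a rough sketch of
what I mean. The names NodeAllocator and InsertBlankNodeMap are made up
for this example and are not the real Mulgara classes; the assumption is
only that the session hands out a fresh node id on every request, and
that the caller keeps a label map whose lifetime is a single insert.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the session: every call hands out a fresh node id. */
class NodeAllocator {
    private final AtomicLong next = new AtomicLong(1);

    long newBlankNode() {
        return next.getAndIncrement();
    }
}

/** Hypothetical caller-side map, created once per insert operation. */
class InsertBlankNodeMap {
    private final NodeAllocator allocator;
    private final Map<String, Long> labelToId = new HashMap<>();

    InsertBlankNodeMap(NodeAllocator allocator) {
        this.allocator = allocator;
    }

    /** The same label within this insert gives the same id; unseen labels get a fresh one. */
    long resolve(String label) {
        return labelToId.computeIfAbsent(label, l -> allocator.newBlankNode());
    }
}
```

With one InsertBlankNodeMap per insert, a label reused in a later insert
in the same transaction resolves to a different node, so nobody has to
keep variable names unique across the whole transaction.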

I suspect that this code was added to handle  
org.mulgara.resolver.distributed.ForeignBlankNode's, and that  
VariableBlankNode's were caught by accident. BlankNodeImpl's are  
handled specifically by the StringPoolSession as they are our  
internal global representation, and so are a special case; but I  
don't believe we should be handling ForeignBlankNodes there, and  
would be interested to know if there is any reason why they couldn't  
be handled in DistributedResolver itself?

The consistent approach Alex asks for is currently that we represent  
blank-nodes internally as BlankNodeImpl's, and any other subclass of  
BlankNode that enters Mulgara should be mapped to one at its point of  
entry.  If we need to revisit this we will, but I am not yet  
convinced this approach has proven insufficient.
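
As a sketch only of the "map at the point of entry" idea: the types
below (SketchBlankNode, InternalNode, ExternalNode, EntryPointMapper)
are invented for illustration and do not correspond to the actual
Mulgara or JRDF classes. InternalNode plays the role of BlankNodeImpl
and ExternalNode plays the role of any other BlankNode subclass. I am
assuming incoming nodes are matched by object identity, which may not
suit every kind of foreign node.

```java
import java.util.IdentityHashMap;
import java.util.Map;

/** Hypothetical marker type; not the real BlankNode interface. */
interface SketchBlankNode {}

/** Stand-in for the internal representation (BlankNodeImpl in this message). */
final class InternalNode implements SketchBlankNode {
    final long id;

    InternalNode(long id) {
        this.id = id;
    }
}

/** Stand-in for any externally supplied blank node, e.g. one from another server. */
final class ExternalNode implements SketchBlankNode {
    final String origin;

    ExternalNode(String origin) {
        this.origin = origin;
    }
}

/** Converts incoming blank nodes at one entry point; one instance per operation. */
class EntryPointMapper {
    // Keyed by object identity (an assumption of this sketch).
    private final Map<SketchBlankNode, InternalNode> seen = new IdentityHashMap<>();
    private long nextId = 1;

    InternalNode toInternal(SketchBlankNode node) {
        if (node instanceof InternalNode) {
            return (InternalNode) node; // already in internal form, pass through
        }
        return seen.computeIfAbsent(node, n -> new InternalNode(nextId++));
    }
}
```

Everything past the mapper then only ever sees the internal form, which
is the property I am arguing we already rely on.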

Andrae

-- 
Andrae Muys
andrae at netymon.com
Senior RDF/SemanticWeb Consultant
Netymon Pty Ltd