[Mulgara-general] Jena-Mulgara connector

Fri Jan 11 04:12:15 UTC 2008

On 04/01/2008, at 1:39 AM, Seaborne, Andy wrote:
> It's also "interesting" if you create a blank node by some small  
> integer and see what node in the server graph you have really named.
> id=1 seems to be rdf:type.

I wouldn't rely on that if I were you :).  There are various points
in the database bootstrap process where common URIs are preallocated
permanent ids, purely as an optimisation (believe it or not,
localizing rdf:type floated to the top of a profile for a given class
of query at one point).  It just happens that rdf:type is the first
common URI we preallocate, hence it currently gets '1'.  If that
order should change for some reason, rdf:type would get a different
small positive integer.
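
To illustrate why the id is an accident of ordering, here is a minimal
sketch in Python.  It is not Mulgara's actual node pool: the NodePool
class, its localize() method and the choice of rdfs:label as a second
preallocated URI are all assumptions made for the example.

```python
# Hypothetical sketch only; Mulgara's real node pool is not shown in this thread.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


class NodePool:
    """Maps URIs to small positive integers, in order of first use."""

    def __init__(self, preallocated):
        self._ids = {}
        self._next_id = 1
        # Bootstrap: common URIs get permanent ids before any user data.
        for uri in preallocated:
            self.localize(uri)

    def localize(self, uri):
        """Return the id for `uri`, allocating the next free id if needed."""
        if uri not in self._ids:
            self._ids[uri] = self._next_id
            self._next_id += 1
        return self._ids[uri]


# rdf:type gets 1 only because it is listed first.
pool_a = NodePool([RDF_TYPE, RDFS_LABEL])
# Reordering the bootstrap list changes the id it receives.
pool_b = NodePool([RDFS_LABEL, RDF_TYPE])
```

With the first ordering rdf:type is localized to 1, with the second to
2, which is why client code should not depend on the value.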

>> I always get around it by doing one of two things:
>> - Perform add/deletes as insert...select and delete...select  
>> commands.
>> - Wrap all my read/writes in a transaction.  That means I'm the only
>>  writer, so I know the ID can't change.  Unfortunately, once we get
>> multiple writers (some time in the future still) then this won't  
>> hold.
>> Also, if I'm using TQL then I have to perform insert...select and
>> delete...select operations still, as this is the only way to refer to
>> the blank nodes.

Probably worth noting here that even with multiple writers this will  
still hold.

>> Of course, if I'm doing insert...select and delete...select then I  
>> need
>> some guaranteed way of identifying the correct node.  IMO this is
>> almost always doable if you have knowledge of the schema you are  
>> using,
>> but there is no general mechanism.  The only case where you can't  
>> get a
>> particular blank node is if it's indistinguishable from another... in
>> which case it doesn't matter which one you get.  :-)
>
> It's the lack of general mechanism that's the issue here.  In  
> trying to write a general connector, as with writing any library  
> code, the hope is that the application assumptions don't have to  
> run all the way from top to bottom.

If the problem is reusing a blank node from a query in an insert, then
as long as you perform the query within the same write-phase you will
be fine.  Mulgara treats insert/delete as functions from graphs to
graphs, i.e.:  insert :: Triple -> Graph -> Graph.

In other words our current semantic is that the result of an update  
operation on a graph is a new graph, and as such blank-nodes from the  
pre-update graph are not necessarily the same as the blank-nodes on  
the post-update graph.
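
The "update produces a new graph" semantic can be sketched in a few
lines of Python.  This is only an illustration of the idea, assuming a
graph is modelled as an immutable set of (subject, predicate, object)
tuples; the names Triple, Graph, insert and delete are chosen for the
example and are not Mulgara APIs.

```python
from typing import FrozenSet, Tuple

# Assumed modelling: a triple is a 3-tuple of strings, a graph a frozenset.
Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]


def insert(triple: Triple, graph: Graph) -> Graph:
    """Return a new graph that also contains `triple`; `graph` is unchanged."""
    return graph | {triple}


def delete(triple: Triple, graph: Graph) -> Graph:
    """Return a new graph without `triple`; `graph` is unchanged."""
    return graph - {triple}


empty: Graph = frozenset()
t = ("ex:a", "ex:knows", "ex:b")
g1 = insert(t, empty)   # a new graph; `empty` still has no triples
g2 = delete(t, g1)      # another new graph; `g1` still contains `t`
```

Each call leaves its input untouched and hands back a separate graph
value, which is the sense in which pre-update and post-update graphs
are distinct objects.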

So a Mulgara instance consists of a sequence of 4-uniform hypergraphs,
defined as the graphs produced by a sequence of unification and
restriction operations starting with an empty graph, and we call the
ordinal number of a given graph in that sequence its phase.

This idea of graph immutability (operations don't change the graph,
but produce a new one) has formed the basis of our thinking regarding
blank-nodes and their interaction with transactions and queries.  As
a result your blank nodes remain valid for the duration of a
transaction.  For the moment that means holding the write-lock.
However, over the next few weeks we will be introducing the ability to
obtain read-only transactions, which will help alleviate some of your
blank-node problems.
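
The lifetime rule described above can be sketched as follows.  This is
a toy model in Python, not Mulgara code: the Store, BlankNode and
StalePhaseError names, and the idea of stamping each blank node with
the phase number that produced it, are assumptions made purely to
illustrate the rule.

```python
class StalePhaseError(Exception):
    """Raised when a blank node is used outside the phase that produced it."""


class BlankNode:
    """A blank-node handle stamped with the phase in which it was obtained."""

    def __init__(self, phase, local_id):
        self.phase = phase
        self.local_id = local_id


class Store:
    """Toy store: one writer, and each commit starts a new phase."""

    def __init__(self):
        self.phase = 0
        self.triples = set()
        self._next_blank = 1

    def new_blank(self):
        """Hand out a blank node that is valid only in the current phase."""
        node = BlankNode(self.phase, self._next_blank)
        self._next_blank += 1
        return node

    def insert(self, s, p, o):
        """Add a triple, rejecting blank nodes that come from another phase."""
        for term in (s, p, o):
            if isinstance(term, BlankNode) and term.phase != self.phase:
                raise StalePhaseError(
                    "blank node from phase %d used in phase %d"
                    % (term.phase, self.phase)
                )
        self.triples.add((s, p, o))

    def commit(self):
        """End the current transaction; later operations see a new phase."""
        self.phase += 1
```

In this model a blank node obtained and reused before commit() works,
while reusing the same handle after commit() raises StalePhaseError,
mirroring the advice to keep the query and the insert inside one
transaction.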

> My canonical use case is an RDF (or OWL) editor.  These tend to  
> regard the graph as a syntactic entity, especially when it is  
> between consistent states.  And those RDF collections are always  
> there to keep us on our toes. Viewing the graph syntactically is  
> just the same as thinking of being inside the graph, not querying  
> from the outside.
>
> This use case isn't the primary reason for wanting the connector  
> so, above all, documentation for correct use is needed.

For now if you want to perform update operations you have to hold the  
write-lock.  Shortly you will be able to round-trip blank-nodes while  
browsing the graph within a single read-only transaction.  For this  
to be transparent requires us to eliminate the write-lock, i.e.
providing support for multiple writers.  This is planned but isn't
feasible in the short-term.

>>> One issue I encountered: Session.modelExists() for
>>> RemoteSessionWrapperSession throws an exception if the model has  
>>> never
>>> existed but returns false if the model used to exist but has been
>>> dropped.
>>
>> This is a bug.  I thought I'd put the fix into the trunk, but I must
>> have missed it.
>
> I'll check the trunk - I have been using 1.1.1 distribution where  
> possible.

It should be fixed in trunk.  I know it's fixed in the various
devel-branches, so it should have been merged across by now.

> Please do not tell Pat Hayes about the skolemization of the blank  
> nodes :-)

He says to a publicly archived mailing list ;)

Andrae

-- 
Andrae Muys
andrae at netymon.com
Senior RDF/SemanticWeb Consultant
Netymon Pty Ltd