[Mulgara-dev] Round-tripping Blank-nodes.

Andrae Muys andrae at netymon.com
Fri Mar 28 04:38:13 UTC 2008


On 28/03/2008, at 2:17 PM, Paul Gearon wrote:
>
>
> On Thu, Mar 27, 2008 at 10:51 PM, Andrae Muys <andrae at netymon.com>  
> wrote:
> On the other hand this does beg the question, how do you 'refer' to a
> blank-node?  Blank-node-ids aren't scoped globally, but instead to
> the graph - and as such do lack a global identifier.  It would
> however be consistent with my understanding of RDF semantics if we
> combine a graph-name (in our case a multi-graph-name[1]) with a blank-
> node-id to produce a global identifier.
>
> For the sake of utility, I imagine that standard vocabulary would  
> be desirable (eg _:123) but this will break down the moment we talk  
> to multiple servers in the same query, since each server is free to  
> allocate its own version of _:123.  Do we want to allow the simple  
> form when talking with a single server?  Then we're using different  
> identifiers in different circumstances.

The standard bnode syntax is only feasible when the graph is
implicit.  Consequently, if we are to support round-tripping, it
cannot be via the standard syntax - there is no implicit graph in
this case.  So we would have to come up with our own syntax for blank
nodes that includes an explicit graph-name.  Moreover, as I discuss
above, this must be an explicit name for an immutable graph - in REST
terms, this is a reference to a concrete resource, as opposed to a
conceptual resource[0].
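
To make the idea concrete, here is a minimal sketch of a graph-scoped
blank-node reference.  Everything in it is an assumption made for
illustration: the class name BlankNodeRef, the "_:<graph>#id" surface
syntax, and the example graph names are all hypothetical and are not
existing Mulgara API or syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlankNodeRef:
    """Hypothetical globally usable reference to a blank node."""

    graph: str     # explicit name of an immutable graph (a concrete resource)
    local_id: str  # blank-node id, meaningful only inside that graph

    def __str__(self) -> str:
        # Assumed surface syntax, invented here for illustration only.
        return f"_:<{self.graph}>#{self.local_id}"


# Two servers are each free to allocate their own "123"; the graph name
# is what keeps the two references distinct.
ref_a = BlankNodeRef("http://server-a.example/data@2008-03-28T04:38:13Z", "123")
ref_b = BlankNodeRef("http://server-b.example/data@2008-03-28T04:38:13Z", "123")
same_as_a = BlankNodeRef("http://server-a.example/data@2008-03-28T04:38:13Z", "123")
```

Here ref_a and ref_b compare unequal even though both carry the local
id 123, while same_as_a compares equal to ref_a because both the graph
name and the local id match.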

> The problems become more complex with multiple graphs on the same  
> server.  Theoretically, it is possible for the same blank node to  
> be present in more than one graph, but the semantics of queries  
> explicitly makes it impossible to select a blank node from one  
> place and insert it somewhere else.  This means that an insert/ 
> select query MUST change blank node IDs if it goes into another  
> graph.... but SHOULDN'T change them if they're going into the same  
> graph.  In reality, many systems allow the former, and the latter  
> is next to impossible if the former is attempted.

As I mentioned, MUST is far too strong a term for this - SHOULD is  
the strongest you can legally use here, and MAY is probably more  
accurate. RDF Semantics are perfectly happy with a coordinated set of  
graphs sharing blank-nodes, and therefore we are perfectly entitled  
to provide a reference to them that satisfies blank-node semantics  
(note: *blank-node* semantics, *not* blank-node *id*s).  On the  
other hand, I should note here that this only applies when we are  
talking about a blank-node reference returned from a query result -  
it does *NOT* apply as soon as the blank-node has been serialized as  
RDF/XML, N3, or similar.  At that point it's been inserted into a  
new, non-coordinated, graph and the id is then scoped strictly to  
that graph.

> To imagine the difficulty, think of doing a query from the union of  
> graphs A and B, and insert the results into B.  Blank nodes from B  
> shouldn't change, but blank nodes from A are supposed to change  
> before being written.

This isn't actually required by the spec.
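
For reference, the renaming behaviour described in the quoted text
(blank nodes from B kept, blank nodes from A given fresh ids before
being written to B) can be sketched as below.  This is only an
illustration of that one policy, not something the spec mandates and
not Mulgara code.  The tuple encoding of blank nodes, the function
names, and the assumption that the caller supplies ids unused in the
target graph are all hypothetical.

```python
from itertools import count


def is_bnode(term):
    # Assumed encoding: a blank node is ("bnode", graph_name, local_id).
    return isinstance(term, tuple) and len(term) == 3 and term[0] == "bnode"


def localise(triples, target, fresh_ids):
    """Rewrite blank nodes foreign to `target` as fresh blank nodes in it.

    `fresh_ids` is an iterator of ids assumed to be unused in `target`.
    Each foreign blank node is mapped once, so repeated occurrences of
    the same node stay connected after the rewrite.
    """
    mapping = {}
    out = []
    for triple in triples:
        rewritten = []
        for term in triple:
            if is_bnode(term) and term[1] != target:
                if term not in mapping:
                    mapping[term] = ("bnode", target, next(fresh_ids))
                term = mapping[term]
            rewritten.append(term)
        out.append(tuple(rewritten))
    return out


# Results of a query over the union of graphs A and B, to be inserted into B.
results = [
    (("bnode", "A", "1"), "ex:knows", ("bnode", "B", "1")),
    (("bnode", "A", "1"), "ex:name", '"Alice"'),
]
localised = localise(results, "B", (f"gen{i}" for i in count()))
```

In this sketch the node from B passes through untouched, and both
occurrences of the node from A map to the same fresh node in B.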

Andrae

[0] Think "The file at timestamp", as opposed to "The file at now".

-- 
Andrae Muys
andrae at netymon.com
Senior RDF/SemanticWeb Consultant
Netymon Pty Ltd




