[Mulgara-dev] literal in gn2spoCache but cannot be found in backup file

Paul Gearon gearon at ieee.org
Thu Jan 24 20:46:47 UTC 2008


On Jan 23, 2008, at 1:47 PM, Ben Hysell wrote:

<snip/>

> For example, I’m searching for <uri1> <uri2> $object  in the good  
> Mulgara server I would get returned:
> <uri1> <uri2> literal
>
> In the bad Mulgara server I would get returned:
> <uri1> <uri2> _node#####
>
> I opened the backup file produced by the good Mulgara server and  
> found the triple that described my query at the bottom, found the  
> corresponding string pool entries for <uri1> and <uri2>, but was  
> unable to find the string pool entry for my literal.

OK, so this means that the backup process is corrupted somehow.


>   Concerned, I started poking around the query logic to discern why  
> Mulgara could return a literal on a query in the good Mulgara, but  
> unable to back it up.
>
> On line 1831 of XAStringPoolImpl.java:
>
> if (GN2SPO_CACHE_ENABLED) {
>    spObject = gn2spoCache.get(gNodeL);
> }
>
> Calling the gn2spoCache with the node number of my literal returns  
> the literal to the query performed in the webUI on the good Mulgara  
> server.
>
> However, if I query $subject $predicate literal on line 555 of  
> StringPoolSession.java in function localizeSPObject
>
> Long localNode = persistentStringPool.findGNode(relativeSPObject);
>
> Sets localNode = 0.  As the function progresses and checks  
> temporaryStringPool it also cannot find the node in there.

The cache that you are checking is a map from gNode to the  
corresponding SPObject, but the function that is failing you is trying  
to map an SPObject back to a gNode.

Incidentally, the mapping from SPObject to gNode is done via a tree,  
and the entries are removed if the object is removed.  However, the  
mapping from gNode back to SPObject is done with an array.  When an  
object is deleted, then the entry in the array is still there (I  
think).  So you can still map a gNode back to an SPObject in this  
case.  This is supposed to be safe, as nothing should be removed from  
the String Pool unless every usage of the gNode has also been removed.

When backing up, we iterate over the tree, so that is possibly where  
the corruption lies.  The same tree is used for localizing, which is  
why findGNode() is failing for you as well.
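To make the asymmetry between the two mappings concrete, here is a toy sketch in Java.  It is not Mulgara code: the class ToyStringPool and all of its methods are hypothetical, a TreeMap stands in for the on-disk tree, a List stands in for the array, and SPObjects are reduced to plain strings.  It also assumes that a delete leaves the array slot untouched, which is only my recollection of the real behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model only: none of these names come from Mulgara.
public class ToyStringPool {
    // SPObject -> gNode, standing in for the tree (used for localizing and backup).
    private final TreeMap<String, Long> tree = new TreeMap<>();
    // gNode -> SPObject, standing in for the array (used for globalizing).
    private final List<String> array = new ArrayList<>();

    public ToyStringPool() {
        array.add(null); // slot 0 is unused so that gNode 0 can mean "not found"
    }

    // Insert into both structures and return the new gNode.
    public long put(String value) {
        long gNode = array.size();
        array.add(value);
        tree.put(value, gNode);
        return gNode;
    }

    // Assumption: removal touches the tree only; the array slot goes stale.
    public void remove(String value) {
        tree.remove(value);
    }

    // Localize: SPObject -> gNode through the tree, 0 when absent.
    public long findGNode(String value) {
        Long gNode = tree.get(value);
        return gNode == null ? 0L : gNode;
    }

    // Globalize: gNode -> SPObject through the array, null when out of range.
    public String findSPObject(long gNode) {
        return (gNode > 0 && gNode < array.size()) ? array.get((int) gNode) : null;
    }

    // Backup walks the tree, so it only sees what the tree still holds.
    public List<String> backup() {
        return new ArrayList<>(tree.keySet());
    }

    public static void main(String[] args) {
        ToyStringPool pool = new ToyStringPool();
        long gNode = pool.put("literal");
        pool.remove("literal");
        System.out.println(pool.findSPObject(gNode));            // literal
        System.out.println(pool.findGNode("literal"));           // 0
        System.out.println(pool.backup().contains("literal"));   // false
    }
}
```

Once an entry is missing from the tree, this toy shows the same symptoms as the report: the gNode still globalizes to the literal, localizing the literal gives 0, and the backup does not contain it.  The sketch says nothing about how the entry went missing from the real tree in the first place.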

>   The function finishes by creating a node in the temporary string  
> pool. When we do arrive at ConstraintImpl on line 95 of  
> ConstraintImpl.java the ConstraintElement e2 has a value of -1.
>
> So to circle back, and please correct me if I am wrong in my  
> conclusions:
> 1. I can back up the good Mulgara server, but when I restore  
> it the new Mulgara server has lots of blank nodes.
> 2. It appears during the backup operation on the good Mulgara  
> server the call on line 179 of BackupOperation.java:  
> Tuples t = stringPool.findGNodes(null, null);  
> truly does not have my literal in the string pool.
> 3. My literal is still in the system if I query for it and  
> pull the node from the gn2spoCache
>
> Any thoughts or ideas on how we might fix this issue?  My thought  
> was to modify the backup code and pull data from the gn2spoCache and  
> dump the string pool from here, is this possible?

No, because the cache is just that.... a cache.  It won't contain  
everything.

The wrong way to fix this is to iterate over the array (the  
globalizing function) up to the latest gNode value for the Node Pool,  
and to store anything that is not a blank node.  Apart from storing  
things that are not part of the latest committed phase (and also some  
new things that haven't been committed yet), this completely ignores  
the fact that something has gone wrong with the String Pool.  If it is  
not possible to localize an SPObject that exists, then we need to  
discover why.

Here are the possibilities that I can think of:
a) The SPObject was never inserted into the tree, though it was put  
into the array.
b) The SPObject was deleted from the tree when it was still in use in  
the triples.
c) There is some kind of phase mismatch, where the tree being backed  
up is not matching the triple indexes.
d) The SPObject *is* in the tree, but somehow is not being read.

Any one of these is worrisome.  I suspect (d), for reasons I'll get  
into soon.  Hopefully it is, as this is the easiest to fix.  :-)

Since you're debugging this, can I ask you to insert a statement with  
one of these missing literals, and to set a breakpoint on the  
localization of the literal please?  You can set it higher up in the  
call stack, but ultimately it should end up in  
XAStringPoolImpl.findGNode().  At about line 1753 you should see the  
following lines:
         // Find the SPObject.
         findResult = avlFilePhase.find(objectPool, avlComparator, null);

findResult is an array.  Does it come back null, or with length 1 or  
2?  If it's null, then there's a big problem.  If it's of length 2,  
then the data isn't being found in the tree.  If it's of length 1,  
then it's being found.  I'm guessing you'll get a length of 2.
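If it helps while stepping through, the three cases can be written down as a small helper.  This is only a sketch for reading the debugger output: the class FindResultCheck and the method describe() are hypothetical, and the parameter is declared as Object[] because only null-ness and length matter here, not the real element type.

```java
// Hypothetical debugging aid, not part of Mulgara.
public class FindResultCheck {

    // Interprets the array returned by the tree lookup, per the cases above.
    static String describe(Object[] findResult) {
        if (findResult == null) {
            return "null: big problem";
        }
        switch (findResult.length) {
            case 1:
                return "length 1: found in the tree";
            case 2:
                return "length 2: not found in the tree";
            default:
                return "unexpected length " + findResult.length;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(null));
        System.out.println(describe(new Object[1]));
        System.out.println(describe(new Object[2]));
    }
}
```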

Presuming you're getting a length of 2 (data not found), then this  
might be because of a problem with type mismatches for literals  
(leading to my speculation of d, above).  Maybe something is going  
wrong with literal types for untyped literals, or xsd:string, or  
whatever it is that you are using?

Alternatively, you got a length of 1 (data found).  In that case, the  
problem is happening when retrieving during the backup.  But I'm going  
to leave speculation here, until you've had some time to check it out  
further.

Regards,
Paul Gearon