[Mulgara-dev] CRITICAL: Bug fix to Backup operation
Ben Hysell
BenH at viewpointusa.com
Wed Mar 26 13:48:25 UTC 2008
>> 2. Read through the string pool looking for any entries that start
>> with _node followed by a number
>As discussed below, this step should *never* happen - even if you *are*
>using blank-nodes.
But if a backup has node numbers in the triples section and not in the
string pool, and we restore that backup and then query it, the query will
return "_node######". If you then back up that Mulgara and look in the
string pool you'll then find: ## "_node######".
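A minimal sketch of that check, for illustration only. It assumes the string pool section of the decompressed backup has already been parsed into (node_id, value) pairs; the parsing step is not shown, and the function and variable names are hypothetical rather than anything provided by Mulgara.

```python
import re

# Assumption: a blank-node label looks like _node<digits>, optionally
# preceded by a double quote as in the example above.
BLANK_LABEL = re.compile(r'"?_node\d+')

def find_blank_labels(string_pool):
    """Return the (node_id, value) pairs whose value starts with a _node label."""
    return [(node_id, value)
            for node_id, value in string_pool
            if BLANK_LABEL.match(value)]
```

Any non-empty result would mean a blank-node label has been written back into the string pool as if it were a real string.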
>Ok, this is cute - in every case the entry truly is a duplicate
>right? Same ID *and* same URI/Literal? We aren't talking about URI/
>Literals being mapped to multiple IDs or vice versa, right?
Yes, it is a duplicate on every occasion I have looked at it, except
one. I did have one instance where the duplicate node id referenced a
different string. I'm sorry I didn't set this backup aside so I could
reproduce it, our production system was down and I was rushing to find a
valid backup. I have only seen this once in the past week since we have
been tracking this issue.
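To tell the two cases apart mechanically, something like the following could be used. This is only a sketch: it assumes the string pool has been parsed into (node_id, value) pairs, and the names are made up for illustration.

```python
from collections import defaultdict

def classify_duplicates(string_pool):
    """Split repeated node IDs into exact duplicates and conflicting entries.

    exact:       node IDs listed more than once, always with the same value
    conflicting: node IDs listed more than once with different values
    """
    values_by_id = defaultdict(list)
    for node_id, value in string_pool:
        values_by_id[node_id].append(value)

    exact, conflicting = {}, {}
    for node_id, values in values_by_id.items():
        if len(values) < 2:
            continue  # listed once, nothing to report
        distinct = sorted(set(values))
        if len(distinct) == 1:
            exact[node_id] = distinct[0]
        else:
            conflicting[node_id] = distinct
    return exact, conflicting
```

An entry in `conflicting` would correspond to the one case described above, where the duplicate node ID referenced a different string.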
>This is a concern - can you verify that the server1 directory does
>not contain any statements that contain blank-nodes? Specifically
>can you query for the 17k triples you have identified to check to see
>if they exist as blank-nodes in the store or just in the backup?
The rub with the 17k is that they are not blank when you query Mulgara,
they return valid strings for every single one of them. We do not use
blank nodes anywhere in our system, the only instance where we start
seeing them is after we do a backup/restore. This was the same issue I
was trying to track back in January in a series of emails on the list
titled "literal in gn2spoCache but cannot be found in backup file" where
if you query Mulgara you'll get a valid answer, try to find that string
in the string pool of the backup and it doesn't exist. Hence why we've
gone to the thorough checking of the backup files on every
backup/restore...if we get blank nodes in the restored Mulgara we have
now lost data.
-ben
-----Original Message-----
From: mulgara-dev-bounces at mulgara.org
[mailto:mulgara-dev-bounces at mulgara.org] On Behalf Of Andrae Muys
Sent: Wednesday, March 26, 2008 1:47 AM
To: Mulgara Developers
Subject: Re: [Mulgara-dev] CRITICAL: Bug fix to Backup operation
On 26/03/2008, at 7:43 AM, Ben Hysell wrote:
> We check for inconsistencies using the following methods with the
> decompressed backup file:
>
> 1. Read in the string pool and ensure there are no duplicate entries in
> the string pool, i.e. node numbers listed twice
>
> 2. Read through the string pool looking for any entries that start with
> _node followed by a number
As discussed below, this step should *never* happen - even if you
*are* using blank-nodes.
> 3. Look up each node number in the TRIPLES section to ensure there is a
> corresponding string pool entry for the node number in question.
>
> At one point someone had sent out on the list how to search your
> restored running Mulgara instance to check and see if any _node entries
> existed, however I've lost the email, and every time we ran the query we
> would crash Mulgara.
That would have involved using the NodeTypeResolver - one of Paul's
babies, he can probably reproduce the query.
> As for our testing:
>
> I took the server1 folder from production that was causing problems,
> copied over the new jar files, ran a backup and examined it using the
> steps from above.
>
> 1. I still have duplicate entries in the string pool, however this time
> they are grouped together, i.e. the one instance in this backup is node
> 6290, which is listed twice:
Ok, this is cute - in every case the entry truly is a duplicate
right? Same ID *and* same URI/Literal? We aren't talking about URI/
Literals being mapped to multiple IDs or vice versa, right?
> 2. There are no listings of _node in the backup file, this tells me the
> string pool is 'clean', where if I query the database I'll never return
> an entry that has _node followed by a large number.
Actually I suspect it doesn't. The backup file should never contain
any _node entries, these are identified as nodes in the TRIPLES
section that don't have corresponding entries in the RDFNODES
section. So in fact it is your item 3 that would tell you the string
pool is 'clean' - which apparently it is not.
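As a rough illustration of that item 3 check: the sketch below assumes the TRIPLES section has been parsed into tuples of node IDs and the RDFNODES (string pool) section into (node_id, value) pairs. The parsing is not shown and the names are hypothetical.

```python
def dangling_nodes(triples, string_pool):
    """Find node IDs used in statements that have no string pool entry.

    Returns (missing_ids, affected_statements): the set of node IDs with
    no entry, and the number of statements that use at least one of them.
    """
    known = {node_id for node_id, _ in string_pool}
    missing = set()
    affected = 0
    for statement in triples:
        absent = [node for node in statement if node not in known]
        if absent:
            affected += 1
            missing.update(absent)
    return missing, affected
```

On restore, each ID in `missing_ids` would come back as a blank node, so a non-zero `affected_statements` on a store that uses no blank nodes points to strings that never reached the backup.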
> 3. There are roughly 17k triples that contain node numbers represented in
> the TRIPLES section that do not have corresponding string pool entries.
> If I restored this backup I would introduce 17k triples that would have
> one of the triples represented with _node.
This is a concern - can you verify that the server1 directory does
not contain any statements that contain blank-nodes? Specifically
can you query for the 17k triples you have identified to check to see
if they exist as blank-nodes in the store or just in the backup?
I need this so I know if the problem is with the backup code, or
somewhere else.
> 4. I did a backup of the same server1 with rev 570 with the following
> results:
>
> -there are roughly 17k node numbers in the TRIPLES section that do not
> have corresponding string pool entries
> -there are no _node string pool entries
> -there is a duplicate in the string pool, but as I pointed out in my
> original email the duplicate string pool entry is near the bottom of the
> backup file, included with the node IDs of 6427720.
>
>
> I'm still concerned we have a duplicate in the string pool entries and
> not all of the strings in the string pool are making it out to the
> backup file.
So am I, believe me.
Andrae.
--
Andrae Muys
andrae at netymon.com
Senior RDF/SemanticWeb Consultant
Netymon Pty Ltd