[Mulgara-dev] XA1 and the String Pool

Pradeep Krishnan pradeepk at soft-point.com
Wed Jul 30 07:03:00 UTC 2008


Hi Paul,

Paul Gearon wrote:
> I've recently been asked to have a look at any quick fixes that might
> improve the performance in XA1.  While we are all looking forward to
> XA2, this is still some time away, and anything that can help in the
> meantime would be well received. Of course, with XA2 coming we don't
> want to spend too much time on it either.
> 
> One of the things that came up is the string pool.
> 
> Early this year I found out that we are not dropping anything from the
> string pool at all. Apparently this operation was disabled due to a
> subtle race that was never pinned down. Re-enabling delete operations
> appears to work just fine, but I've been told by reliable sources that
> the bug is there, though it rarely ever occurs.
> 
> With XA2 on the way, it doesn't make sense to try to find this problem
> in the string pool (I don't even know how it is supposed to manifest).
> As a result, the string pool is being left as a write-once-read-many
> (WORM) store. This isn't such a bad thing, since most of the time data
> is not removed from it, even when statements are removed. The one time
> that we want to remove a lot of data from the string pool is when a
> graph has been dropped and is about to be re-loaded with something
> similar. In that case we really DON'T want the string pool dumping
> everything, only to re-load it again immediately. In an ideal world,
> we would have an idle task to clean up unused entries, with an admin
> tool to force the cleanup if we want the space reclaimed immediately.
> 
> Now that I've given my case for why the string pool has to be WORM, we
> should take advantage of this. The current design is built for regular
> read/write operations (which is not happening here). This read/write
> design uses a LOT of files with various regularly sized blocks, and a
> lot of accounting (free lists) for managing the use of those blocks.
> None of this is needed anymore for a WORM system.
> 

[snip]

> I don't anticipate much improvement in reading speed here, but writes
> should end up performing much better.
> 

+1 - that would be wonderful.

> There are some technical issues to be dealt with, of course. For
> instance, if we roll back a transaction, any new entries in the string
> pool can be unwound simply by truncating the "packed" data file at the
> position of the last commit, and not updating the counter associated
> with the node pool. But most of our operations (and all of the
> interfaces) will remain unchanged.
>

Does this unwind work across a restart after a system crash, i.e. when 
a crash occurs in the middle of a transaction that added entries to the 
string pool?
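
To make the question concrete, here is a rough sketch of how I picture 
the truncate-on-rollback working. It is not taken from the Mulgara code: 
the class name PackedFileSketch and its methods are made up for 
illustration, and it assumes the length at the last commit is recorded 
durably somewhere else (for example next to the node pool counter) and 
handed back in on startup. Under that assumption, recovery after a crash 
is just the same truncation done when the file is opened.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical append-only "packed" data file; not Mulgara's actual class. */
final class PackedFileSketch implements AutoCloseable {
    private final RandomAccessFile file;
    private long committedLength;

    /**
     * lastCommittedLength is the file length recorded at the last successful
     * commit. ASSUMPTION: the caller stores this value durably elsewhere.
     */
    PackedFileSketch(File path, long lastCommittedLength) throws IOException {
        this.file = new RandomAccessFile(path, "rw");
        this.committedLength = lastCommittedLength;
        // Crash recovery: bytes past the last commit are an uncommitted tail.
        rollback();
    }

    /** Appends an entry at the end of the file and returns its offset. */
    long append(byte[] data) throws IOException {
        long offset = file.length();
        file.seek(offset);
        file.write(data);
        return offset;
    }

    /** Flushes to disk and returns the new committed length for the caller to persist. */
    long commit() throws IOException {
        file.getFD().sync();
        committedLength = file.length();
        return committedLength;
    }

    /** Unwinds uncommitted entries by truncating to the last commit point. */
    void rollback() throws IOException {
        if (file.length() > committedLength) {
            file.setLength(committedLength);
        }
    }

    long length() throws IOException {
        return file.length();
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

If the commit point is not stored durably, I don't see how the restart 
case could tell committed entries from the tail of a crashed transaction.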

Also, what is the downside of not unwinding on a rollback? Shouldn't the 
same reasons that support not deleting string pool entries hold here too?

Cheers,
Pradeep


