[Mulgara-general] Swap space exhaustion in 2.1.4

Paul Gearon gearon at ieee.org
Mon Feb 15 18:08:44 UTC 2010


Hi Benjamin,

Sorry to take so long to get back to you on this, but I've finally
gone through it in detail...

On Fri, Jan 22, 2010 at 4:41 PM, Benjamin Armintor <armintor at gmail.com> wrote:
> Hello Mulgara folks,
>
> Recently I noticed some queries against Mulgara 2.1.4 with fairly
> large result sets against (probably more significantly) predicates
> with many objects (dateTime stamps) were failing for lack of memory.
> I put together an hprof harness for the queries to try and diagnose
> the problem, and it failed (although with some hints).  I think the
> issue was the creation of many, many Block objects backed by direct
> byte buffers in BlockCacheLine.find().  The directly allocated buffers
> there weren't cleaned up with a Cleaner, and I think the recursive
> calls might also inhibit their being gc'd.
>
> I rebuilt Mulgara with HeapByteBuffer-backed Blocks, and added a
> method for re-using Blocks in a BlockFile to cut down on the creation.
>  This appears to have fixed the problem, and hprof is showing a
> fairly significant reduction in memory usage for the queries I
> mentioned.
>
> I'm attaching a patch, recognizing that it's out of date.  If it's
> interesting to the developers, I'll try to port the changes to the
> 2.1.7 branch.

Actually, it was able to go in without modification (this is a
reasonably stable area of the code).

I've made a couple of changes to your patch.

First, I created a system property flag to indicate if standard Java
heap should be used, or if it should continue to work with direct
buffers. Any change that has a major effect throughout the system
needs to be trialled a bit before we make it the default, so this is how
I'll be doing the transition. For the moment it still defaults to the
original way of using direct buffers, but changing this default in the
future will be trivial. The flag in question is mulgara.xa.memoryType,
and it can be set to either "direct" or "heap".
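
To illustrate the idea, here is a minimal sketch of how a flag like this
might select the buffer type. Only the property name and its two values
come from the description above; the class and method names
(BufferAllocator, useHeap, allocate) are hypothetical and are not the
actual Mulgara code.

```java
import java.nio.ByteBuffer;

public class BufferAllocator {
    // Property name as described above; everything else in this class is illustrative.
    static final String MEMORY_TYPE_PROP = "mulgara.xa.memoryType";

    // Returns true only for the value "heap"; any other value is treated as "direct" here.
    static boolean useHeap(String value) {
        return "heap".equals(value);
    }

    // Allocates a buffer of the requested size on the Java heap or as a direct buffer,
    // depending on the system property. A missing property falls back to "direct".
    static ByteBuffer allocate(int size) {
        String type = System.getProperty(MEMORY_TYPE_PROP, "direct");
        return useHeap(type) ? ByteBuffer.allocate(size) : ByteBuffer.allocateDirect(size);
    }

    public static void main(String[] args) {
        ByteBuffer buffer = allocate(8192);
        System.out.println(buffer.isDirect() ? "direct" : "heap");
    }
}
```

System properties like this are normally passed on the JVM command line,
e.g. -Dmulgara.xa.memoryType=heap.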

Second, I decided to stick to the original design in
BlockCacheLine.findBlock, where the highBound and lowBound values are
passed in, rather than being evaluated each time. This isn't a huge
saving (only log(N) function calls per invocation), and under normal
circumstances it would be premature optimization. However, because it
is right at the bottom of almost every loop in the system, it will
have a (slightly) more noticeable effect. Also, Andrae went to the
effort to do it this way in the first place, so I figure it doesn't
hurt to leave it there.
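
As a rough illustration of the pattern (not the real
BlockCacheLine.findBlock, whose signature isn't shown here), the sketch
below does a recursive binary search over a sorted array in which the
key values at both ends of the range are handed down to each call. Each
level then reads only the midpoint key instead of re-reading the
bounds. The class BoundedSearch and its methods are hypothetical, and a
plain long[] stands in for the block data.

```java
public class BoundedSearch {
    // Entry point: reads the two end keys once, then recurses with them passed in.
    // Returns the index of target in the sorted array, or -1 if it is absent.
    static int search(long[] keys, long target) {
        if (keys.length == 0) return -1;
        int high = keys.length - 1;
        return find(keys, target, 0, high, keys[0], keys[high]);
    }

    // lowBound and highBound are the keys at indices low and high. They are supplied
    // by the caller, so this call never has to look them up again.
    static int find(long[] keys, long target, int low, int high,
                    long lowBound, long highBound) {
        if (target < lowBound || target > highBound) return -1;
        if (target == lowBound) return low;
        if (target == highBound) return high;
        if (high - low <= 1) return -1;  // no elements strictly between the bounds
        int mid = (low + high) >>> 1;
        long midKey = keys[mid];         // the only key read at this level
        if (target < midKey) {
            return find(keys, target, low, mid, lowBound, midKey);
        }
        return find(keys, target, mid, high, midKey, highBound);
    }

    public static void main(String[] args) {
        long[] keys = {10, 20, 30, 40, 50};
        System.out.println(search(keys, 40));
        System.out.println(search(keys, 35));
    }
}
```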

I've promised a release today (probably tonight, given everything I
have going right now), so it will be in there.

Regards,
Paul Gearon
