[Mulgara-dev] random heap-space exhaustions

Paul Gearon gearon at ieee.org
Tue Feb 6 19:15:25 UTC 2007


On Feb 5, 2007, at 11:20 PM, Life is hard, and then you die wrote:

> On Mon, Feb 05, 2007 at 10:07:14PM -0600, Paul Gearon wrote:
>> I haven't seen this either, so I'm stabbing in the dark.
>>
>> I have a couple of questions.  Since it's always the same queries which
>> fail, what's their basic structure?
>
>   select $p $o from <model1> where
>   ( $s $p $o in <model2> and
>     ( $s <mulgara:is> <url1> or $s <mulgara:is> <url2> or $s <pred1> <url1> or
>       ( trans($s <pred1> $res) and $res <mulgara:is> <url1> )))
>    or
>   ( $s $impliedBy $o in <model2> and
>     ( $impliedBy <pred2> $p or trans($impliedBy <pred2> $p)) and
>     ( $s <mulgara:is> <url1> or $s <mulgara:is> <url2> or $s <pred1> <url1> or
>       ( trans($s <pred1> $res) and $res <mulgara:is> <url1> )))
>
> (the duplication of the last two lines is because of bug MGR-36).

It's been in the back of my mind for the last couple of years that  
trans() uses a java.util.Set to remember where it's been so it  
doesn't get caught in loops.  This is inherently memory limited.  I  
know how to fix it (once you exceed a high-water mark you move to an  
on-disk hashset), but I've avoided this effort since it never bit  
anyone.  Maybe my laziness is finally coming home to roost?

Do you know if you have a lot of linkages with <pred1> in your  
system?  Would I be able to fit that many Longs into a hash set?

Trans is pretty efficient (if I say so myself), but it still needs to  
remember where it's been, or else it would loop infinitely.
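
To make the memory concern concrete, here is a minimal sketch of a loop-safe transitive walk that tracks visited nodes in a java.util.Set of Longs. It is not Mulgara's actual trans() code: the class name TransSketch, the edge map, and the HIGH_WATER_MARK value are all made up for illustration, and the spill to an on-disk hash set is only marked as a hook, not implemented. It assumes Java 9 or later for List.of and Map.of.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class TransSketch {
    /** Hypothetical limit; past this point an on-disk set would have to take over. */
    static final int HIGH_WATER_MARK = 1_000_000;

    /** Returns every node reachable from start by following one or more edges. */
    static Set<Long> reachable(long start, Map<Long, List<Long>> edges) {
        Set<Long> visited = new HashSet<>();      // grows with the size of the closure
        Deque<Long> pending = new ArrayDeque<>(); // nodes whose outgoing edges are still to be followed
        pending.add(start);
        while (!pending.isEmpty()) {
            long node = pending.poll();
            for (long next : edges.getOrDefault(node, List.of())) {
                // add() returns false for a node already seen, which is what breaks cycles
                if (visited.add(next)) {
                    if (visited.size() > HIGH_WATER_MARK) {
                        // Hook only: a real fix would switch to an on-disk hash set here.
                        throw new IllegalStateException("visited set passed the high-water mark");
                    }
                    pending.add(next);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // A three-node cycle: without the visited set this walk would never end.
        Map<Long, List<Long>> edges =
            Map.of(1L, List.of(2L), 2L, List.of(3L), 3L, List.of(1L));
        System.out.println(reachable(1L, edges));
    }
}
```

Every entry in a HashSet of Longs costs a boxed Long plus a hash-table node, so on the order of tens of bytes each. That is why the number of <pred1> linkages matters for the heap.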

>> Also, can we get a stack trace of the *first* of your failures please?
>> It probably won't help a lot, but you never know.
>
> Nope, OOM stack traces are rarely useful, and this one's no exception.

I know, but it's worth a shot...

> But here it is anyway:
>
>   org.mulgara.itql.ItqlInterpreterException: org.mulgara.query.QueryException: Error ending previous query
<snip/>
>   Caused by: org.mulgara.query.QueryException: Error ending previous query
<snip/>
>   Caused by: javax.transaction.RollbackException
>           at org.objectweb.jotm.TransactionImpl.commit(TransactionImpl.java:225)
>           at org.objectweb.jotm.Current.commit(Current.java:442)
>           at org.mulgara.resolver.DatabaseSession.endTransactionalBlock(DatabaseSession.java:1026)
>           ... 43 more
>   Caused by: java.lang.OutOfMemoryError: Java heap space

None of the things above the RollbackException are interesting.  
Unfortunately, it's the bottom of that stack of "43 more" that might  
have told us something.

Sigh.

Looking at the trace a little more carefully, it looks like this was  
printed on the client side.  Right?  Do you have a trace from the  
server?  If it's a server exception, then the client never provides  
as much detail as the server does.  I'm just looking to see if the  
OOM happens somewhere inside the trans.

>> One last thing, and this may be difficult...  When you see one of these
>> queries taking >20 seconds, would someone be able to hit Ctrl-\ on the
>> server?  If we can get an idea of what the program is spending time
>> doing, then it may help.
>
> I would've done so if I could. But the problem disappears too quickly
> again. Though I might try and see if we can rig up something to
> monitor the process's rss and send a few SIGQUITs if it goes
> above some threshold.

If you have 20 seconds to detect this, then that sounds good.
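
As a rough alternative to watching rss from outside and sending SIGQUIT, the same idea could be sketched inside the JVM: poll heap usage and print every thread's stack once it crosses a threshold. This is a different technique from the external monitor described above, and it assumes you can load a class into the server process. The class name HeapWatchdog, the 0.9 fraction, and the one-second poll are all hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.Map;

public class HeapWatchdog {
    /** True when used heap exceeds the given fraction of a known maximum. */
    static boolean overThreshold(long used, long max, double fraction) {
        // getMax() may report -1 when the maximum is undefined; never trigger in that case
        return max > 0 && used > max * fraction;
    }

    /** Formats the stack of every live thread, similar in spirit to a SIGQUIT dump. */
    static String dumpThreads() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            out.append('"').append(e.getKey().getName()).append("\"\n");
            for (StackTraceElement frame : e.getValue()) {
                out.append("\tat ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    /** Starts a daemon thread that checks the heap every periodMillis milliseconds. */
    static Thread start(double fraction, long periodMillis) {
        Thread watcher = new Thread(() -> {
            try {
                while (true) {
                    MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
                    if (overThreshold(heap.getUsed(), heap.getMax(), fraction)) {
                        System.err.println(dumpThreads());
                    }
                    Thread.sleep(periodMillis);
                }
            } catch (InterruptedException stop) {
                // interrupted: let the watcher thread end quietly
            }
        }, "heap-watchdog");
        watcher.setDaemon(true); // must not keep the server alive on shutdown
        watcher.start();
        return watcher;
    }

    public static void main(String[] args) {
        // One-off check for demonstration; a server would call start(0.9, 1000) instead.
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.println("over threshold: " + overThreshold(heap.getUsed(), heap.getMax(), 0.9));
    }
}
```

A dump taken while the heap is nearly full should show whether the busy threads are inside trans() at that moment, which is the same information Ctrl-\ would give.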

Thanks,
Paul
