[Mulgara-general] My job, and SPARQL development

Life is hard, and then you die ronald at innovation.ch
Wed Feb 6 04:37:28 UTC 2008


On Tue, Feb 05, 2008 at 08:14:44PM -0600, Paul Gearon wrote:
> 
> On Feb 5, 2008, at 6:46 PM, Life is hard, and then you die wrote:
[snip]
> > Another problem is the assumption in various places of only a single
> > mulgara instance. If you think of libraries embedding mulgara for
> > internal use then it would be really nice if multiple instances could
> > be running in the same JVM and classloader (separate classloaders can
> > already be made to work). I've found two main places that need fixing
> > here: the above mentioned session-factory stuff, but I've also found
> > the use of static fields (singletons) in things like the
> > XANodePoolFactory which end up preventing one from being able to run
> > multiple instances.
> 
> This is because you aren't supposed to run more than one database in a  
> single system.  It is guaranteed to be slower.  This is one of the  
> reasons we allowed for multiple graphs.  If you really need it, then  
> you can run multiple servers on the one system.
> 
> The only advantage I can think of for having multiple databases in one  
> JVM is to permit multiple writers while avoiding IPC.  In reality, for  
> every reason you may have to put multiple databases in a single JVM,  
> there are better reasons to not do it.

While I agree that a single server will perform better, in practice
you run into things like two libraries using the same third-party
library and ending up with conflicts. I'm running into exactly this
now: we're using JOTM and so is Mulgara (and I'm running Mulgara
embedded, i.e. in the same JVM and classloader), and bam, things are
totally messed up because somewhere they use a static field to hold
the "current" transaction for each thread. This is enormously
frustrating as a developer. And I've run into this sort of thing many
times before (both as a user and writer of libs). As soon as you allow
Mulgara to be used embedded, i.e. in the same JVM and classloader, it
_must_ be able to run multiple instances IMNSHO.
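
To make the static-field problem concrete, here is a minimal sketch.
It is not JOTM or Mulgara code: StaticTxManager, InstanceTxManager and
SharedStateDemo are hypothetical names invented for illustration. It
only assumes what is described above, namely that the "current"
transaction for each thread lives in a static field, so two users of
the same class in one classloader share one slot.

```java
// Hypothetical stand-in for a transaction manager that keeps the
// "current" transaction in a static, per-thread slot.
final class StaticTxManager {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    static void begin(String owner) {
        if (CURRENT.get() != null) {
            throw new IllegalStateException(
                "thread already has a transaction owned by " + CURRENT.get());
        }
        CURRENT.set(owner);
    }

    static void commit() {
        CURRENT.remove();
    }
}

// Same idea, but the per-thread slot belongs to each instance, so
// separate users in one classloader do not see each other's state.
final class InstanceTxManager {
    private final ThreadLocal<String> current = new ThreadLocal<>();

    void begin(String owner) {
        if (current.get() != null) {
            throw new IllegalStateException(
                "thread already has a transaction owned by " + current.get());
        }
        current.set(owner);
    }

    void commit() {
        current.remove();
    }
}

final class SharedStateDemo {
    // The application and the embedded database both begin a
    // transaction on the same thread through the static manager.
    static boolean staticManagersCollide() {
        StaticTxManager.begin("application");
        try {
            StaticTxManager.begin("embedded-db"); // hits the shared slot
            StaticTxManager.commit();
            return false;
        } catch (IllegalStateException e) {
            return true;
        } finally {
            StaticTxManager.commit();
        }
    }

    // Same scenario, but each user owns its own manager instance.
    static boolean instanceManagersCollide() {
        InstanceTxManager app = new InstanceTxManager();
        InstanceTxManager db = new InstanceTxManager();
        app.begin("application");
        try {
            db.begin("embedded-db"); // separate slot, no clash
            db.commit();
            return false;
        } catch (IllegalStateException e) {
            return true;
        } finally {
            app.commit();
        }
    }
}
```

In this sketch the static variant collides as soon as a second user
begins a transaction on the same thread, while the per-instance
variant does not. The same reasoning applies to any singleton held in
a static field.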

Also think of testing. E.g. if you want to test a distributed mulgara
then it's perfectly reasonable to fire up multiple embedded instances
- much easier than having to start external server processes (and btw,
I believe the current mulgara tests would work better if they fired up
an embedded instance rather than a separate server).

> In terms of cleanliness of design, then you may be right.  However,  
> things like the NodePool will definitely perform better if they are  
> singletons.

Forget performance here. If you're running multiple instances it's
obvious that you're wasting memory etc.,
so I wouldn't worry about it. When you need performance then you'll
obviously make sure you're only running one instance per machine; I
see the use cases for embedded instances in testing, in small (quite
possibly completely hidden) databases inside libs and apps, and in
quick-install scenarios for apps.

> > We've also tried creating in-memory embedded mulgara instances, but a
> > bunch of things are not implemented there so we've found this not to
> > be usable (e.g. searching for typed-literals is not implemented in the
> > memory-based string-pool, leading to all sorts of things bailing out).
> 
> Fair enough.  That should be trivial to implement though.  Do you want  
> it?

:-) To do in my copious spare time? I think I have a couple of other
things with higher priority right now. I'll create a ticket for it
however.


  Cheers,

  Ronald



