[Mulgara-general] Mulgara Lite

Paul Gearon gearon at ieee.org
Mon Sep 8 13:54:48 UTC 2008


On Mon, Sep 8, 2008 at 4:38 AM, Edwin Shin <eddie at fedora-commons.org> wrote:
> I just committed a tiny tweak to the build file which was causing the
> core-dist target to fail. I assumed we didn't need the descriptor.jar
> contents in core-dist, else the target needs to include that as a
> dependency.
>
> A couple questions about third party library changes since 2.0.0
>  1) What is the core-3.1.1.jar needed for?

I saw that. Thanks.

It seems strange, because a clean checkout got all the way through.

>  2) Why did JRDF get integrated into the Mulgara source tree?
> To some degree, this strikes me as counter to the motivation for
> core-dist: "a distribution that contains ONLY Mulgara code (no jars
> embedded)". If Mulgara needs to maintain its own fork of JRDF, would
> repackaging it somewhere under org.mulgara be a better move (so that
> other code can still bundle Mulgara and JRDF)?

Ah, this is a long story.

Early on, Mulgara needed to represent nodes, with subtypes such as
Literal, BlankNode, and URI. Andrew (who was on the project) had
recently defined a set of interfaces called JRDF that did just this,
so the decision was made to use his interfaces. They quickly became an
integral part of the system, being used *everywhere*. Andrew wrote a
lot of the basic, common functionality for JRDF objects into abstract
classes, which Mulgara chose to extend. As a result, Mulgara is
heavily dependent on the way that JRDF is structured.

Several years later I was thinking about indexing again, and in order
to prove some ideas, I wrote an in-memory triple-store. The easiest
interface I could think of was JRDF, and so I used it. Once it was
going, I asked Andrew if he'd like it as an "example implementation"
of the interfaces, and to put it into the JRDF project on SourceForge.
JRDF then built on this capability, eventually growing to include a
full SPARQL implementation.

The problem is that none of this new stuff was desirable for Mulgara,
and when it came time to upgrade the JRDF library we found that our
code was incompatible with it (re-integration was both excessive and
unnecessary). This left us maintaining a fork, something I really
didn't want to do. On top of that, bugs in the JRDF code were
frustrating to track down, since the code was in a separate code base.

However, the most important problem was that we had no control over
the JRDF interfaces. If we wanted Nodes to all respond to particular
requests, we couldn't do it. As a result, Mulgara code is littered
with methods that say things like:
  if (node instanceof URIReference) {
    doSomething((URIReference)node);
  } else if (node instanceof BlankNode) {
    doSomethingElse((BlankNode)node);
  } else {
    doSomethingForLiterals((Literal)node);
  }
Not being able to update Node also means that we can't treat the
various types that can be selected in a query (nodes, variables and
subqueries) as a single type, and instead we pass this sort of thing
around as Object.  The list goes on.
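
To illustrate the point about controlling the interfaces, here is a
minimal sketch of what becomes possible once the node interface can be
changed: every node accepts a visitor, so the instanceof chain is
replaced by a single dispatch call. All of the names below (Node,
NodeVisitor, UriRef, Blank, Lit, describe) are hypothetical and exist
only for this example; they are not the JRDF or Mulgara API.

```java
public class NodeDispatchSketch {

    // Hypothetical visitor: one callback per kind of node.
    interface NodeVisitor<T> {
        T visitUri(UriRef uri);
        T visitBlank(Blank blank);
        T visitLiteral(Lit literal);
    }

    // Hypothetical node interface owned by the project, so it can
    // require every node to respond to accept().
    interface Node {
        <T> T accept(NodeVisitor<T> visitor);
    }

    static final class UriRef implements Node {
        final String uri;
        UriRef(String uri) { this.uri = uri; }
        public <T> T accept(NodeVisitor<T> v) { return v.visitUri(this); }
    }

    static final class Blank implements Node {
        final String id;
        Blank(String id) { this.id = id; }
        public <T> T accept(NodeVisitor<T> v) { return v.visitBlank(this); }
    }

    static final class Lit implements Node {
        final String lexical;
        Lit(String lexical) { this.lexical = lexical; }
        public <T> T accept(NodeVisitor<T> v) { return v.visitLiteral(this); }
    }

    // Callers no longer test types or cast; each node routes itself
    // to the matching visitor method.
    static String describe(Node node) {
        return node.accept(new NodeVisitor<String>() {
            public String visitUri(UriRef u) { return "URI <" + u.uri + ">"; }
            public String visitBlank(Blank b) { return "blank _:" + b.id; }
            public String visitLiteral(Lit l) { return "literal \"" + l.lexical + "\""; }
        });
    }

    public static void main(String[] args) {
        Node[] nodes = {
            new UriRef("http://example.org/a"),
            new Blank("b0"),
            new Lit("hello")
        };
        for (Node n : nodes) {
            System.out.println(describe(n));
        }
    }
}
```

With interfaces defined elsewhere, accept() cannot be added to the node
type, which is why the instanceof chain above keeps reappearing.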

As for repackaging JRDF under org.mulgara.... Yes, I would love to do
this. However, I am quite wary of it, as the JRDF interfaces form part
of our public API. I believe that making this change would break a lot
of the code that talks to Mulgara. If there is some way to make this
happen without hurting all our users, then I'm all for it, but until
this is addressed I'll leave it "as is" and hope that no one tries to
use a JRDF jar in the same process space (a much less likely
possibility).

Does that make sense?

Paul


