[Mulgara-dev] configuration questions

Paul Gearon gearon at ieee.org
Tue Nov 30 22:02:21 UTC 2010


On Tue, Nov 30, 2010 at 10:48 AM, Gregg Reynolds <dev at mobileink.com> wrote:
> Hi list,
> I'm looking for advice on configuring Mulgara for a web application.  It's
> easy to run and access Mulgara, locally or remotely, using cURL; the problem
> is that accessing the server from a web browser runs into cross-site
> scripting problems; neither Chrome nor Firefox will allow access to a
> Mulgara server at a different URL than the one serving the webpage.
> Last year I hacked up the code in HttpServices.java so that I could serve my
> webpage from Mulgara.  With the change to Jetty 7 this doesn't compile, and
> it's not a very good solution anyway.
> This page leads me to believe that Jetty 7 comes with a configurable filter
> to address just this problem, but I'm not sure how to enable it in
> mulgara-x.y.z.jar.  Can anybody offer pointers?

Right now all the Jetty systems are configured programmatically. But
since we want some level of user configurability, we have our own XML
file for setting things up. The hassle there is that it only deals
with those things we have explicitly allowed for.

Obviously it would be great to configure Jetty using a normal Jetty
XML configuration, but until this is done, any new functionality (like
what you've mentioned here) has to have code added to enable it.

> A little background:  I'm working with a linguist whose project is
> reconstruction of Proto-Afroasiatic.  This involves close analysis of the
> phonology and morphology of lots of words across lots of languages - we've
> got data from 50 languages and counting.  RDF is perfect for this, since
> morphosyntactic properties map naturally to triples, e.g. <word:cats>
> <foo:number> <foo:Plural> etc.  So mulgara will be the backend datastore,
> and the frontend will be a web GUI that allows the user to select word
> paradigms, display them in grids, and drag and drop headers, rows, and
> columns to enable easy comparison.  The frontend sends SPARQL queries to
> Mulgara.  For example, "show me the 3rd person present tense singular verb
> forms for these 7 languages".
> The users will not be technically sophisticated, so I need to come up with a
> very simple and clear Mulgara configuration.  Presumably we'll expose a
> server instance on the web, but the browser app will not necessarily be
> served from the same domain.  Plus we will also provide a Mulgara package
> and instructions for running a private local server.  If we can get the data
> schema into shape and properly documented we can expect that people might
> want to write their own browser interfaces.  So we have to deal with the
> cross-origin problem.
> So I guess I have two questions.  One is whether the Jetty cross-origin
> filter is the way to go, and if so how to get it working on Mulgara.

Well, it will only work for Jetty, but if that's the configuration you
want, then sure, it would be OK. However, there are also other
deployment scenarios. Do you know what the equivalent is for Tomcat?
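
Whatever the container, a cross-origin filter mostly comes down to
adding an Access-Control-Allow-Origin header to responses for origins
you trust. As a rough, container-neutral illustration, here is a minimal
Java sketch of that decision. The class and method names are
hypothetical (they are not part of Mulgara, Jetty or Tomcat), and a real
filter would also have to answer preflight OPTIONS requests, which this
leaves out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not a Mulgara or Jetty class.
class CorsHeaderSketch {

    // Returns the headers to add to a response, or an empty map when the
    // request has no Origin header or the origin is not in the allowed set.
    static Map<String, String> corsHeaders(String origin, Set<String> allowedOrigins) {
        Map<String, String> headers = new LinkedHashMap<String, String>();
        if (origin == null) {
            return headers; // no Origin header, so nothing to add
        }
        if (allowedOrigins.contains("*")) {
            // Wildcard: any page may read the response.
            headers.put("Access-Control-Allow-Origin", "*");
        } else if (allowedOrigins.contains(origin)) {
            // Echo the trusted origin and tell caches the response varies by it.
            headers.put("Access-Control-Allow-Origin", origin);
            headers.put("Vary", "Origin");
        }
        return headers;
    }

    public static void main(String[] args) {
        // The allowed origin here is an assumed example value.
        Set<String> allowed = new HashSet<String>(Arrays.asList("http://example.org"));
        System.out.println(corsHeaders("http://example.org", allowed));
        System.out.println(corsHeaders("http://other.example", allowed));
    }
}
```

In a servlet container the same logic would sit in a javax.servlet.Filter
and copy these entries onto the response before passing the request on.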

> The
> other is whether I should look into a custom build, e.g. using
> mulgara-x.y.z.war or one of the other build products.  Obviously the focus
> is entirely on Mulgara as a SPARQL endpoint.
> More generally, can anybody provide more details on the hows and whys of the
> various build products?  I don't do a whole lot of Java development -
> actually I don't do a whole lot of development period, except for some web
> stuff -  so it isn't clear to me what all the parts are and how they fit
> together into a Big Picture.   For the documentation I'd like to provide a
> fuller explanation than we have at
> e.g. http://www.mulgara.org/trac/wiki/Deploying.  E.g. "raw-version.jar" is
> "The full Mulgara distribution, but without embedded 3rd party
> libraries";  but what does "full Mulgara distribution" mean?  And what 3rd
> party libs?  Etc.
> Think "block architecture diagrams", as in this diagram of OS X layers.  (If
> anybody wants to take a stab at making such a diagram, check
> out Mockingbird; I have no connection with them, it's just a very handy tool
> for fast diagramming.)

OK, I'll see how far I can get without diagrams....

If you do a simple "./build.sh dist" then everything in Mulgara gets
built, and a set of JAR files and WAR files is created in the "dist"
directory.

The main JAR file (mulgara-2.1.9.jar) contains everything required for
Mulgara. This includes all of our code, plus all of the libraries that
we use. For instance, we use Xerces to parse XML, Apache Commons
Logging to log events, Lucene for text indexing, Jena for parsing
RDF/XML and N3, and so on. If you have this file, then you can run
Mulgara as a standalone server.

The main WAR file (mulgara-2.1.9.war) contains all of the Mulgara
code, and also contains all of the libraries using the standard
library inclusion technique for WAR files. It is similar to the main
JAR file, but designed for deployment as a web service, e.g. on
Tomcat.

mulgara-raw-2.1.9.jar contains just the code from Mulgara, and none of
the libraries. This allows it to be deployed in an environment where
different versions of some of the libs are already on the classpath.
However, all the required libs will have to be manually included on
the classpath.
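
If you go the raw route, a quick way to see which libraries still need
to be added is to probe for a class from each one. The following is only
a sketch: the class names in main() are well-known entry points for
Xerces, Commons Logging and Lucene, picked as examples, and are not a
complete or authoritative list of what Mulgara requires.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical utility for illustration; not shipped with Mulgara.
class ClasspathCheck {

    // Returns the names from classNames that cannot be loaded from the classpath.
    static List<String> missing(List<String> classNames) {
        List<String> notFound = new ArrayList<String>();
        for (String name : classNames) {
            try {
                // Look the class up without running its static initializers.
                Class.forName(name, false, ClasspathCheck.class.getClassLoader());
            } catch (ClassNotFoundException e) {
                notFound.add(name);
            } catch (LinkageError e) {
                // The class is present but something it depends on is not.
                notFound.add(name);
            }
        }
        return notFound;
    }

    public static void main(String[] args) {
        // Example probe classes (assumed, not an exhaustive list).
        List<String> probes = Arrays.asList(
            "org.apache.xerces.parsers.SAXParser",
            "org.apache.commons.logging.LogFactory",
            "org.apache.lucene.index.IndexWriter");
        System.out.println("Missing from classpath: " + missing(probes));
    }
}
```

Run it with the same classpath you intend to use for the raw JAR; an
empty list means every probed library was found.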

mulgara-core-2.1.9.jar is similar to raw, only it also cuts out
anything that isn't essential, such as full-text indexing, or handling
geographic co-ordinates.

mulgara-core-2.1.9.war is like core, only in a web server. This is
particularly important, as many web servers already have several libs
on the classpath. Xerces and a number of the Apache Commons libs are
common examples of this.

The driver-2.1.9.jar file is for client code that wants to talk
directly to a Mulgara server. These days you can use SPARQL, but once
upon a time we only allowed users to connect with a binary API. This
JAR contains the classes to do that. If you talk to Mulgara with HTTP
or SPARQL then you will never need this file.
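
To show why no Mulgara JAR is needed on the client in that case, here is
a minimal sketch that builds a SPARQL protocol GET request URL using
only the JDK. The endpoint address (http://localhost:8080/sparql/) is an
assumption about a default local install, and the class name is made up
for the example; adjust both to your setup.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical client-side helper; uses no Mulgara classes at all.
class SparqlUrlSketch {

    // Builds a GET URL carrying the query in the standard "query" parameter.
    static String queryUrl(String endpoint, String sparql) {
        try {
            return endpoint + "?query=" + URLEncoder.encode(sparql, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed endpoint for a local server; not taken from the Mulgara docs.
        String endpoint = "http://localhost:8080/sparql/";
        System.out.println(queryUrl(endpoint, "SELECT * WHERE { ?s ?p ?o }"));
    }
}
```

The resulting URL can be fetched with cURL, java.net.HttpURLConnection,
or an XMLHttpRequest from a browser (subject to the cross-origin issue
discussed above).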

querylang-2.1.9.jar is the same as driver-2.1.9.jar, only it also
contains all the code needed to parse queries. That allows a client to
parse a query locally and send a binary request to the server. The
driver, by contrast, doesn't know how to parse TQL or SPARQL into a
binary structure.

mulgara-lite-2.1.9.jar contains a cut-down version of Mulgara. It
contains everything required to run stand-alone, but removes anything
that isn't required to let it run. So it doesn't have any
non-essential resolvers, in a similar way to the mulgara-core jar.

ideSupport.jar is there for anyone using Eclipse. Some of Mulgara's
code is generated by the build process (e.g. there is code built by
Castor) and Eclipse won't work unless it can find these generated
classes on the path. This jar file is always in the path in Eclipse,
so that the project can be compiled by the IDE.

I think that's it. Some of these jars may not be needed any more, but
they've all been created at various times as people have needed them
for one reason or another. I'd say that the ones most likely to be
used would be:

mulgara-2.1.9.jar - standalone operation
mulgara-2.1.9.war - deployment in a web container
mulgara-core-2.1.9.war - deployment in a web container where the
webmaster wants to control the JARs in the path.

Does that help?

Regards,
Paul Gearon

