[Mulgara-dev] sparql web app config

Paul Gearon gearon at ieee.org
Wed Dec 29 23:33:49 UTC 2010


On Mon, Dec 27, 2010 at 12:13 PM, Gregg Reynolds <dev at mobileink.com> wrote:
> On Thu, Dec 23, 2010 at 1:41 PM, Paul Gearon <gearon at ieee.org> wrote:
>>
>> Hi Gregg,
>> > While I'm at it:  the config stuff in Mulgara is pretty complicated.
>> >  Castor
>> > (and XML in general for simple tasks) seems like overkill;
>>
>> I disagree with this. XML was designed for all levels of
>> configuration.
>
> OK.  Matter of taste, I suppose, and if it ain't broke don't fix it.

That's my usual approach to problems.  :-)

There are a lot of things to get done, so cleaning up things that
already work doesn't figure in my list of priorities.

>  Nevertheless, for the sake of clarity I'll explain my thinking.
> Minor point:  XML is certainly used for all kinds of data, but it was
> designed for documents, to address the shortcomings of SGML.  It was only
> when its superiority for that purpose became clear that people realized it
> could be used for basically any data and started using it for all kinds of
> purposes, often inappropriately.  Backlash started emerging a few years ago,
> even for documents; witness the popularity of various plaintext formats like
> markdown, yaml, etc.
> Whether XML is good for basic application config files is debatable;

This stuff all came about when XML was the trendy thing to be using.
It certainly does the job, and in this case it's abstracted away
(since all the parsing is done for you).

> where
> it is definitely abused (IMO) is as a
> scripting-language-with-pointy-brackets.  The success of XSL
> notwithstanding.

You really think that XSL was that successful? I know it's out there,
and a number of systems were built that use it, but I've never run
across one myself. To me it feels like talking about the success of
C++. Once upon a time it *was* a big deal, but these days? :-)

>  Jetty provides a good example.  The xml config language is
> essentially a java scripting language with XML syntax.  It would be far
> better to either use a real scripting language or design a human-friendly
> DSL using e.g. Antlr.

Perhaps, but that would take coding, which in turn takes time. Unless
it's a particular pain point for someone, I can't see it happening.

>> Also, it's not like we had to bring an XML parser in
>> just for configuration, since it's already needed for several other
>> things (most notably, RDF/XML).
>
> Doesn't castor involve schema validation too?  YMMV, but for my money XML
> Schema validation for simple configuration data is definitely overkill.

Yes, but I think it's mitigated. For a start, if a config fails
validation, then it tells you what you got wrong, which is useful.
Secondly, the overhead in validating is insignificant wrt the rest of
the system, so it's not noticeable. Finally, it's something provided
for free by Castor.

>> > wouldn't it be
>> > simpler to just use simple property files?  Or something specialized for
>> > configuration, like JFig?
>>
>> I believe Castor was chosen to avoid having to deal with XML parsing.
>> I think someone might have liked the idea of multiple storage options
>> as well (not just XML).
>>
>> Personally, I don't mind it. It shows up as a class with lots of
>> properties, each of which indicates a configuration option. It's quite
>> easy to use, and no code was needed to set it up. Note that by using
>> XML it is possible to create nested structure in the XML. This isn't
>> used extensively, but it IS used, and it would be hard to configure
>> with a simpler config system (e.g. properties files). For instance,
>> the various factories can all have properties associated with them.
>> This isn't used heavily in the default configuration (there's just a
>> "dir" attribute), but some factories allow for a lot more
>> configuration to go on.
>
> Log4j is a good example of how hierarchy can be supported in simple property
> files.  A bigger problem is where we have multiple entries with the same
> tag, e.g. ContentHandler and ResolverFactory in the mulgara config file.
>  But I guess that could be handled by writing a comma-separated value list
> for one property name.
> I see two drawbacks to using castor for XML config stuff.  One is the
> learning curve.  It may be easy to use once it's all set up, but if I want
> to enhance the config stuff I have to learn not only castor but XSD, instead
> of just adding a property.

Actually, I never bothered to learn Castor, nor did I learn XSD. In
the past, when I wanted to modify the config file, I've just edited
the existing XSD. That file is quite simple to read and modify. There
just wasn't any point in me learning the whole XSD system. (That said,
I *did* read up on DTDs one day in about 2002, but I've forgotten
most of it.)

Maybe there are things that I could/should be doing in the config that
I'm unaware of because I don't know XSD, but so far it doesn't seem to
have been an impediment for me. So I dispute there being a learning
curve, since I never learnt any of that stuff and I've never had a
problem with it. :-)
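
For comparison, here is a rough sketch of the flat-file alternative Gregg
describes above: Log4j-style dotted keys for hierarchy, and a
comma-separated value for repeated entries such as ResolverFactory. It
uses only java.util.Properties. The property names, class names and the
FlatConfigSketch class itself are hypothetical and are not taken from the
real Mulgara config file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

public class FlatConfigSketch {

    // Hypothetical keys and values, for illustration only.
    static final String SAMPLE =
        "resolver.factories = org.example.FooResolverFactory, org.example.BarResolverFactory\n" +
        "resolver.foo.dir = data/foo\n" +
        "resolver.foo.cache = 64\n";

    // Parse properties-file text into a Properties object.
    static Properties load(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    // Repeated entries: split one comma-separated value into a list.
    static List<String> list(Properties p, String key) {
        List<String> out = new ArrayList<>();
        for (String item : p.getProperty(key, "").split(",")) {
            String trimmed = item.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed);
            }
        }
        return out;
    }

    // Hierarchy: collect every "prefix.x" key into a map keyed by "x".
    static Map<String, String> subtree(Properties p, String prefix) {
        Map<String, String> out = new TreeMap<>();
        String dotted = prefix + ".";
        for (String name : p.stringPropertyNames()) {
            if (name.startsWith(dotted)) {
                out.put(name.substring(dotted.length()), p.getProperty(name));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Properties p = load(SAMPLE);
        System.out.println(list(p, "resolver.factories"));
        System.out.println(subtree(p, "resolver.foo"));
    }
}
```

Note that nothing here validates the file: a misspelled key is silently
ignored, which is the trade-off against the XSD-backed approach.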

> The major drawback is not XML but that castor isn't designed for
> configuration.

Perhaps not, but XML was (OK, I appreciate your argument from earlier,
but XML is certainly used for configuration, and it's promoted in that
way). Castor was just chosen as a tool to avoid reading the XML. I
have to say, if we were using a DOM or stream parser, then every new
configuration item would be far more painful to deal with. Adding
something into Castor means adding it to the XSD (you know... that
format that I never bothered to learn) and then you're done!

> What makes packages like JFig attractive is built-in support
> for hierarchies of config files.  It would be nice, especially during
> development, to have something like ./mulgararc > ~/.mulgararc >
> /usr/local/etc/mulgararc > MULGARA_HOME/mulgararc
> By the way, don't get me wrong, none of my postings are intended as
> "Paul's Tasklist".  It's mostly stuff that I'm thinking of doing just for
> fun.

That's OK. While I'm the main person at the moment, it doesn't mean
that it's all about me. I'd rather have someone engaged with the
project who disagreed with me on every point than no one caring at
all.
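
As an aside, the layered lookup Gregg mentions (./mulgararc >
~/.mulgararc > /usr/local/etc/mulgararc > MULGARA_HOME/mulgararc) would
not need much code if the files were plain properties files. The sketch
below makes several assumptions: that ">" means "takes precedence over",
that the files use properties syntax, and that MULGARA_HOME is passed in
as a string. The LayeredConfigSketch class is hypothetical and is not
part of Mulgara or JFig.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class LayeredConfigSketch {

    // Candidate files, highest precedence first (assumed reading of ">").
    static List<Path> searchPath(String mulgaraHome) {
        List<Path> paths = new ArrayList<>();
        paths.add(Paths.get("mulgararc"));
        paths.add(Paths.get(System.getProperty("user.home"), ".mulgararc"));
        paths.add(Paths.get("/usr/local/etc/mulgararc"));
        if (mulgaraHome != null) {
            paths.add(Paths.get(mulgaraHome, "mulgararc"));
        }
        return paths;
    }

    // Load from lowest to highest precedence so later loads override
    // earlier ones. Files that do not exist are skipped.
    static Properties loadLayered(List<Path> highestFirst) {
        Properties merged = new Properties();
        for (int i = highestFirst.size() - 1; i >= 0; i--) {
            Path file = highestFirst.get(i);
            if (!Files.isRegularFile(file)) {
                continue;
            }
            try (Reader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
                merged.load(reader);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Properties config = loadLayered(searchPath(System.getenv("MULGARA_HOME")));
        System.out.println(config);
    }
}
```

A key set in ./mulgararc would then win over the same key in any of the
other files, which is handy during development.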

Paul

