[Mulgara-general] firefox bug and server config

Gregg Reynolds dev at mobileink.com
Wed Sep 9 11:49:46 UTC 2009


On Wed, Sep 9, 2009 at 12:40 AM, Paul Gearon <gearon at ieee.org> wrote:

> On Tue, Sep 8, 2009 at 10:58 PM, Gregg Reynolds <dev at mobileink.com> wrote:
> > Howdy Paul,
> > Thanks for the detailed response.  I'm pretty booked through next week but
> > will then look at config and unicode source.  Meantime I think I'll add some
> > stuff to the "Architectural Proposals" section of the wiki - Unicode support
> > is pretty important and by no means trivial.  Aside from the three basic
> > encodings (transformation syntaxes, encoding forms, whatEVER they call them,
> > I mean utf-x), you've got at least two other major issues if you ever want
> > to attract international support, namely date stuff and collations.  Pretty
> > essential for SPARQL filters.
>
> My other question is about collations. You're talking about ordering,
> right? We could easily make these pluggable, but I've seen nothing in
> any of the standards to suggest this is needed, or even an option.
>
> So what exactly is your issue here?
>
SPARQL's "ORDER BY" clause, which orders using "<": for its semantics the
SPARQL definition refers to the XQuery/XPath function "fn:compare"
<http://www.w3.org/TR/xpath-functions/#func-compare>, which supports multiple
collations <http://www.w3.org/TR/xpath-functions/#string-compare>.  It's
a major hairball, because you've got to deal with, among other things,
Unicode normal forms.  Then there's the general weirdness, like "ch" is one
"letter" in Spanish, the Germans use more than one sort order, various
languages have more than one "alphabetic" order (e.g. Japanese Iroha
<http://wapedia.mobi/en/Iroha>, Arabic abjad
<http://en.wikipedia.org/wiki/Abjad_numerals>), etc.  The ICU documentation
has some good examples <http://userguide.icu-project.org/collation>.
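
To make the normal-form and "ch" points concrete, here is a rough sketch in
Python using only the standard library.  It is not Mulgara code and not ICU;
the helper names (nfc_key, traditional_spanish_key) are made up for
illustration, and the Spanish key is a deliberately simplified assumption that
handles only "ch" (not "ll" or accents).

```python
import unicodedata

def nfc_key(s: str) -> str:
    """Hypothetical helper: compare strings by their NFC-normalized form."""
    return unicodedata.normalize("NFC", s)

# Two canonically equivalent spellings of the same word.
composed = "caf\u00e9"      # precomposed e-acute (U+00E9)
decomposed = "cafe\u0301"   # 'e' followed by combining acute (U+0301)

words = [decomposed, "cafz", composed]

# Raw code point order puts "cafz" between the two equivalent strings.
codepoint_order = sorted(words)

# Normalizing first at least keeps the equivalent strings together.
# It is still plain code point order, not a locale-aware collation.
nfc_order = sorted(words, key=nfc_key)

def traditional_spanish_key(word: str) -> list:
    """Simplified assumption: treat 'ch' as one letter sorting after 'c'."""
    w = word.lower()
    key = []
    i = 0
    while i < len(w):
        if w.startswith("ch", i):
            key.append((ord("c"), 1))  # ranks just after plain 'c'
            i += 2
        else:
            key.append((ord(w[i]), 0))
            i += 1
    return key

spanish = ["cuna", "chico", "dama", "casa"]
default_order = sorted(spanish)
traditional_order = sorted(spanish, key=traditional_spanish_key)
```

With the default ordering "chico" lands before "cuna"; with the traditional
key it lands after it.  A real collator has to cover every locale's rules like
this, plus tailoring and multiple comparison strengths, which is the part a
dedicated i18n library already does.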

It's not a big issue for my project (I don't believe any of the standards
supports traditional Arabic sorting, which is based on root), but your
market expands considerably if you support e.g. East Asian calendars and
collations.

I don't mean to imply that Mulgara is broken when I suggest looking at ICU,
only that Mulgara is an RDF database project, not an i18n project, so I
would not expect it to have implemented the whole range of stuff supported
by a dedicated i18n project like ICU.  It would be a minor miracle if it
did.  Implementing high-quality i18n functionality is a major investment,
and it's already available as open source.  (There may be other good
reasons not to migrate to ICU, of course, but I think it's worth a look.)

Cheers,

gregg