[Mulgara-general] firefox bug and server config

Paul Gearon gearon at ieee.org
Wed Sep 9 16:19:13 UTC 2009


On Wed, Sep 9, 2009 at 11:07 AM, Gregg Reynolds <dev at mobileink.com> wrote:
>
> On Wed, Sep 9, 2009 at 8:18 AM, Paul Gearon <gearon at ieee.org> wrote:
>>
>> >>
>> > SPARQL's "ORDER BY" clause, using "<"; for semantics the SPARQL
>> > definition
>> > refers to XQuery/XPath function "fn:compare", which supports multiple
>> > collations.
>>
>> This function supports multiple collations, true, but if you look
>> again you'll notice that SPARQL does not support it at all.
>
> Hmm.  My reading of SPARQL is that it does require it; what am I missing?

Sorry, I meant to say that SPARQL does not "support multiple
collations". fn:compare itself is required.

> Section 9.1:  "The "<" operator (see the Operator Mapping and 11.3.1
> Operator Extensibility) defines the relative order of pairs of numerics,
> simple literals, xsd:strings, xsd:booleans and xsd:dateTimes."
>
> Section 11.3 says that "A < B" for xsd:strings means
> "op:numeric-equal(fn:compare(STR(A), STR(B)), -1)"
>
> Ok, I see, you're saying that (from XQuery/XPath 7.3.2):
>
> fn:compare($comparand1 as xs:string?,
> $comparand2 as xs:string?) as xs:integer?
>
> is required but
>
> fn:compare( $comparand1  as xs:string?,
> $comparand2  as xs:string?,
> $collation  as xs:string) as xs:integer?
>
> is optional, correct?  Ok, but on the other hand, SPARQL says "The collation
> for fn:compare is defined by XPath", which in turn says, well, a bunch of
> stuff, which could be read to mean fn:compare must(?) support both
> signatures.

SPARQL is referring to the default collation. Notice that ORDER BY is
defined with respect to the less-than (<) operator, which in turn is
defined with fn:compare. However, < is strictly binary, so there is
nowhere to pass a collation (it would have to be a third argument).
That's why < is defined as:
  op:numeric-equal(fn:compare(A, B), -1)
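
To make the arity point concrete, here is a rough Java sketch of that
definition. The class and method names are made up for illustration,
and ordering by Unicode code point is only an assumed stand-in for the
default collation. It is not taken from Mulgara or any SPARQL engine.

```java
final class SparqlLessThan {
    // Assumed stand-in for the two-argument fn:compare: order by
    // Unicode code point and return exactly -1, 0 or 1.
    static int fnCompare(String a, String b) {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            int ca = a.codePointAt(i);
            int cb = b.codePointAt(j);
            if (ca != cb) {
                return ca < cb ? -1 : 1;
            }
            i += Character.charCount(ca);
            j += Character.charCount(cb);
        }
        // One string is a prefix of the other: the shorter sorts first.
        if (i < a.length()) {
            return 1;
        }
        if (j < b.length()) {
            return -1;
        }
        return 0;
    }

    // A < B is op:numeric-equal(fn:compare(A, B), -1). Only two
    // operands are available, so no collation can be passed in.
    static boolean lessThan(String a, String b) {
        return fnCompare(a, b) == -1;
    }

    public static void main(String[] args) {
        // 's' (U+0073) sorts before sharp s (U+00DF) by code point.
        System.out.println(lessThan("Strasse", "Stra\u00DFe"));
    }
}
```

The only place a collation could be chosen is inside fnCompare, which
is why the choice falls to the implementation and not to the query.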

> To me it looks like the SPARQL definition is buggy on this point.  I guess
> I'll submit an issue to the SPARQL2 folks.

I'm not sure what we (the SPARQL working group) would be looking at
here. ORDER BY simply needs to define an order, and this is provided
(it is defined as whatever fn:compare gives us). Perhaps a new feature
could be added to ORDER BY to change the ordering, but new features
have already been determined for SPARQL2.

>> I *did* notice that SPARQL leaves comparison between language tagged
>> literals as "undefined", meaning that it *is* possible to use
>
> Oh, yuck!  I mean I can see leaving the ordering of "church"@en and
> "church"@es (Spanish?) undefined, but if they both have the same language
> tag?  I guess the reasoning must be that one doesn't know if a literal is
> supposed to represent a string or not, but it seems a little iffy.  Why
> support a language tag at all then?  This looks like a major boo-boo; it
> would mean one would have to always markup strings as rdf:XMLLiteral and
> include an xml:lang attribute.  I think.

No, the reasoning was to avoid mandating it, since doing so was beyond
the scope of many implementers. By leaving it "undefined" in the spec,
individual implementers can choose how they want to handle this. You
already mentioned this morning that it is not a trivial undertaking.

>> collations here. So for instance, 'Strasse'@de and 'Straße'@de could
>> compare equal, while 'Strasse' and 'Straße' differ.
>
> Huge problem with SPARQL, imho.  Different implementations are apparently
> free to order such strings according to whim, which means ORDER BY is
> useless, since the client will have to sort anyway.  I think.

I'm sure that any implementation that targets particular languages
would order them appropriately. With English-only implementations,
though, you'll probably find that they just use standard anglo
ordering regardless of language.

For the moment, Mulgara orders first by language code (if one exists)
and then by Unicode character. So Mulgara really doesn't know how to
handle ss vs. ß, since it would just compare the ß to a single "s".
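
As a rough illustration of that ordering (not Mulgara's actual code),
here is a Java sketch. LangLiteral and ORDER are hypothetical names,
putting untagged literals first is an assumption, and
String.compareTo compares UTF-16 code units, which I am using as an
approximation of "by Unicode character".

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class LangLiteral {
    final String lexical;
    final String lang; // null when the literal has no language tag

    LangLiteral(String lexical, String lang) {
        this.lexical = lexical;
        this.lang = lang;
    }

    // Language code first (assumption: untagged literals sort first),
    // then the lexical form by UTF-16 code unit.
    static final Comparator<LangLiteral> ORDER =
        Comparator.comparing((LangLiteral l) -> l.lang,
                             Comparator.nullsFirst(Comparator.<String>naturalOrder()))
                  .thenComparing(l -> l.lexical);

    @Override
    public String toString() {
        return lang == null ? lexical : lexical + "@" + lang;
    }

    public static void main(String[] args) {
        List<LangLiteral> literals = new ArrayList<>();
        literals.add(new LangLiteral("Stra\u00DFe", "de"));
        literals.add(new LangLiteral("Strasse", "de"));
        literals.add(new LangLiteral("apple", "en"));
        literals.add(new LangLiteral("zebra", null));
        literals.sort(ORDER);
        System.out.println(literals);
    }
}
```

With this comparator the sharp s is compared against the single "s" in
the same position, so 'Strasse'@de always sorts before 'Straße'@de and
the two can never compare equal.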

>> scope of anyone working on Mulgara at the moment. It wouldn't be so
>> hard, if only I could find a library that would implement fn:compare.
>
> I'm guessing that ICU is a pretty good bet, unless fn:compare collation
> semantics deviate from Unicode.

They shouldn't. The W3C groups are supposed to keep their standards
compatible for just this sort of reason (things *can* slip through,
but it's part of the task of each working group to make sure that they
don't).
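
For anyone who wants to experiment, here is a sketch of what a
locale-aware fn:compare might look like. It uses the JDK's
java.text.Collator as a stand-in for ICU, and it takes a Locale where
the real function takes a collation URI; that mapping is left out. The
class and method names are hypothetical. Whether 'Strasse' and
'Straße' come out equal depends on the collation rules shipped with
the runtime, so the sketch just prints the result.

```java
import java.text.Collator;
import java.util.Locale;

final class CollationCompare {
    // Hypothetical three-argument compare. PRIMARY strength ignores
    // case and accent differences under the collator's own rules.
    static int fnCompare(String a, String b, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(Collator.PRIMARY);
        return Integer.signum(collator.compare(a, b));
    }

    public static void main(String[] args) {
        String plain = "Strasse";
        String sharp = "Stra\u00DFe";
        // Collation-based result for German.
        System.out.println(fnCompare(plain, sharp, Locale.GERMAN));
        // Code unit result, for contrast.
        System.out.println(Integer.signum(plain.compareTo(sharp)));
    }
}
```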

>> Incidentally, this is a general request for anyone reading this.....
>> if you know of anything that can do the range of XQuery/XPath
>> functions then I'd love to hear from you. Specifically, I'm after
>> something that provides a javax.xml.xpath.XPathFunctionResolver. At
>> this point I'm even prepared to consider creating a plugin framework
>> for commercial libraries. What I *can* tell you is that Saxon and
>> Apache don't provide them. EXSLT has a lot of functions, but only
>
> See also XML (XQuery) database implementations like Xindice and eXist.

I'll check them out. Thanks.
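
To show the shape of what I'm after, here is a minimal sketch of a
resolver that knows only the two-argument fn:compare. FnResolver and
its demo method are hypothetical names, the comparison is a plain
String.compareTo and not a real collation, and a usable library would
need to cover the whole function set.

```java
import java.util.Arrays;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPathFunction;
import javax.xml.xpath.XPathFunctionException;
import javax.xml.xpath.XPathFunctionResolver;

final class FnResolver implements XPathFunctionResolver {
    static final String FN_NS = "http://www.w3.org/2005/xpath-functions";

    @Override
    public XPathFunction resolveFunction(QName name, int arity) {
        if (FN_NS.equals(name.getNamespaceURI())
                && "compare".equals(name.getLocalPart())
                && arity == 2) {
            return args -> {
                // Assumption: both arguments arrive as strings.
                String a = String.valueOf(args.get(0));
                String b = String.valueOf(args.get(1));
                // XPath 1.0 numbers are doubles.
                return Double.valueOf(Integer.signum(a.compareTo(b)));
            };
        }
        return null; // no such function
    }

    // Convenience wrapper that calls the resolved function directly.
    static int demo(String a, String b) {
        XPathFunction f = new FnResolver()
            .resolveFunction(new QName(FN_NS, "compare"), 2);
        try {
            return ((Double) f.evaluate(Arrays.asList(a, b))).intValue();
        } catch (XPathFunctionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("a", "b"));
    }
}
```

A resolver like this would be handed to XPath.setXPathFunctionResolver
along with a NamespaceContext that binds a prefix to the fn namespace.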

Paul


