[Mulgara-general] unicode testing

Gregg Reynolds dev at mobileink.com
Fri Sep 18 21:22:44 UTC 2009


FYI I put together a little utility for testing Mulgara's support of Unicode
URIs and data.  It's still a little rough around the edges but it seems to
basically work; I'll be polishing it up in the coming weeks, but if you're
interested in testing UTF-8 it should be useful now.

It's in the utf8 subdirectory of mulligan
<http://bitbucket.org/gar/mulligan/overview/>.
In short, running "make" will download some files from the unicode.org
database <http://www.unicode.org/Public/UNIDATA/>, compile unicode.c and
then run it.  It reads the UCD and generates an N3 file for each unicode
block.  A shell script, "load", runs tql commands (using curl) to
drop/create/load the blocks, each to its own graph.  (It's been a long time
since I wrote any code, so don't laugh if you inspect it!)  It's all pretty
simple and self-explanatory, so there's not much documentation.
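
For readers who want a feel for the "one N3 file per block" step without
reading unicode.c, here is a minimal sketch in Python (not the actual C code).
It assumes the block list comes from the UCD file Blocks.txt, whose data lines
look like "0000..007F; Basic Latin".  The names Block, parse_blocks and
graph_file_name are made up for illustration, and the file naming scheme is an
assumption, not what the utility really does.

```python
import re
from typing import Iterable, Iterator, NamedTuple


class Block(NamedTuple):
    start: int  # first code point in the block
    end: int    # last code point in the block (inclusive)
    name: str   # block name as given in Blocks.txt


# Data lines in Blocks.txt have the form "START..END; Block Name" (hex ranges).
_BLOCK_LINE = re.compile(r"^([0-9A-Fa-f]{4,6})\.\.([0-9A-Fa-f]{4,6});\s*(.+?)\s*$")


def parse_blocks(lines: Iterable[str]) -> Iterator[Block]:
    """Yield one Block per data line, skipping comments and blank lines."""
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop "#" comments
        if not line:
            continue
        m = _BLOCK_LINE.match(line)
        if m:
            yield Block(int(m.group(1), 16), int(m.group(2), 16), m.group(3))


def graph_file_name(block: Block) -> str:
    """Hypothetical naming scheme: block name with non-alphanumerics as '_'."""
    return re.sub(r"[^A-Za-z0-9]+", "_", block.name).strip("_") + ".n3"
```

With a local copy of Blocks.txt, something like
"for b in parse_blocks(open('Blocks.txt', encoding='utf-8')): ..." would
give the ranges to loop over, one output file (and one graph) per block.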

The generated N3 includes, for each char (that is legal in an IRI), an IRI
including the char; for all chars there is a string property including the
char.  I wrote this because I found that I'm still not seeing correct UTF-8
data in SPARQL results.  On the TODO list is a task to write a bunch of
SPARQL queries and a "query" shell script, in order to test the SPARQL
endpoint's UTF-8 compliance.
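
To make the shape of the generated data concrete, here is a rough Python
sketch of the per-character triples described above.  It is not the output
format of unicode.c: the base IRI, the "cp/", "char/" and "vocab#" names, and
the literal escaping are all assumptions for illustration.  "Legal in an IRI"
is approximated here by the iunreserved production of RFC 3987 (ASCII
letters, digits, "-._~", plus the ucschar ranges).

```python
# ucschar ranges from RFC 3987: BMP ranges, planes 1-13, and part of plane 14.
UCSCHAR_RANGES = (
    [(0xA0, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFEF)]
    + [(p << 16, (p << 16) | 0xFFFD) for p in range(1, 14)]
    + [(0xE1000, 0xEFFFD)]
)

_ASCII_UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)


def is_iunreserved(cp: int) -> bool:
    """True if the code point may appear unescaped in an IRI (RFC 3987 iunreserved)."""
    if cp < 0x80:
        return chr(cp) in _ASCII_UNRESERVED
    return any(lo <= cp <= hi for lo, hi in UCSCHAR_RANGES)


def escape_literal(ch: str) -> str:
    """Escape quote, backslash and control chars; everything else stays raw."""
    cp = ord(ch)
    if ch in '"\\':
        return "\\" + ch
    if cp < 0x20 or cp == 0x7F:
        return f"\\u{cp:04X}"
    return ch


def n3_for_code_point(cp: int, base: str = "http://example.org/utf8/") -> list:
    """Return N3 statements for one code point (base IRI and names are hypothetical)."""
    if 0xD800 <= cp <= 0xDFFF:  # surrogates cannot be encoded as UTF-8
        return []
    ch = chr(cp)
    subject = f"<{base}cp/{cp:04X}>"
    lines = []
    if is_iunreserved(cp):
        # An IRI that contains the character itself.
        lines.append(f"{subject} <{base}vocab#iri> <{base}char/{ch}> .")
    # A string property containing the character, for every character.
    lines.append(f'{subject} <{base}vocab#string> "{escape_literal(ch)}" .')
    return lines
```

When writing these lines out, the file should be opened with
encoding="utf-8" so the raw characters reach the store as UTF-8 bytes.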

I think this might be useful in putting together a test suite.  The other
stuff in Mulligan is intended to form the basis of a simple tool for getting
started with RDF and Mulgara.  The Ajax webpage is not quite ready, though.

-gregg