[Mulgara-dev] File based resolver

thomas thomas at stray.net
Fri Jun 29 15:55:11 UTC 2007


I just re-checked Slide and realized that development is still taking 
place: the mailing lists are active and a 2.2pre release is in the making. I 
also realized that Jackrabbit <http://jackrabbit.apache.org/>, an 
implementation of JCR (JSR 170), supports WebDAV as well, and apparently in 
a more standards-conformant way than Slide does (and with more community 
backing, too). And then I stumbled upon the following mail 
<http://mail-archives.apache.org/mod_mbox/jackrabbit-dev/200705.mbox/%3c510143ac0705080301vb7b1a09obdc703c0eba6a79@mail.gmail.com%3e> 
which might be interesting in that it says:

<snip>
* Many people are interested in using JCR for storing and querying RDF
triples. This is something we may want to look into in terms of
performance and content modelling. One approach could be to define a
RDF mixin type with an rdf:about property that could be used to turn a
node into an RDF resource. Property names would be the predicates and
property values the objects. This way you could easily make your
existing content model available as RDF. There was interest in
implementing a Jackrabbit binding for the ARQ SPARQL query engine
(http://jena.sourceforge.net/ARQ/). Something like that could be used
as the "SPARQL" query language in JCR.
</snip>
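
The mapping described in the snippet can be illustrated with a minimal sketch. It assumes a JCR node is represented as a plain Python dict of properties; `node_to_triples` and the example property names are hypothetical and are not part of the Jackrabbit or JCR API.

```python
from typing import Iterator, Tuple

Triple = Tuple[str, str, object]

def node_to_triples(properties: dict) -> Iterator[Triple]:
    """Turn a node's properties into (subject, predicate, object) triples.

    Assumption: the node is a dict, and the proposed mixin is modelled
    simply as the presence of an "rdf:about" property.
    """
    subject = properties.get("rdf:about")
    if subject is None:
        return  # node does not carry the mixin, so it yields no triples
    for name, value in properties.items():
        if name == "rdf:about":
            continue  # the subject itself is not emitted as a statement
        # A multi-valued property yields one triple per value.
        values = value if isinstance(value, (list, tuple)) else [value]
        for v in values:
            yield (subject, name, v)

if __name__ == "__main__":
    node = {"rdf:about": "http://example.org/doc", "dc:title": "Report"}
    for triple in node_to_triples(node):
        print(triple)
```

Property names become predicates and property values become objects, so an existing content model needs nothing beyond the one extra property.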

And that, by the way, reminded me of another question that occurred to me 
the other day: why isn't Mulgara an Apache project? Wouldn't that be a win 
for both sides, increasing Mulgara's visibility and adding an RDF store to 
the Apache stack? Just a question, and not a politically sensitive 
one, I hope.

ciao
thomas





--On 21 June 2007 12:20:10 +0200 thomas <thomas at stray.net> wrote:

> i thought about that too, but i only know of one webdav server that's
> open source and implemented in java, and that's jakarta.apache.org/slide.
> slide seems to be pretty dormant, with its last release in 2004. and it
> doesn't implement precisely that part of the webdav specification that
> would notify other applications when a file has changed. but that would
> be crucial to keep the store informed if other apps change the files
> (that has to be possible). i don't know how hard that is to implement,
> but at the moment i'd rather try to implement some webdav functionality
> within or on top of mulgara, since all the metadata that i need is
> already within mulgara and i "only" have to make it do something.
>
> but i would be glad to be proven wrong. have you combined an rdf store
> with a webdav server? could you give some detail on how you did it?
>
> ciao
> thomas
>
>
> --On 21 June 2007 10:49:06 +0200 Leo Sauermann <leo.sauermann at dfki.de>
> wrote:
>
>> I didn't follow the whole discussion, but from my experience I would say:
>> putting files into an RDF store is ugly and will cause trouble.
>>
>> Rather, combine a WebDAV server with the RDF store; I would say that's
>> the cleanest solution.
>> (Really, only WebDAV is: http, http-put, http-get, ids, metadata. It's
>> clean.)
>>
>> best
>> Leo
>>
>> It was Andrae Muys who said at the right time 20.06.2007 05:49 the
>> following words:
>>>
>>> On 19/06/2007, at 9:44 PM, thomas wrote:
>>>> first i'd like to check how things fit with the RDF model. RDF
>>>> accepts as objects
>>>>     * literals     (as string, xml, etc)
>>>>     * URIs         (URLs and non-URLs)
>>>>
>>>> these are both very straight object types. the literal is just
>>>> itself, it's the content. especially there's no difference between
>>>> name and content. you could say, it's both at once. you could even
>>>> say, there is no name, just content and you call it by its content.
>>>> the URI is exactly just a name. it represents something (be it a
>>>> resource or a concept - doesn't matter here) that's outside the
>>>> store. so when you've got a URI in front of you, you know
>>>> immediately: this is just a reference, a name. when you've got a
>>>> literal you know: this is the content and that's all there is to it.
>>>
>>> Two comments (clarifications, not corrections).  First, a literal is
>>> not a name - it is the content.  URIs are names, Literals are
>>> objects (and Blank-nodes are existentials).  Second, URIs are
>>> opaque, and may or may not be dereferenceable.
>>>
>>>> now a file is clearly both, name and content, but unlike the literal
>>>> the two don't collapse into one. and unlike the URI they are not
>>>> clearly separated. so files have to be treated differently.
>>>> take an image file for example: it has a name (eg "image.png") and by
>>>> calling it by its name you talk about it, transform it, move it
>>>> around etc. the actual content, the picture, is a piece of data that
>>>> you don't ever want to write out byte by byte. you want it to render
>>>> only under some well-defined circumstances. so in the case of a file
>>>> name and content are two distinct parts of the same object.
>>>
>>> Actually no they're not.  A file does not have a name - to assume that
>>> is to fall afoul of the unique-name-assumption, which doesn't hold
>>> with files.  Not all files have a name (see tmpfile(3)), some files
>>> have multiple names (see link(2) and symlink(2)).
>>>
>>> What we think of as a 'file' doesn't actually exist - it's just a
>>> convenient fiction that works 99.9% of the time.  Unfortunately this
>>> is the 0.1%.  What we actually have is an inode and 0 or more
>>> directory-entries.
>>>
>>> Each inode is _the_ content.
>>> Each dirent is _a_ name.
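
The inode/dirent distinction can be observed directly. This is a minimal sketch, assuming a filesystem that supports hard links; the function name and file names are illustrative only.

```python
import os
import tempfile

def two_names_one_inode():
    """Create one file, give it a second name, then remove the first name."""
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "image.png")
        second = os.path.join(d, "alias.png")
        with open(first, "wb") as f:
            f.write(b"picture bytes")

        os.link(first, second)  # add a second dirent for the same inode

        st1, st2 = os.stat(first), os.stat(second)
        # Same device and inode number: both names lead to one content.
        same_inode = (st1.st_dev, st1.st_ino) == (st2.st_dev, st2.st_ino)
        link_count = st1.st_nlink

        os.unlink(first)  # removes a name, not the content
        with open(second, "rb") as f:
            surviving = f.read()
        return same_inode, link_count, surviving

if __name__ == "__main__":
    print(two_names_one_inode())
```

Removing one directory entry leaves the content reachable through the other, which is why neither name can be treated as "the" name of the file.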
>>>
>>>> and it is crucial to always know if you are talking *about* the file
>>>> and therefore use its name to, well, name it, or if you want to get
>>>> the file *itself* eg to render it, to transform it etc.
>>>
>>> Agreed.  RDF uses URIs for names, and URIs can trivially refer to
>>> dirents, so that's a good mapping.  RDF uses Literals for content, so
>>> we need to find a way of representing an inode as a Literal.  Here's
>>> the trick: the most convenient way of referring to an inode is via an
>>> existing dirent.
>>>
>>> So when referring to the dirent I like <file://dirent>.
>>> and when referring to the inode I like "dirent"^^<mulgara:dirent>
>>>
>>> and in doing so you "always know if you are talking *about* the file
>>> and therefore use its name to, well, name it, or if you want to get
>>> the file *itself* eg to render it, to transform it etc."
>>>
>>>> therefore i suggest to add another protocol-type beside "http" and
>>>> "file". let's name it "store". since we don't speak about files in
>>>> the filesystem nor about a file on the web but files in the datastore
>>>> i think that's justified, intuitive and precise. ("file://" is just
>>>> an inappropriate, misleading naming, since it refers to a protocol
>>>> but sounds like it would refer to an instance-type). and it's the
>>>> RDF-way: making the seemingly obvious explicit, so that we can do
>>>> something more with it.
>>>
>>> Always a good idea to avoid requiring URI introspection.  When
>>> comparing arbitrary URIs we really do not want to have to examine
>>> them to determine if one of them is actually a literal in disguise.
>>>
>>>> i'm a bit helpless about how to get files in and out of the store.
>>>> while calling for  <whateverprotocolcomesup://
>>>> thesameuniquepathagain/thatcertainfilename.xyz> might be enough to
>>>> pull them out, how to put them in in the first place? maybe now a
>>>> triple is justified, because we've got three act/or/s:
>>>
>>> My preference would be to follow the same approach we take for
>>> existing rdf nodes.  The node exists if it is referred to, and ceases
>>> to exist when we no longer refer to it.  That takes care of
>>> create/delete.
>>>
>>> Update is a misnomer - RDF doesn't have update.  If you
>>> change/transform the content in any way, what you get is not an
>>> 'updated' literal, but a completely different, and independent
>>> literal.  If you want the ability to refer to some update-transparent
>>> concept of 'updatable-file', you use a _name_ and specify that the
>>> name now refers to the new literal.
>>>
>>> <fileURI> <ns:refersTo> [ <ns:content> "dirent"^^<ns:dirent> :
>>>                 <ns:revision> "..timestamp.."^^<xsd:Timestamp> ]
>>>
>>> which, when the file at timestamp is replaced by file' at timestamp',
>>> becomes
>>>
>>> { <fileURI> <ns:refersTo> [ <ns:content> "dirent"^^<ns:dirent> :
>>>                 <ns:revision> "..timestamp.."^^<xsd:Timestamp> ]
>>>           : <ns:refersTo> [ <ns:content> "dirent'"^^<ns:dirent> :
>>>                 <ns:revision> "..timestamp'.."^^<xsd:Timestamp> ] }
>>>
>>> delete the obsoleted entry if you don't need it anymore and want
>>> mulgara to be able to reclaim the disk space.
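
The name-refers-to-revisions scheme above can be sketched in memory. This is an illustration under stated assumptions, not Mulgara code: `Revision` and `NameIndex` are hypothetical names, contents are plain strings standing in for the typed literal, and timestamps are assumed to be ISO 8601 strings in one timezone so that string order matches time order.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Revision:
    content: str    # stands in for the "dirent"^^<ns:dirent> literal
    timestamp: str  # assumed ISO 8601, so lexical order is time order

class NameIndex:
    """Maps a name (a URI string) to the set of revisions it refers to."""

    def __init__(self):
        self._refs = {}

    def refer(self, name, content, timestamp):
        # An "update" never changes a literal; it adds a new, independent one.
        self._refs.setdefault(name, set()).add(Revision(content, timestamp))

    def current(self, name):
        return max(self._refs[name], key=lambda r: r.timestamp)

    def prune(self, name):
        """Drop obsoleted revisions; return how many were removed."""
        latest = self.current(name)
        removed = len(self._refs[name]) - 1
        self._refs[name] = {latest}
        return removed

if __name__ == "__main__":
    index = NameIndex()
    index.refer("file:///doc", "dirent", "2007-06-20T05:49:00Z")
    index.refer("file:///doc", "dirent'", "2007-06-21T10:49:06Z")
    print(index.current("file:///doc"))
    print(index.prune("file:///doc"))
```

Pruning corresponds to deleting the obsoleted entry; until then, both revisions stay reachable through the same name.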
>>>
>>> Andrae
>>>
>>> --Andrae Muys
>>> andrae at netymon.com
>>> Principal Mulgara Consultant
>>> Netymon Pty Ltd
>>>
>>>
>>> _______________________________________________
>>> Mulgara-dev mailing list
>>> Mulgara-dev at mulgara.org
>>> http://mulgara.org/mailman/listinfo/mulgara-dev
>
>
>
> mailto:thomas at stray.net
> http://stray.net
>
>
>
>
> : accumulated wisdom
> . early optimization is the root of many evil [donald e. knuth]
> . if you've got a hammer every problem looks like a nail
> . the difference between theory and practice is always greater
>   in practice than it is in theory




