[Mulgara-dev] File based resolver

thomas thomas at stray.net
Thu Jun 21 10:20:10 UTC 2007


i thought about that too, but - i only know of one webdav-server that's 
open source and implemented in java and that's jakarta.apache.org/slide. 
slide seems to be pretty dormant, last release in 2004. and it doesn't 
implement precisely that part of the webdav specification that would give 
note to other applications if some file had changed. but that would be 
crucial to keep the store informed if other apps change the files (that has 
to be possible). i don't know how hard that is to implement but at the 
moment i'd rather try to implement some webdav-functionality within or on 
top of mulgara, since all the metadata that i need is already within 
mulgara and i "only" have to make it do something.

but i would be glad to be proven wrong. have you combined a rdf-store with 
a webdav server? could you give some detail, how you did it?

ciao
thomas


--On 21. Juni 2007 10:49:06 +0200 Leo Sauermann <leo.sauermann at dfki.de> 
wrote:

> I didn't follow the whole discussion, but from my experience I would say:
> putting files into an RDF store is ugly and will cause trouble.
>
> rather combine a webdav with the rdf store, I would say thats the
> cleanest solution.
> (really, only webdav is. http, http-put, http-get, ids, metadata, its
> clean).
>
> best
> Leo
>
> It was Andrae Muys who said at the right time 20.06.2007 05:49 the
> following words:
>>
>> On 19/06/2007, at 9:44 PM, thomas wrote:
>>> first i'd like to check how things fit with the RDF model. RDF
>>> accepts as objects
>>>     * literals     (as string, xml, etc)
>>>     * URIs         (URLs and non-URLs)
>>>
>>> these are both very straight object types. the literal is just
>>> itself, it's the content. especially there's no difference between
>>> name and content. you could say, it's both at once. you could even
>>> say, there is no name, just content and you call it by it's content.
>>> the URI is exactly just a name. it represents something (be it a
>>> resource or a concept - doesn't matter here) that's outside the
>>> store. so when you've got an URI in front of you, you know
>>> immediatly: this is just a reference, a name. when you've got a
>>> literal you know: this is the content and that's all there is to it.
>>
>> Two comments (clarifications not corrections).  First, a literal is
>> not a name - it is the content.  URI's are names, Literals are
>> objects, (and Blank-nodes are existentials).  Second, URI's are
>> opaque, and may or may not be dereferencable.
>>
>>> now a file is clearly both, name and content, but unlike the literal
>>> they don't fall into one. and unlike the URI they are not clearly
>>> separated. so files have to be treated differently.
>>> take an image file for example: it has a name (eg "image.png") and by
>>> calling it by it's name you talk about it, transform it, move it
>>> around etc. the actual content, the picture, is a piece of data that
>>> you don't ever want to write out byte by byte. you want it to render
>>> only under some welldefined circumstances. so in the case of a file
>>> name and content are two distinct parts of the same object.
>>
>> Actually no they're not.  A file does not have a name - to assume that
>> is to fall afoul of the unique-name-assumption, which doesn't hold
>> with files.  Not all files have a name (see tmpfile(3)), some files
>> have multiple names (see link(2) and symlink(2)).
>>
>> What we think of as a 'file' doesn't actually exist - it's just a
>> convenient fiction that works 99.9% of the time.  Unfortunately this
>> is the 0.1%.  What we actually have is an inode and 0 or more
>> directory-entries.
>>
>> Each inode is _the_ content.
>> Each dirent is _a_ name.
>>
>>> and it is crucial to always know if you are talking *about* the file
>>> and therefor use it's name to, well, name it, or if you want to get
>>> the file *itself* eg to render it, to transform it etc.
>>
>> Agreed.  RDF uses URIs for names, and URIs can trivially refer to
>> dirent's, so that's a good mapping.  RDF uses Literals for content, so
>> we need to find a way of representing an inode as a Literal.  Here's
>> the trick, the most convenient way of referring to an inode is via an
>> existing dirent.
>>
>> So when referring to the dirent I like <file://dirent>.
>> and when referring to the inode I like "dirent"^^<mulgara:dirent>
>>
>> and in doing so you "always know if you are talking *about* the file
>> and therefor use it's name to, well, name it, or if you want to get
>> the file *itself* eg to render it, to transform it etc."
>>
>>> therefor i suggest to add another protocol-type beside "http" and
>>> "file". let's name it "store". since we don't speak about files in
>>> the filesystem nor about a file on the web but files in the datastore
>>> i think that's justified, intuitive and precise. ("file://" is just
>>> an unappropriate, misleading naming, since it refers to a protocol
>>> but souds like it would refer to an instance-type). and it's the
>>> RDF-way: making the seemingly obvious explicit, so that we can do
>>> something more with it.
>>
>> Always a good idea to avoid requiring URI introspection.  When
>> comparing arbitrary URI's we really do not want to have to examine
>> them to determine if one of them is actually a literal in disguise.
>>
>>> i'm a bit helpless about how to get files in and out the store. while
>>> calling for  <whateverprotocolcomesup://
>>> thesameuniquepathagain/thatcertainfilename.xyz> might be enough to
>>> pull them out, how to put them in in the first place? maybe now a
>>> triple is justified, because we've got three act/or/s:
>>
>> My preference would be to follow the same approach we take for
>> existing rdf nodes.  The node exists if it is referred to, and ceases
>> to exist when we no longer refer to it.  That takes care of
>> create/delete.
>>
>> Update is a misnomer - RDF doesn't have update.  If you
>> change/transform the content in anyway, what you get is not an
>> 'updated' literal, but a completely different, and independent
>> literal.  If you want the ability to refer to some update-transparent
>> concept of 'updatable-file', you use a _name_ and specify that the
>> name now refers to the new literal.
>>
>> <fileURI> <ns:refersTo> [ <ns:content> "dirent"^^<ns:dirent> :
>> <ns:revision> "..timestamp.."^^<xsd:Timestamp> ]
>>
>> which when file at timestamp->file'@timestamp' becomes
>>
>> { <fileURI> <ns:refersTo> [ <ns:content> "dirent"^^<ns:dirent> :
>> <ns:revision> "..timestamp.."^^<xsd:Timestamp> ]
>>           : <ns:refersTo> [ <ns:content> "dirent'"^^<ns:dirent> :
>> <ns:revision> "..timestamp'.."^^<xsd:Timestamp> ] }
>>
>> delete the obsoleted entry if you don't need it anymore and want
>> mulgara to be able to reclaim the disk space.
>>
>> Andrae
>>
>> --Andrae Muys
>> andrae at netymon.com
>> Principal Mulgara Consultant
>> Netymon Pty Ltd
>>
>>
>> _______________________________________________
>> Mulgara-dev mailing list
>> Mulgara-dev at mulgara.org
>> http://mulgara.org/mailman/listinfo/mulgara-dev



mailto:thomas at stray.net
http://stray.net




: accumulated wisdom
. early optimization is the root of many evil [donald e. knuth]
. if you've got a hammer every problem looks like a nail
. the difference between theory and practice is always greater
  in practice than it is in theory



More information about the Mulgara-dev mailing list