[Mulgara-general] Is storing images as blobs feasible?

Paul Gearon gearon at ieee.org
Fri Jan 15 02:56:28 UTC 2010


Since this is a slightly different question, I'll answer again.

On Wed, Jan 13, 2010 at 6:02 AM, David Legg
<david.legg at searchevent.co.uk> wrote:
> Hi,
>
> Would I be totally mad to consider storing binary literals (such as
> images) in Mulgara?  Has anyone tried?

Only in testing, but it worked fine. :-)

> I could just store a URI reference and look up the image from another
> source but it 'feels right' to treat arbitrary binary data in the same
> way you would if it were a string or a date or an integer etc.

That's what base64Binary or hexBinary encodings are for... I think. :-)
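
As a rough illustration (this is not Mulgara code), here is how raw
bytes map onto the two XSD lexical forms. The helper names
to_base64_literal and to_hex_literal are made up for this sketch, and
the output uses the usual "lexical"^^<datatype> typed-literal syntax:

```python
import base64

# Namespace of the XML Schema datatypes.
XSD = "http://www.w3.org/2001/XMLSchema#"


def to_base64_literal(data: bytes) -> str:
    """Render raw bytes as an xsd:base64Binary typed literal (hypothetical helper)."""
    lexical = base64.b64encode(data).decode("ascii")
    return f'"{lexical}"^^<{XSD}base64Binary>'


def to_hex_literal(data: bytes) -> str:
    """Render raw bytes as an xsd:hexBinary typed literal (hypothetical helper)."""
    lexical = data.hex().upper()
    return f'"{lexical}"^^<{XSD}hexBinary>'


# Four sample bytes standing in for image data.
sample = bytes([0xDE, 0xAD, 0xBE, 0xEF])
print(to_base64_literal(sample))
print(to_hex_literal(sample))
```

For the four bytes DE AD BE EF this gives "3q2+7w==" as base64Binary
and "DEADBEEF" as hexBinary. Base64 costs about 4 characters per 3
bytes, while hex costs 2 characters per byte.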

> However, from a performance point of view I've no idea how large pieces
> of raw data affect the query speed or memory footprint.

It shouldn't have too large an effect on queries, with a couple of
exceptions. If you have a WHERE clause that says something like:
  WHERE { ?x :foo "deadbeef........"^^xsd:base64Binary }

... then a few things will happen.

First, your entire literal will have to make it through parsing.
JavaCC is fast, but that will be pushing it. This can be avoided by
building queries with the API - this is easy, but not documented. Ask
me how if you like and I'll put the resulting email into the wiki.
:-)

Second, if you have a lot of similar but distinct literals, then
the comparisons could take a while. Comparisons terminate as soon as
they find a difference, but if you have to get through a few MBs
before the literals diverge, then query execution will slow down.
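
To illustrate the early-termination point, here is a minimal sketch of
a byte-wise comparison that stops at the first difference. It is not
Mulgara's actual comparator, and first_difference is a hypothetical
name; it only shows why a long shared prefix is the expensive case:

```python
def first_difference(a: bytes, b: bytes) -> int:
    """Return the index of the first differing byte, or -1 if a == b.

    If one value is a strict prefix of the other, the index returned is
    the length of the shorter value.
    """
    limit = min(len(a), len(b))
    for i in range(limit):
        if a[i] != b[i]:
            return i  # stop as soon as the values diverge
    return -1 if len(a) == len(b) else limit


# Two literals that differ in the first byte: one step is enough.
quick = first_difference(b"Axxxx", b"Bxxxx")

# Two literals sharing a 1 MB prefix: every shared byte is visited first.
prefix = b"\x00" * (1024 * 1024)
slow = first_difference(prefix + b"A", prefix + b"B")

print(quick, slow)
```

The second call has to walk the whole shared megabyte before it can
answer, which is the slow case for near-duplicate images.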

Finally, creating literals this size will take up memory. The ones
already stored will use secondary storage and won't be taking up
memory, but if you put one of these monsters into a query then it has
to use RAM. I could possibly try to find a way to make oversized
transient literals use secondary storage, but the performance impact
would be huge, and it would add a minor cost to every normal literal,
since each one would have to be tested for size (though maybe that
effect would be negligible).
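
To put rough numbers on this, the sketch below estimates how many
characters the lexical form of a binary literal occupies in a query
string. The function names are hypothetical, and the figures cover only
the text of the literal itself, not whatever overhead the parser or the
query engine adds on top:

```python
def base64_length(num_bytes: int) -> int:
    """Characters in the padded base64 form of num_bytes bytes."""
    # Every group of up to 3 bytes becomes 4 characters.
    return 4 * ((num_bytes + 2) // 3)


def hex_length(num_bytes: int) -> int:
    """Characters in the hex form of num_bytes bytes."""
    # Every byte becomes 2 hex digits.
    return 2 * num_bytes


# A hypothetical 3 MB image embedded directly in a query.
image_bytes = 3 * 1024 * 1024
print(base64_length(image_bytes))
print(hex_length(image_bytes))
```

So a 3 MB image turns into roughly 4 MB of query text as base64Binary
and 6 MB as hexBinary, all of which has to be parsed and held in RAM.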

So putting these monsters into a query might just be prohibitive, but
queries that look for them and return them should be just fine.

Regards,
Paul


