[Mulgara-dev] Update Patterns

Chris Wilper cwilper at cs.cornell.edu
Wed Dec 13 15:19:11 UTC 2006


Hi Andrae,
 
> What sort of insert/deletes are people doing?
> Are deletes/inserts normally paired in a single transaction?
> How many statements are you inserting/deleting in a 
> single update?  Can they be categorised?
> If so what is the frequency of the different categories?

Initially, we add many batches of around 50 to 100 triples at a 
time.  Most (say 75%) of the triples represent literal properties.
Of those, probably a third are datatyped.  The majority
(say 75%) of our datatyped literals are xsd:dateTimes.
As for URIs used in triples, an off-the-cuff guess is that 75%
of them are distinct.

Update operations are smaller: we usually need to update only
5-20 triples at a time, and we do that with a series of deletes 
and adds in a single transaction.

> What is the 'shape' of the data you insert?
>   (ie. many mostly independent sub-graphs describing different 
> instances; or fewer instances with lots of interconnections and 
> object-reference properties?)

Mostly independent sub-graphs, each with diameter 2 and consisting
of a total of 50-100 triples.  Note that there are definitely
connections between the sub-graphs; they are just relatively
few.

> Is any significant % of your literals replicated?
> What % of the data are Blank-nodes?

0% are BNodes, thankfully.  We don't do triplestore-to-triplestore
replication right now...but BNodes would appear to complicate the 
problem.

> What is the average length of a URI?

Average?  Probably 60-70 characters.

> What is the average length of a Literal?

About 50 characters, I would guess.
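
A rough sketch, in Python, of how one sub-graph with the profile above
might be generated for a performance test.  Everything named here is an
assumption for illustration: the example.org URIs, the helper names
(make_uri, make_literal, make_subgraph), the 20-predicate vocabulary,
and xsd:int as the stand-in for datatyped literals that are not
dateTimes.  "Diameter 2" is taken to mean a simple star around one
subject, and neither the share of distinct URIs nor the few links
between sub-graphs is modelled.

```python
import random
import string
from datetime import datetime, timedelta

XSD = "http://www.w3.org/2001/XMLSchema#"


def make_uri(rng, kind, n):
    """Hypothetical example.org URI, padded to roughly 60-70 characters."""
    base = f"http://example.org/test/{kind}/{n}/"
    target = rng.randint(60, 70)
    pad = "".join(rng.choice(string.ascii_lowercase)
                  for _ in range(target - len(base)))
    return base + pad


def make_literal(rng):
    """Return (lexical form, datatype URI or None)."""
    if rng.random() >= 1 / 3:
        # About two thirds of literals are plain, around 50 characters long.
        text = "".join(rng.choice(string.ascii_lowercase + " ")
                       for _ in range(50))
        return text, None
    if rng.random() < 0.75:
        # About 75% of datatyped literals are xsd:dateTime values.
        stamp = datetime(2006, 1, 1) + timedelta(
            seconds=rng.randrange(365 * 86400))
        return stamp.isoformat(), XSD + "dateTime"
    # Assumed stand-in for the remaining datatyped literals.
    return str(rng.randrange(10**6)), XSD + "int"


def make_subgraph(rng, index, predicates):
    """One star-shaped sub-graph of 50-100 triples, no blank nodes."""
    subject = make_uri(rng, "subject", index)
    triples = []
    for _ in range(rng.randint(50, 100)):
        predicate = rng.choice(predicates)
        if rng.random() < 0.75:
            # About 75% of triples carry literal properties.
            obj = ("literal",) + make_literal(rng)
        else:
            obj = ("uri", make_uri(rng, "object", rng.randrange(10**6)))
        triples.append((subject, predicate, obj))
    return triples


rng = random.Random(2006)  # fixed seed so test runs are repeatable
predicates = [make_uri(rng, "predicate", i) for i in range(20)]
batch = make_subgraph(rng, 0, predicates)
```

Calling make_subgraph repeatedly with increasing index values would give
the stream of insert batches; an update test could then delete and
re-add 5-20 triples from one batch inside a single transaction.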

Thanks,
Chris

________________________________

From: mulgara-dev-bounces at mulgara.org on behalf of Andrae Muys
Sent: Wed 12/13/2006 7:40 AM
To: mulgara-dev at mulgara.org
Subject: [Mulgara-dev] Update Patterns




I need some data regarding the sort of updates I need to be keeping 
in mind when making design tradeoffs in XA2.

What sort of insert/deletes are people doing?
Are deletes/inserts normally paired in a single transaction?
How many statements are you inserting/deleting in a single update?  Can 
they be categorised?
If so what is the frequency of the different categories?
What is the 'shape' of the data you insert?
   (ie. many mostly independent sub-graphs describing different 
instances; or fewer instances with lots of interconnections and 
object-reference properties?)
What % of data are URIs?
What % of those URIs are distinct?
What % of the data are Literals?
Is any significant % of your literals replicated?
What % of the data are Blank-nodes?
What is the average length of a URI?
What is the average length of a Literal?

I have some ideas, but I need to write some performance tests to test 
the scalability of the various components of XA2 as I build them, and 
the makeup of those tests will be determined by the answers to these 
questions. If I know a particular type/size of update is common, if I 
know certain 'shapes' are common, then I can make sure those shapes 
are simulated in the tests.

Thanks,

Andrae

--
Andrae Muys
andrae at netymon.com
Principal Mulgara Consultant
Netymon Pty Ltd


_______________________________________________
Mulgara-dev mailing list
Mulgara-dev at mulgara.org
http://mulgara.org/mailman/listinfo/mulgara-dev

