[Mulgara-general] Transactions

Tue Jun 1 19:16:12 UTC 2010

Thanks much for the prompt reply.

Paul Gearon <gearon at ieee.org> writes:

> On Sun, May 30, 2010 at 5:09 PM, David <dsiegel at acm.org> wrote:

> Well, for the moment Mulgara does not do concurrent updates.

I'm far more concerned about "logical" concurrency, and the effect on
database consistency, than I am about write concurrency. Improving
performance would be great, but consistency's a bedrock requirement.

Say two processes, A and B, start at the same time, and each reads from
Mulgara, and then based on what it read, prepares a transaction of
inserts and deletes. Both send their transactions at the same time.
Mulgara will serialize their execution, running either A then B,
or B then A.

Both transactions, though, are based on the same view of the database
(they both did their reads from the same state of the database), and the
transaction that Mulgara executes second has no direct way to ensure
that the assumptions on which it was based were not invalidated by the
transaction that ran first (in my example, that the first transaction
didn't write a "Joe dateOfBirth ?x" statement).

So, without the kind of workaround I mentioned in my message, there
seems to be no way to guarantee the consistency of the database.
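
To make the scenario concrete, here is a toy sketch in Python. It uses
a plain in-memory set of triples, not Mulgara or any of its APIs; the
helper names (read_birth_dates, prepare_insert, apply_tx) and the dates
are made up for illustration. Both transactions are prepared from the
same snapshot and then applied one after the other, as a serializing
server would do.

```python
# Toy in-memory triple store. This is NOT Mulgara's API; every name
# here is hypothetical and exists only to illustrate the race.
store = set()

def read_birth_dates(triples, subject):
    """Return all dateOfBirth objects recorded for a subject."""
    return {o for (s, p, o) in triples if s == subject and p == "dateOfBirth"}

def prepare_insert(snapshot, subject, date):
    """Client-side step: decide what to write based on what was read."""
    if read_birth_dates(snapshot, subject):
        return []  # a birth date already exists, so write nothing
    return [(subject, "dateOfBirth", date)]

def apply_tx(triples, inserts):
    """Server-side step: apply a prepared transaction blindly."""
    triples.update(inserts)

# A and B both read the same (empty) state of the database.
snapshot = frozenset(store)
tx_a = prepare_insert(snapshot, "Joe", "1970-01-01")
tx_b = prepare_insert(snapshot, "Joe", "1980-02-02")

# The server serializes the writes: A then B.
apply_tx(store, tx_a)
apply_tx(store, tx_b)

birth_dates = read_birth_dates(store, "Joe")
print(sorted(birth_dates))  # Joe now has two birth dates
```

Each transaction is valid against the state it was prepared from, yet
the serialized result violates the "one birth date per person"
assumption that both clients relied on.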

> This interacts with the REST interface because REST requires a group
> of operations to be sent in a *single* request, but we have no
> operations that say things like, "If the result of this query is xxx
> then do yyyy, else do zzzz". The only other way to do things would be
> to start a transaction, do queries, and use the results to determine
> what writes you want to do. Unfortunately, that means that you have to
> send multiple requests, meaning you need to keep the transaction for
> an arbitrary period of time. Since we only support a single write
> transaction at a time, that means that the database would get totally
> locked from all access for some arbitrary period of time. Even if
> that's only a second or two, that's not really acceptable.

Absolutely agreed.

>> I've got an idea, but I think it's blocked by the current split
>> between SPARQL and TQL.
>
> Well, both translate to the same algebra, so there's not *that* big a
> split between them. The main thing is that they each have syntax for
> accessing their own functionality. For any features that are missing
> from one language but exist in the other, it's feasible to introduce
> syntax to the first language to access that feature.

Great, I hoped that was the case.  I emphasized the split because I was
looking for the simplest way I could proceed now, while minimizing
forking Mulgara -- that's why I was hoping to exploit existing extension
support.

> Loading external libraries is not standard in SPARQL, so I wouldn't
> say that it's a "reasonable SPARQL 1.1 approach". It's also a huge
> hack, but I think it's cool simply because I think it would work fine.
>  :-)

Thanks for validating that for me.

>> Second, could I do this now with TQL?  Is there a way of adding
>> new functions to TQL select?
>
> Sure, it's open source.  :-)

Well, duh :-).  I meant without forking the project!

> Given how flexible expressions are in SPARQL, then I'm thinking it
> might be nice to add it as a feature to TQL. If we can put them in,
> then they'd also support externally loaded libraries, just like SPARQL
> does.

At this point, my interest in TQL is purely that it supports writes now,
and SPARQL doesn't.  I'm surprised that you're interested in extending
TQL.

Are there features that TQL provides that would warrant using it rather
than SPARQL 1.1 (once available, of course)?

Are you intending to extend TQL going forward, or just provide it for
backward compatibility?

>> Third, do you have a better idea?
>
> The only things I can think of are:
>
> 1. Use the Java API. You can do anything here, but it's not a flexible
> solution.

Right. Not a great choice for me.

> 2. Implement multiple concurrent write transactions. We can then have
> long-running transactions, allowing us to have multiple REST
> operations operating within a single transaction (transactions *can*
> be modelled with REST). Unfortunately, this will be a major
> development effort, and it can't happen soon.

As I think I demonstrated above, concurrent writes wouldn't really
address the problem. The real issue is the interaction between reads and
writes across transactions. Relational databases traditionally handle
this via read-locking or, at the application level, via optimistic
concurrency control, which is the approach I've been exploring.
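
For reference, one generic form of optimistic concurrency control can
be sketched as follows. This is an illustration only, not necessarily
the workaround from my earlier message and not anything Mulgara
provides: the VersionedStore class, its version counter, and the
commit signature are all assumptions. The idea is that every commit
names the version it was prepared against, and the store rejects the
commit if the version has moved on.

```python
class VersionedStore:
    """Hypothetical store that bumps a version number on every commit."""

    def __init__(self):
        self.triples = set()
        self.version = 0

    def snapshot(self):
        """Return the current version together with a frozen copy of the data."""
        return self.version, frozenset(self.triples)

    def commit(self, read_version, inserts=(), deletes=()):
        """Apply the writes only if nothing was committed since read_version."""
        if read_version != self.version:
            return False  # stale read: the caller must re-read and retry
        self.triples.difference_update(deletes)
        self.triples.update(inserts)
        self.version += 1
        return True

def has_birth_date(triples, subject):
    return any(s == subject and p == "dateOfBirth" for (s, p, o) in triples)

db = VersionedStore()

# A and B read the same state (version 0) and prepare their writes.
version_a, view_a = db.snapshot()
version_b, view_b = db.snapshot()

ok_a = db.commit(version_a, inserts=[("Joe", "dateOfBirth", "1970-01-01")])
ok_b = db.commit(version_b, inserts=[("Joe", "dateOfBirth", "1980-02-02")])

# B was rejected, so it re-reads and finds that its assumption no longer holds.
version_b2, view_b2 = db.snapshot()
b_should_write = not has_birth_date(view_b2, "Joe")
```

A single global version is the crudest possible check, since any
unrelated write forces a retry. A finer-grained check would validate
only the statements the transaction actually read.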

> 3. Introduce an operation that can occur at the server end to control
> execution path. e.g. if/then/else. This would be cheap to implement,
> but would be non-standard.

Raising the obvious question:  Shouldn't the standard address this
issue?

Adding an if/then/else that could evaluate the results of SELECTs to the
standard would be ideal. Surely I'm not the only person with this issue.
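
As a rough illustration of what such a server-side conditional might
look like, here is a toy sketch in Python. Nothing in it is real
Mulgara, TQL, or SPARQL syntax: the Server class and its if_then_else
method are hypothetical, and the lock merely stands in for the single
write transaction. The point is that the condition and the write are
evaluated together on the server, within one request.

```python
import threading

class Server:
    """Hypothetical server that runs a conditional update atomically."""

    def __init__(self):
        self.triples = set()
        # Stand-in for "only a single write transaction at a time".
        self._write_lock = threading.Lock()

    def if_then_else(self, condition, then_inserts, else_inserts=()):
        """Evaluate condition and apply one branch inside one write transaction."""
        with self._write_lock:
            taken = bool(condition(self.triples))
            self.triples.update(then_inserts if taken else else_inserts)
            return taken

def joe_has_no_birth_date(triples):
    return not any(s == "Joe" and p == "dateOfBirth" for (s, p, o) in triples)

server = Server()

# Two clients send the same kind of request; the server serializes them.
first = server.if_then_else(
    joe_has_no_birth_date, [("Joe", "dateOfBirth", "1970-01-01")])
second = server.if_then_else(
    joe_has_no_birth_date, [("Joe", "dateOfBirth", "1980-02-02")])
```

Because the check runs against the state at execution time rather than
the state the client last read, the second request sees the first
one's write and takes the empty else branch.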

Thanks,
-dms