[Mulgara-dev] Query cancellation

Alex Hall alexhall at revelytix.com
Mon Jan 18 22:37:58 UTC 2010


Hi Paul,

On 1/18/2010 3:47 PM, Paul Gearon wrote:
> Hi Alex,
> 
> On Mon, Jan 18, 2010 at 3:21 PM, Alex Hall <alexhall at revelytix.com> wrote:
>> I brought up this issue a few months back, and it's time to revisit it.
>> This is going to be a long email, so please bear with me...
>>
>> We've recently added a graphical query builder to our application, which
>> allows users to write and execute arbitrary SPARQL queries against an
>> RDF graph stored in Mulgara. What we've found is that general users are
>> quite adept at writing queries that, for a variety of reasons (such as
>> unintended cross-products), will bring the system to its knees. We need
>> a way of canceling such misbehaved queries and releasing system
>> resources on the server in order to mitigate this situation.
> 
> Makes sense.
> 
> Of course, it would be nice if the server didn't slow down so much in
> the first place. However, no matter how good the optimizers get, it
> will always be possible to do something that will slow a server down.

True, it would be nice if the server didn't slow down so much. But even
a mature technology such as JDBC has found it necessary to provide the
Statement.cancel() method. As you say, no matter how good the optimizers
get, there will always be the pathological case.

[snip]
>> A more likely approach is illustrated in a small open-source utility
>> library that I found called Interruptible RMI
>> (https://interruptiblermi.dev.java.net/). This library uses custom
>> thread and socket factories, and overrides Thread.interrupt() to close
>> the underlying socket if the thread is blocked in an RMI call. It's a
>> somewhat blunt approach, but also simple and effective.
> 
> This is interesting. I don't know if it will extend to the REST
> interfaces, but it's an important step all the same. In any case, the
> required support at the server side will be useful for both
> interfaces.

It would be nice if the servlet container could detect that the client
has abandoned a request and notify you, but I have no idea whether or
not that is the case.

[snip]
>> One caveat regarding this approach is that the thread executing the
>> client call must have been created using the custom thread factory. For
>> that reason, I'm envisioning adding some sort of client proxy Session
>> implementation that would invoke operations in separate threads created
>> using the custom factory in order to hide that detail from client code.
>> Of course, clients could use the thread factory directly if they so choose.
> 
> You should be able to hide most of that in the Connection factory,
> shouldn't you? The idea of this factory was to make establishing RMI
> connections easier.

Sure, the ConnectionFactory would be one likely location to hide that
sort of thing. No matter where it gets implemented, I imagine that there
will need to be some sort of configuration option that allows the client
to decide whether to have the overhead of executing operations in
interruptible delegate threads. Unfortunately the only way I know of to
configure anything in the client libraries is going to be with Java
system properties.
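
As a rough sketch of what that switch might look like (the property name
"mulgara.client.interruptible" and the OperationRunner class are made up
for illustration, and any ThreadFactory stands in here for the one
supplied by Interruptible RMI):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

/** Hypothetical helper; not part of the Mulgara client libraries. */
class OperationRunner {
    /** Hypothetical property name, chosen for illustration only. */
    static final String INTERRUPTIBLE_PROPERTY = "mulgara.client.interruptible";

    // Null when the client has not opted in to delegate threads.
    private final ExecutorService delegatePool;

    OperationRunner(ThreadFactory factory) {
        // Boolean.getBoolean reads the named system property; unset means false.
        this.delegatePool = Boolean.getBoolean(INTERRUPTIBLE_PROPERTY)
                ? Executors.newCachedThreadPool(factory)
                : null;
    }

    /** Runs the operation in the caller's thread, or in a delegate thread if enabled. */
    <T> T run(Callable<T> operation) throws Exception {
        if (delegatePool == null) {
            return operation.call();
        }
        // Failures inside the operation surface as ExecutionException.
        return delegatePool.submit(operation).get();
    }

    void shutdown() {
        if (delegatePool != null) {
            delegatePool.shutdownNow();
        }
    }
}
```

With the property unset, clients pay no extra cost; starting the JVM with
-Dmulgara.client.interruptible=true would route every call through a
thread created by the supplied factory.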

>> That brings me to my next issue. The ability to cancel an operation is
>> an important feature to have when it comes to building
>> production-quality systems. If we're going to add this feature, then I
>> think it would be a good idea to advertise its availability and make it
>> more accessible. In the case of the Interruptible RMI library,
>> overriding Thread.interrupt() is a low-level implementation detail and
>> not really suitable as a general-purpose interface.
> 
> I think that EVERY feature in an open source project should be
> advertised. The only reason that doesn't happen now is lack of
> resources for writing documentation. :-(
> 
> So, yes. By all means advertise it!  :-)

Well, what I mean by "advertise" in this case is adding the feature as a
method of a higher-level API, to make it easier for developers to
discover. But I see your point. :-)

>> Since I'm working with the Connection API, I would suggest extending the
>> Command interface with a cancel() method. The general contract would be,
>> if Command.execute() is running in one thread and Command.cancel() is
>> called in another thread, then execute() will return immediately (with
>> some sort of exception) and any server resources will be released as
>> soon as is reasonably possible. Using the Interruptible RMI library, it
>> would be fairly straightforward to make an abstract command class that
>> will register the thread that invokes execute() and then interrupt that
>> thread in cancel().
> 
> I believe that a Connection only allows you to run a single Command at
> a time on it, right? (you can certainly only run a single write
> command, but I never gave much thought to multiple queries). If so,
> then I'd prefer to see cancel() on the Connection (commands get run on
> a connection anyway, so Command.cancel() can easily just pass this
> onto a Connection that they're currently running on).

I don't think this is the case. The Connection factory is designed so
that multiple threads attempting to open connections to the same server
would get different connections, but the motivation for this was to
avoid having one worker close a connection still in use by another
worker. Once you have a Connection, though, there is certainly nothing
in the Connection class to prevent it from being used to invoke multiple
commands concurrently in multiple threads.

Behind the scenes, what happens when you get a new Connection is that
the server will create a new DatabaseSession, wrap it in a
RemoteSession, export the session, and return a stub to the client. The
Connection is just a thin wrapper around the remote session stub;
concurrent commands on the Connection will result in the remote
DatabaseSession being accessed concurrently by multiple RMI threads. If
anything is going to prevent concurrent operations from running at once,
it is going to be the transaction factory because the DatabaseSession at
first glance appears to support concurrent access.

Given the fact that it is (at least theoretically) possible to have
concurrent Commands being run on a single Connection, I would prefer to
have cancel() implemented at the Command level.
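
To make the proposed contract concrete, here is a minimal sketch. The
class and exception names are hypothetical, not existing Mulgara types,
and doExecute() is assumed to respond to Thread.interrupt() by throwing
InterruptedException. A plain RMI call blocked on a socket does not do
that by itself, which is the gap the Interruptible RMI library is meant
to fill.

```java
/** Hypothetical exception type; not part of the Mulgara API. */
class CancelledException extends Exception {
    CancelledException(String message, Throwable cause) {
        super(message, cause);
    }
}

/** Hypothetical base class sketching the execute()/cancel() contract. */
abstract class CancellableCommand<T> {
    // Thread currently inside execute(), or null when idle. Guarded by "this".
    private Thread runner;

    /** The blocking work; stands in for the remote call (assumed interruptible). */
    protected abstract T doExecute() throws InterruptedException;

    public final T execute() throws CancelledException {
        synchronized (this) {
            if (runner != null) {
                throw new IllegalStateException("command is already executing");
            }
            runner = Thread.currentThread(); // register the executing thread
        }
        try {
            return doExecute();
        } catch (InterruptedException e) {
            throw new CancelledException("command was cancelled", e);
        } finally {
            synchronized (this) {
                runner = null; // from here on cancel() is a no-op
            }
        }
    }

    /** Safe to call from any thread; does nothing if the command is not running. */
    public synchronized void cancel() {
        if (runner != null) {
            runner.interrupt();
        }
    }
}
```

Because each command tracks its own thread, two commands running
concurrently on one Connection can be cancelled independently. One
caveat of the sketch: a cancel() that arrives just as doExecute()
returns can leave the caller's interrupt flag set.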

> The reason I'd like to see Connection.close() instead of
> Command.close() is because Connections have a more global scope
> (they're cached for a start), while Commands are (currently) invisible
> outside of their current context. That makes them harder to pick up
> from another thread to call cancel() on them. Conversely, since
> Connections are cached and indexed by server, you can ask for the list
> of Connections to a given server, and figure out which one you want to
> cancel() the current operation on.

There is already a Connection.close() method that will just release it
back to the factory, keeping the underlying Session open for re-use.
There is also a Connection.dispose() method that will destroy the
underlying Session, which will in turn cause the DatabaseSession on the
server to immediately abort all associated transactions. However, I
don't think that aborting a transaction alone will in any way cause the
operation associated with that transaction to be interrupted. You could
probably extend that to actually cancel the current operations, but
that's beyond the scope of what I'm trying to do. I'm only trying to
cancel one operation at a time.

>> Finally, there is the issue of how to terminate the server processing to
>> release resources. The general problem here is, there is no way to
>> gracefully interrupt processing if the server isn't expecting to be
>> interrupted to begin with. This suggests that the server process (in
>> this case, the query execution thread) needs to periodically check
>> whether the request was canceled and respond accordingly. The "how" part
>> of detecting a cancellation is simple enough, at least with the
>> Interruptible RMI library; there is a utility method available for a
>> server to check whether its underlying socket is still alive.
>>
>> As far as when to do this detection, that's going to be a bit of a
>> balancing act. For maximum responsiveness, you would need to perform
>> this check very frequently, which would mean lots of code modifications.
>> Ideally, there will be some central control point where you can do the
>> check on a regular basis with minimal code changes. For the Mulgara
>> query engine, the most likely candidate seems to be the
>> ConstraintResolutionHandler and GraphResolutionHandler dispatch tables
>> in the ConstraintOperations class. I think a good first step would be
>> adding a check in here to see if the request has been canceled, and if
>> so throw a QueryException which will quickly unwind the call stack,
>> close any open resources, and roll back the transaction.
> 
> Yes, it needs to be central. I'll trust you on picking the place for
> it. Unfortunately, there are all sorts of places where it will need to
> be tested, though sort() is the most obvious.

Yes, that occurred to me as well. I'll have to do some investigation to
determine the best place within sort() to cancel the processing, but
once we have it working in one place it should be easy enough to
reproduce it elsewhere.
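
For reference, the kind of check I have in mind at the dispatch point
looks roughly like the sketch below. Everything in it is a stand-in:
Resolver is not the ConstraintOperations class, QueryCancelledException
is a local substitute for QueryException, and the BooleanSupplier takes
the place of the Interruptible RMI call that reports whether the client
socket is still alive.

```java
import java.util.List;
import java.util.function.BooleanSupplier;

/** Local stand-in for the exception that would unwind the query. */
class QueryCancelledException extends Exception {
    QueryCancelledException(String message) {
        super(message);
    }
}

/** Hypothetical central dispatch point through which every constraint passes. */
class Resolver {
    // Returns true once the request has been cancelled (e.g. client socket closed).
    private final BooleanSupplier cancelled;

    Resolver(BooleanSupplier cancelled) {
        this.cancelled = cancelled;
    }

    /** Resolves each constraint in turn, checking for cancellation before each one. */
    int resolveAll(List<String> constraints) throws QueryCancelledException {
        int resolved = 0;
        for (String constraint : constraints) {
            if (cancelled.getAsBoolean()) {
                // Throwing here lets the caller close resources and roll back.
                throw new QueryCancelledException(
                        "cancelled before resolving " + constraint);
            }
            resolved++; // the real resolution work would happen here
        }
        return resolved;
    }
}
```

The same check could later be dropped into a long-running loop such as
the one in sort(), since all it needs is access to the supplier.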

Regards,
Alex


