[Mulgara-dev] ConnectionFactory caching strategy

Thu May 8 19:00:26 UTC 2008

Caveat: I had an IM discussion with Alex about this, and asked him to
repeat it on the mailing list. So I apologize in advance if I sound
like I'm saying something out of context.

On Thu, May 8, 2008 at 1:28 PM, Alex Hall <alexhall at revelytix.com> wrote:
> I've been making extensive use of the Connection API of late, and have
> encountered some issues that make me think it's time to revisit the
> caching strategy used by the ConnectionFactory class.  The strategy used
> right now is straightforward: the factory maintains a map of server URI
> to Connection object, and the first time a connection is retrieved for
> that particular server it is added to the map.  (A similar cache is
> maintained which maps Session to Connection, but a client that is
> retrieving Connections via a Session object has already taken on the
> responsibility of Session management so I think this falls outside the
> scope of this discussion).  The next time a Connection is retrieved for
> that same server URI, the cached Connection is returned.  This approach
> has the advantage of avoiding the overhead of setting up a new RMI
> connection each time.  However, I have encountered the following
> specific issues with this implementation:

The RMI overhead was the main reason I did this. Connections are
supposed to provide an abstraction for access, so other
implementations should not have the same cost. For example, if we have
a REST-based HTTP Connection in the future, we won't want to cache it.

> 1.  If one client gets a connection from the factory using the server
> URI, operates on the connection, and closes it, then the next client to
> get a connection to the same server URI will receive a connection that
> is closed and therefore unusable.  This bug has been logged in Trac as
> ticket 106 [1].

The two approaches I have for this are:
1. Don't close the underlying Session, but mark the Connection as
unusable until it is served up from the factory again.
2. Throw away the Connection when it gets closed.

I'm not very keen on 1, but 2 has problems as well. After all,
programmers will typically clean up resources when they're done with
them, meaning they will almost always call Connection.close(). If we
throw away Connections at that point, then there is not going to be
any point to having a cache at all.

If we go with option 1, then there should be a mechanism for telling
the Connection to close the underlying Session, since the user may
need this. At that point the Connection will have to be discarded.
It's just that this wouldn't happen by default with a call to
Connection.close().
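
As a rough sketch of option 1 (every class and method name below is
made up for illustration and is not the existing Mulgara API): close()
only marks the Connection as released, a separate dispose() really
closes the Session, and the factory re-activates a released Connection
when it hands it out again.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Session; only what the sketch needs.
class SketchSession {
  private boolean open = true;
  boolean isOpen() { return open; }
  void close() { open = false; }
}

// Hypothetical Connection: close() releases it, dispose() closes the Session.
class SketchConnection {
  private final SketchSession session = new SketchSession();
  private boolean usable = true;

  // Default close: mark unusable but leave the Session open for reuse.
  void close() { usable = false; }

  // Explicit request to close the underlying Session; the Connection is then dead.
  void dispose() { usable = false; session.close(); }

  // Intended to be called by the factory only, when serving the Connection again.
  void reactivate() { usable = true; }

  boolean isUsable() { return usable; }
  boolean isSessionOpen() { return session.isOpen(); }
}

class SketchConnectionFactory {
  private final Map<URI, SketchConnection> cache = new HashMap<>();

  synchronized SketchConnection newConnection(URI server) {
    SketchConnection c = cache.get(server);
    if (c == null || !c.isSessionOpen()) {
      // Nothing cached, or the cached one was disposed: build a fresh one.
      c = new SketchConnection();
      cache.put(server, c);
    } else {
      // Cached and its Session is still open: make it usable again.
      c.reactivate();
    }
    return c;
  }
}
```

Note that this still hands the same object to two clients if the first
one has not closed it yet, which is the current behavior as well.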

> 2.  I am concerned that the RMI Session backing a connection to a remote
> server might time out if enough time elapses between successive client
> calls to the factory to get a connection.

I'll need to check out how RMI timeouts are managed. I know from
experience that RMI connections can run for a very long time. Still,
if something is in the cache for more than a few minutes then we can
probably discard it. Heavy use of the factory is the main reason for
needing a cache.
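
Discarding stale entries could look something like the sketch below.
The class name, the idle limit and the injected clock are all
assumptions of mine, and I have chosen to measure the time since the
entry was last handed out rather than since it was first cached.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical cache that forgets entries that have sat idle for too long.
class ExpiringCache<K, V> {
  private static final class Entry<T> {
    final T value;
    long lastUsed;
    Entry(T value, long lastUsed) { this.value = value; this.lastUsed = lastUsed; }
  }

  private final Map<K, Entry<V>> entries = new HashMap<>();
  private final long maxIdleMillis;
  private final LongSupplier clock;  // e.g. System::currentTimeMillis

  ExpiringCache(long maxIdleMillis, LongSupplier clock) {
    this.maxIdleMillis = maxIdleMillis;
    this.clock = clock;
  }

  synchronized void put(K key, V value) {
    entries.put(key, new Entry<>(value, clock.getAsLong()));
  }

  // Returns the cached value, or null if it is missing or has been idle too long.
  synchronized V get(K key) {
    Entry<V> e = entries.get(key);
    if (e == null) return null;
    long now = clock.getAsLong();
    if (now - e.lastUsed > maxIdleMillis) {
      entries.remove(key);  // stale: the caller should create a new one
      return null;
    }
    e.lastUsed = now;
    return e.value;
  }
}
```

A factory would call get() first and only build (and put()) a new
Connection when it returns null.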

> 3.  I can think of cases where the calling code might want to get a
> Connection that is not shared with others.  Specifically, I am thinking
> of a multi-threaded application where the calling code will want to get
> a connection, turn off auto-commit, modify a graph, and commit its
> changes without interference from other threads.

Access to the factory ought to be synchronized. I'm embarrassed that I
overlooked this (sorry everyone).

Once a Connection has been handed off to a client, I believe it is up
to that client to manage synchronization for access to that
Connection. There shouldn't be any interference between threads unless
the client who asked for that Connection chooses to share the object
between threads - in which case it's the client's problem.

> Addressing the first two issues should not be all that difficult; it
> should be fairly straightforward to check a Connection after getting it
> from the cache and discard it if it has been closed or timed out.
>
> Addressing the third issue is slightly more involved.  The easiest
> implementation would be to provide a new method in ConnectionFactory to
> get a new connection that is not cached, but heavy use of such a method
> would lose the benefits that caching provides.  A better way might be to
> maintain information in the cache about which thread is currently using
> a connection, and not allow any connection to be in use by multiple
> threads simultaneously.  In this case, a new method could be added to
> release a connection for use by other threads without explicitly closing
> it, which would keep some of the benefits of caching.

I can't recall if RMI has an issue with sharing references to remote
objects between client threads. If so, then some artificial tying of
Connections to threads may be needed. But for other Connection types
(like HTTP) this would be neither necessary nor desirable.

A local Connection (one whose session is in the local JVM, and not on
the server - there should be some way for Connection factories to know
about this) may need some thread localization, as I believe that the
transaction manager (rightly) requires that Sessions stay in a single
thread.
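
For what it's worth, the thread-ownership idea quoted above might be
sketched along these lines. The names are hypothetical, and for
simplicity it keeps at most one idle Connection per server.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical placeholder for a Connection to one server.
class LeasedConnection {
  final String serverUri;
  LeasedConnection(String serverUri) { this.serverUri = serverUri; }
}

class LeasingConnectionFactory {
  // Connections that nobody is using right now, keyed by server URI.
  private final Map<String, LeasedConnection> idle = new HashMap<>();
  // Which thread currently holds each checked-out Connection.
  private final Map<LeasedConnection, Thread> owners = new IdentityHashMap<>();

  // Hands out an idle Connection if there is one, otherwise a new one.
  // A Connection that is checked out is never given to a second caller.
  synchronized LeasedConnection acquire(String serverUri) {
    LeasedConnection c = idle.remove(serverUri);
    if (c == null) c = new LeasedConnection(serverUri);
    owners.put(c, Thread.currentThread());
    return c;
  }

  // Returns a Connection to the cache without closing it.
  synchronized void release(LeasedConnection c) {
    if (owners.get(c) != Thread.currentThread()) {
      throw new IllegalStateException("Connection is not held by this thread");
    }
    owners.remove(c);
    idle.put(c.serverUri, c);  // replaces any older idle Connection
  }
}
```

A thread would acquire(), turn off auto-commit, do its work, commit,
and then release() so the next caller can reuse the Connection.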

Andrae knows more about these sorts of restrictions for Session.

> I can implement any necessary changes to the factory since it's
> important for the work I'm doing.  I just wanted to open this up for
> conversation before I start coding away.  Please let me know if you have
> any thoughts on the subject.

The more coding that other people do, the happier I am.  :-)

Before going ahead on the timeout issue, check out how RMI gets timed
out, specifically if it's something that we've done or if it's the RMI
system itself. If it's something we've done, then perhaps we can hook
into it to detect when to discard a Connection in the cache.

In the worst case, perhaps we could add a ping method to the Session interface:
  boolean ping() { return true; }
An established Session will return very quickly, and presumably the
Connection is being requested for network activity anyway, so a slight
delay may be acceptable. It'd be better to be able to detect dead
connections without explicit network activity though.
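
To make the ping idea concrete, here is a small sketch of a cache that
pings before handing anything out. PingableSession and PingCheckedCache
are invented names, and the "opener" function stands in for whatever
actually establishes a Session.

```java
import java.rmi.RemoteException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical slice of a Session interface with the proposed ping().
interface PingableSession {
  boolean ping() throws RemoteException;
}

class PingCheckedCache {
  private final Map<String, PingableSession> sessions = new HashMap<>();
  private final Function<String, PingableSession> opener;

  PingCheckedCache(Function<String, PingableSession> opener) {
    this.opener = opener;
  }

  // Returns a cached Session if it answers a ping, otherwise opens a new one.
  synchronized PingableSession get(String serverUri) {
    PingableSession s = sessions.get(serverUri);
    if (s == null || !isAlive(s)) {
      s = opener.apply(serverUri);
      sessions.put(serverUri, s);
    }
    return s;
  }

  // A failed remote call is treated the same as a dead Session.
  private static boolean isAlive(PingableSession s) {
    try {
      return s.ping();
    } catch (RemoteException e) {
      return false;
    }
  }
}
```

This costs one round trip per request for a cached Session, which is
the delay I mentioned.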

Regards,
Paul