Talk:PluggableStorage



I'm working on an RDBMS bitstore implementation, where the bitstreams are stored as blobs, for a prototype at my company.

I ran into a problem when I tried to implement the BitStore interface as presented here, because it implies that the put, remove, about, and get methods (particularly get) are independent of the calling context. In my first attempt, each method acquired a database connection at the beginning and released it at the end. For the get method, this meant that the connection, and therefore the input stream as well, was closed before the caller could start reading the stream.
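To make the failure mode concrete, here is a minimal sketch of it. It does not use JDBC or any DSpace class: FakeConnection and SelfContainedBitStore are hypothetical stand-ins, and the byte array plays the role of the blob. The only point is that a stream tied to a connection becomes unreadable once the method that opened the connection has released it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for a database connection; not a JDBC or DSpace class.
class FakeConnection implements AutoCloseable {
    private boolean open = true;

    @Override
    public void close() {
        open = false;
    }

    // Returns a stream that can only be read while this connection is open,
    // mimicking how a blob stream depends on the connection that produced it.
    InputStream openBlob(byte[] data) {
        final InputStream in = new ByteArrayInputStream(data);
        return new InputStream() {
            @Override
            public int read() throws IOException {
                if (!open) {
                    throw new IOException("connection closed");
                }
                return in.read();
            }
        };
    }
}

// Hypothetical bitstore whose get method manages its own connection.
class SelfContainedBitStore {
    InputStream get(byte[] storedBlob) {
        try (FakeConnection conn = new FakeConnection()) {
            return conn.openBlob(storedBlob);
        } // the connection is closed here, before the caller reads anything
    }
}
```

With these stand-ins, the first read() on the stream returned by get throws an IOException, which is the same situation I hit with the real blob stream.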

One possible solution might have been to write the blob out to a temporary file and then connect the stream to the file, but that brought a nest of problems with it (how much temporary space? how will it perform? how do the temporary files go away?).

The solution I've come up with is to have the BitstreamStorageManager methods pass in their Context object when they call the BitStore methods. The BitStore methods then get their database connections from the context. This solution means that the bitstreams are in the same database and schema as the metadata, and there is no way to configure multiple bitstores. It also means changing the BitStore interface method declarations to include a Context argument, which is, of course, of no use to the three existing implementations.
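A rough sketch of what the changed interface could look like follows. All names here are assumptions for illustration: SketchContext stands in for the DSpace Context, ContextAwareBitStore is a hypothetical variant of BitStore (the about method is left out because I am not reproducing its signature here), and an in-memory map replaces the blob table so the example runs without a database.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the DSpace Context; it only tracks whether
// its database connection is still open.
class SketchContext {
    private boolean open = true;

    boolean isOpen() {
        return open;
    }

    // Ends the unit of work and releases the connection.
    void complete() {
        open = false;
    }
}

// Hypothetical variant of the BitStore interface with a Context argument.
interface ContextAwareBitStore {
    void put(SketchContext ctx, String id, byte[] data) throws IOException;

    InputStream get(SketchContext ctx, String id) throws IOException;

    void remove(SketchContext ctx, String id) throws IOException;
}

// In-memory substitute for the blob table, used only to keep the sketch runnable.
class InMemoryBlobStore implements ContextAwareBitStore {
    private final Map<String, byte[]> blobs = new HashMap<>();

    private static void requireOpen(SketchContext ctx) throws IOException {
        if (!ctx.isOpen()) {
            throw new IOException("context connection is closed");
        }
    }

    @Override
    public void put(SketchContext ctx, String id, byte[] data) throws IOException {
        requireOpen(ctx);
        blobs.put(id, data.clone());
    }

    @Override
    public InputStream get(final SketchContext ctx, String id) throws IOException {
        requireOpen(ctx);
        byte[] data = blobs.get(id);
        if (data == null) {
            throw new IOException("no such bitstream: " + id);
        }
        final InputStream in = new ByteArrayInputStream(data);
        // The stream stays readable for as long as the caller's context is open.
        return new InputStream() {
            @Override
            public int read() throws IOException {
                requireOpen(ctx);
                return in.read();
            }
        };
    }

    @Override
    public void remove(SketchContext ctx, String id) throws IOException {
        requireOpen(ctx);
        blobs.remove(id);
    }
}
```

The design point is that the lifetime of the stream is now the lifetime of the caller's context rather than the lifetime of the get call, so the caller can read the stream and then complete the context.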

Another possibility we're considering is to have the DatabaseManager (or possibly some other class) manage a second, separately configurable database for the bitstreams, and provide connection and statement pools for it just as it does for the metadata classes. The Context object would hold another connection from that pool for the BitStore methods to use. This would allow multiple bitstores. It still requires passing a Context to the BitStore methods.
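As a sketch of this second idea, the context could borrow its bitstream connection lazily from a separate pool and return it when the context completes. Everything below is hypothetical (PooledConn, ConnPool, TwoDatabaseContext); it does not reflect how DatabaseManager or Context are actually written, and the pool is a toy with no limits or statement pooling.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical pooled connection that remembers which database it belongs to.
class PooledConn {
    final String database;

    PooledConn(String database) {
        this.database = database;
    }
}

// Hypothetical toy pool for one configured database.
class ConnPool {
    private final String database;
    private final Deque<PooledConn> idle = new ArrayDeque<>();

    ConnPool(String database) {
        this.database = database;
    }

    // Reuses an idle connection if there is one, otherwise creates a new one.
    PooledConn borrow() {
        PooledConn conn = idle.poll();
        return conn != null ? conn : new PooledConn(database);
    }

    void giveBack(PooledConn conn) {
        idle.push(conn);
    }

    int idleCount() {
        return idle.size();
    }
}

// Hypothetical context holding a metadata connection plus an optional
// bitstream connection from a separately configured pool.
class TwoDatabaseContext {
    private final ConnPool metadataPool;
    private final ConnPool bitstreamPool;
    private final PooledConn metadataConn;
    private PooledConn bitstreamConn; // borrowed only when first requested

    TwoDatabaseContext(ConnPool metadataPool, ConnPool bitstreamPool) {
        this.metadataPool = metadataPool;
        this.bitstreamPool = bitstreamPool;
        this.metadataConn = metadataPool.borrow();
    }

    PooledConn getMetadataConnection() {
        return metadataConn;
    }

    PooledConn getBitstreamConnection() {
        if (bitstreamConn == null) {
            bitstreamConn = bitstreamPool.borrow();
        }
        return bitstreamConn;
    }

    // Returns every connection this context borrowed to its pool.
    void complete() {
        metadataPool.giveBack(metadataConn);
        if (bitstreamConn != null) {
            bitstreamPool.giveBack(bitstreamConn);
            bitstreamConn = null;
        }
    }
}
```

Borrowing lazily means requests that never touch a bitstream do not tie up a connection to the bitstream database.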

Has anybody already developed an elegant solution to this problem?
