CARCOSA

Wow, it's been a long day.

This has definitely been my best coding day on Divmod yet this month. Working into the wee hours of the morning, Allen and I tore through the remaining tasks, which I had estimated would take the rest of the release to complete.

We also had a fun discussion about the successor to twisted.world, CARCOSA (backronymed to "Concurrent, Atomic, Reliable Object Storage Architecture"). Once we can refactor Quotient's store into a somewhat more general structure, and also take advantage of what must be a significant speed boost from using fixed-length rather than variable-length records in bsddb, we should be able to implement everything that we had hoped for in twisted.world, and possibly more.
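To illustrate why fixed-length records are attractive, here is a minimal sketch using Python's struct module. The record layout (an object id, a type code, and a 16-byte name) and the helper names are assumptions made up for this example. It does not use the bsddb API and is not Quotient's actual format.

```python
import struct

# Hypothetical layout: 8-byte object id, 4-byte type code, 16-byte padded name.
# ">" selects big-endian with no alignment padding, so the size is exactly 28.
RECORD = struct.Struct(">QI16s")


def pack_record(oid, type_code, name):
    """Encode one record; every result is exactly RECORD.size bytes long."""
    encoded = name.encode("utf-8")
    if len(encoded) > 16:
        raise ValueError("name does not fit in the fixed-width field")
    # struct pads a short "16s" field with NUL bytes.
    return RECORD.pack(oid, type_code, encoded)


def unpack_record(data):
    """Decode one record, stripping the NUL padding from the name."""
    oid, type_code, raw = RECORD.unpack(data)
    return oid, type_code, raw.rstrip(b"\x00").decode("utf-8")


def read_nth(blob, n):
    """With a fixed width, record n sits at a computable offset."""
    start = n * RECORD.size
    return unpack_record(blob[start:start + RECORD.size])
```

Because every record has the same size, finding record n is plain arithmetic rather than a scan through length prefixes, which is presumably where much of the speed boost would come from.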

We're probably going to need to add a twisted.python.schema module, which should be shared between Formless, CARCOSA and PB. While CARCOSA deals with storage and Formless/PB deal with interactivity, you should be able to express type constraints the same way in all three.

An object which implements a TypedInterface and also specifies __schema__ itself will be able to be published on the web, transactionally stored in a database, and also published over a custom, interactive protocol, all with no additional work besides what the average Java programmer has to endure when defining a class. Plus, you can develop your objects unencumbered by the schema and only nail them down once you've developed an understanding of their requirements through experimentation.
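As a rough sketch of the idea, a single schema declaration could feed both a storage consumer and a user-interface consumer. Everything here is hypothetical: the Integer and Text constraint classes, the dictionary form of __schema__, and the validate, to_record and form_fields helpers are invented for illustration and are not the Formless, PB, or CARCOSA APIs.

```python
class Integer:
    """Hypothetical constraint: any int (bool is excluded on purpose)."""

    def check(self, value):
        return isinstance(value, int) and not isinstance(value, bool)


class Text:
    """Hypothetical constraint: a string of bounded length."""

    def __init__(self, max_length):
        self.max_length = max_length

    def check(self, value):
        return isinstance(value, str) and len(value) <= self.max_length


class Message:
    # One declaration, shared by every consumer below (assumed format).
    __schema__ = {"sender": Text(64), "priority": Integer()}

    def __init__(self, sender, priority):
        self.sender = sender
        self.priority = priority


def validate(obj):
    """Return the names of attributes that violate the schema."""
    schema = type(obj).__schema__
    return [name for name, constraint in schema.items()
            if not constraint.check(getattr(obj, name))]


def to_record(obj):
    """Storage-side consumer: refuse to persist an invalid object."""
    bad = validate(obj)
    if bad:
        raise ValueError("invalid attributes: %s" % ", ".join(bad))
    return {name: getattr(obj, name) for name in type(obj).__schema__}


def form_fields(cls):
    """UI-side consumer: list each field with the kind of its constraint."""
    return [(name, type(constraint).__name__)
            for name, constraint in cls.__schema__.items()]
```

For example, to_record(Message("allen", 3)) yields a plain dictionary ready for storage, while form_fields(Message) describes the same two attributes to a form renderer, without Message itself knowing about either.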

Quick Hacking

Well, it took most of the day, but once Allen pointed out a stupid error I was making by ignoring a potential error condition, the first pass on bsddb/store integration is already committed to Quotient! Plus, I took a nap today, so I should have a few more good hours of programming left to get itempool plugged in.

Where Flap the Patents of the King, Pixels Unrendered Must Die Unseen

In keeping with the theme of this journal, I decided to update the picture. Some of you may recognize the icon associated with this post :).

Strangely, it did not work until I uploaded an LZW-compressed GIF. It didn't work as a PNG output from Gimp, pngcrush, convert, or sodipodi, nor as a JPG from convert or Gimp. I suppose you have to mix the evil of the Sign itself with the evil of the Unisys patent.

Identity Crisis

Well! It turns out that the bsddb conversion was pretty easy after all. (At least, moving object storage to it.) Now all I have to do is track down this stupid identity management issue and I'm all set.

The only problem is that somehow, objects are sometimes being doubly instantiated when they should only be instantiated once, and sometimes they're not being re-instantiated when they should be garbage collected and re-created.

In the particular issue I'm debugging at the moment, the object relationship is SUPPOSED to be:

stack : item
stack : pool : store : weak cache : item

and since the stack is holding a strong reference to the item, it shouldn't go away and be re-created, but somewhere in there there's a mistake.

Is there any common pattern, either for implementing or for debugging this kind of "this object can be garbage collected back to storage, but it only ever exists once" arrangement?
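For reference, one widely used approach to this kind of problem is an identity map: a cache of weak references keyed by object id, so that a live object is always found again while an unreferenced one can be collected and later reloaded. Below is a minimal sketch using weakref.WeakValueDictionary from the standard library. The Item and Store classes and the loads counter are hypothetical stand-ins, not Quotient's actual code.

```python
import weakref


class Item:
    """Hypothetical stored object."""

    def __init__(self, oid, data):
        self.oid = oid
        self.data = data


class Store:
    """Hypothetical store: at most one live Item instance per object id."""

    def __init__(self, backing):
        self._backing = backing  # a dict standing in for the on-disk database
        # Entries vanish automatically once nothing else references the Item.
        self._cache = weakref.WeakValueDictionary()
        self.loads = 0  # counts instantiations, useful when debugging

    def get(self, oid):
        item = self._cache.get(oid)
        if item is None:
            # Not live anywhere: instantiate from storage and remember it.
            item = Item(oid, self._backing[oid])
            self.loads += 1
            self._cache[oid] = item
        return item
```

As long as something like the stack holds a strong reference, every get() returns that same instance. Counting loads per id, as the loads attribute does here, is one cheap way to spot a double instantiation, which usually means some code path built an Item without going through the cache.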

Crouching Database, Hidden Transaction

Today I actually discovered BSDDB. I can't believe I missed this. It's an efficient, in-process, open-source, simple, transactional database which does almost everything I have ever wanted a database to do. For free.

I read the transaction store documentation at Sleepycat's site, and discovered that bsddb does things I hadn't even considered - for example, it is possible to transactionally move a non-database file using the bsddb API. I realize that this may not mean much to anyone but me, but I was so moved by that possibility that I nearly cried.

This discovery is both exciting and terrifying. Allen and I ran into this together, pair programming and attempting to reconcile the highly single-process logic in the current Quotient with the highly multi-process logic in the new QQ (Quotient Queues) module we're integrating. After looking at one particular function that was obviously fragile and could lose data at several points, we decided we needed to solidify and centralize our transaction processing into a single place. Since we knew that BSDDB had "some transaction support", we figured we could use it for what we needed.

We got more than we bargained for. One of the first things we discovered was that we had previously misread the documentation: we believed that BSDDB was single-process, because the documentation described multi-reader access as available only for non-transactional data stores, and other sections referred to "threads of control".

It turns out that it's perfectly usable from multiple processes simultaneously. "Thread of control" actually means "process, thread, or other encapsulation of a program counter" the way they use it in their documentation.

So, on the one hand, it will make the incredibly arduous task of making our central data store 100% reliable much, much easier than it previously was. Rather than being an ongoing task with many threads left hanging, it will be almost completely done when we finish this refactoring. On the other hand, this increases the number of changes we have to get done for this release - by Saturday. This is a much larger snag than I expected to hit at this point, considering that we'd already gotten through a lot of the "hard stuff" - figuring out how to make multi-process communication both transactional and observable from the user interface. (Luckily, much of that work won't be wasted, since we will be using it to manage transactions across multiple machines.)