Wow, it's been a long day.
This has definitely been my best coding day on Divmod yet this month. Into
the wee hours of the morning, Allen and I tore through the remaining tasks,
which I had estimated would take the rest of the release to complete.
We also had a fun discussion about the successor to twisted.world, CARCOSA
(backronymed to "Concurrent, Atomic, Reliable Object Storage Architecture").
When we can refactor Quotient's store into a somewhat more general
structure, and also take advantage of what must be a significant speed boost
from using fixed-length rather than variable-length records in bsddb, we
should be able to implement everything that we had hoped for in
twisted.world, and possibly more.
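To illustrate why fixed-length records can be cheaper in general, here is a minimal sketch using Python's struct module. It does not show how bsddb lays out records internally; the record layout and the names RECORD, pack_record and read_record are assumptions made up for this example. The point is only that record N sits at a computable offset, so no per-record length has to be stored or scanned.

```python
import struct

# Hypothetical fixed-length layout (an assumption for illustration):
# 8-byte object ID, 4-byte type code, 16-byte NUL-padded name.
RECORD = struct.Struct("<QI16s")


def pack_record(oid, type_code, name):
    """Pack one record; struct pads short names with NULs and truncates long ones."""
    # ASCII names are assumed so truncation cannot split a multi-byte character.
    return RECORD.pack(oid, type_code, name.encode("ascii"))


def read_record(buf, index):
    """Read record number `index` directly from its computed byte offset."""
    oid, type_code, raw = RECORD.unpack_from(buf, index * RECORD.size)
    return oid, type_code, raw.rstrip(b"\x00").decode("ascii")


# Two records back to back; the second is found by arithmetic alone.
buf = pack_record(1, 2, "sword") + pack_record(7, 3, "shield")
second = read_record(buf, 1)
```

With variable-length records, finding record N means either walking the preceding records or keeping a separate offset index, which is the overhead the fixed layout avoids.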
We're probably going to need to add a twisted.python.schema module, which
should be shared between Formless, CARCOSA and PB. While CARCOSA deals with
storage and Formless and PB deal with interactivity, you should be able to
express type constraints the same way in all three.
An object which implements a TypedInterface and also specifies __schema__
itself will be able to be published on the web, transactionally stored in a
database, and also published over a custom, interactive protocol, all with
no additional work besides what the average Java programmer has to endure
when defining a class. Plus, you can develop your objects unencumbered by
the schema and only nail them down once you've developed an understanding of
their requirements through experimentation.
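A rough sketch of what "express type constraints the same way" could look like follows. None of this is the actual twisted.python.schema, Formless or PB API: Field, validate and Message are hypothetical names, and the only thing borrowed from the text above is the idea of a class-level __schema__ attribute.

```python
class Field:
    """One typed attribute in a schema (hypothetical helper)."""

    def __init__(self, type_, required=True):
        self.type = type_
        self.required = required

    def check(self, name, value):
        """Raise if `value` does not satisfy this field's constraint."""
        if value is None:
            if self.required:
                raise ValueError(f"{name} is required")
            return
        if not isinstance(value, self.type):
            raise TypeError(
                f"{name} must be {self.type.__name__}, "
                f"got {type(value).__name__}"
            )


def validate(obj):
    """Check an object's attributes against its class-level __schema__."""
    for name, field in type(obj).__schema__.items():
        field.check(name, getattr(obj, name, None))


class Message:
    # Declared once. A storage layer, a web form renderer and a remote
    # protocol could each read this same mapping (that sharing is the
    # assumption being illustrated, not an existing API).
    __schema__ = {
        "subject": Field(str),
        "size": Field(int),
        "sender": Field(str, required=False),
    }

    def __init__(self, subject, size, sender=None):
        self.subject = subject
        self.size = size
        self.sender = sender


validate(Message("hello", 42))  # passes silently
```

Because the schema is just an attribute, the class can be written and experimented with first and have __schema__ added later, once its requirements are understood.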
Well, it took most of the day, but once Allen pointed out a stupid error I
was making by ignoring a potential error condition, the first pass on
bsddb/store integration is already committed to Quotient! Plus, I took a nap
today, so I should have a few more good hours of programming left to get
itempool plugged in.
In keeping with the theme of this journal, I decided to update the picture.
Some of you may recognize the icon associated with this post :).
Strangely, it did not work until I uploaded an LZW-compressed GIF. It didn't work as a PNG output from Gimp, pngcrush, convert or sodipodi, nor as a JPG from convert or Gimp. I suppose you have to mix the evil of the Sign itself with the evil of the Unisys patent.
Well! It turns out that the bsddb conversion was pretty easy after all. (At
least, moving object storage to it.) Now all I have to do is track down this
stupid identity management issue and I'm all set.
The only problem is that somehow, objects are sometimes being doubly instantiated when they should only be instantiated once, and sometimes they're not being re-instantiated when they should be garbage collected and re-created.
In the particular issue I'm debugging at the moment, the object relationship is SUPPOSED to be:
stack : item
stack : pool : store : weak cache : item
and since the stack is holding a strong reference to the item, it shouldn't go away and be re-created, but somewhere in there there's a mistake.
Is there any common pattern, either for implementing or for debugging this kind of "this object can be garbage collected back to storage, but it really only exists once" arrangement?
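One pattern that fits this description is an identity map backed by weak references. The sketch below is a minimal version of it, not Quotient's actual store: Item, Store and the dictionary standing in for the database are all hypothetical. It also keeps a load counter, which is a cheap way to spot double instantiation while debugging. The garbage-collection timing shown assumes CPython's reference counting.

```python
import gc
import weakref


class Item:
    """Hypothetical stored object; only its storage ID and state matter here."""

    def __init__(self, oid, state):
        self.oid = oid
        self.state = state


class Store:
    """Hypothetical store with a weak identity cache in front of the backend."""

    def __init__(self):
        self._backend = {}  # stand-in for the on-disk database: oid -> state
        self._cache = weakref.WeakValueDictionary()  # oid -> live Item
        self.loads = 0  # counts instantiations from storage

    def save(self, item):
        self._backend[item.oid] = item.state
        # Register the live object so later lookups return this same instance.
        self._cache[item.oid] = item

    def get(self, oid):
        item = self._cache.get(oid)
        if item is None:
            # Nothing alive holds this object: instantiate it from storage.
            self.loads += 1
            item = Item(oid, self._backend[oid])
            self._cache[oid] = item
        return item


store = Store()
store.save(Item(1, "sword"))      # the temporary Item is dropped after saving
stack = [store.get(1)]            # strong reference, like "stack : item"
assert store.get(1) is stack[0]   # same instance while the stack holds it
stack.clear()
gc.collect()                      # nothing holds the item any more
store.get(1)                      # so this lookup reloads it from the backend
```

The rule that makes this work is that every code path that creates an Item must go through the cache. A second instance usually means some path instantiated one directly, or looked it up under a different key.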
Today I actually discovered BSDDB. I can't believe I missed this.
It's an efficient, in-process, open-source, simple, transactional database
which does almost everything I have ever wanted a database to do. For
free.
I read the transaction store documentation at Sleepycat's site, and discovered that bsddb does things I hadn't even considered - for example, it is possible to transactionally move a non-database file using the bsddb API. I realize that this may not mean much to anyone but me, but I was so moved by that possibility that I nearly cried.
This discovery is both exciting and terrifying. Allen and I ran into this together, pair programming and attempting to reconcile the highly single-process logic in the current Quotient with the highly multi-process logic in the new QQ (Quotient Queues) module we're integrating. After looking at one particular function that was obviously fragile and could lose data at several points, we decided we needed to solidify and centralize our transaction processing into a single place. Since we knew that BSDDB had "some transaction support", we figured we could use it for what we needed.
We got more than we bargained for. One of the first things that we discovered was that we had previously misread the documentation: we believed that BSDDB was single-process, because the documentation talked about multi-reader access only being available for non-transactional data stores, and other sections referred to "threads of control".
It turns out that it's perfectly usable from multiple processes simultaneously. "Thread of control" actually means "process, thread, or other encapsulation of a program counter" the way they use it in their documentation.
So, on the one hand, it will make the incredibly arduous task of making our central data store 100% reliable much, much easier than it previously was. Rather than being an ongoing task with many threads left hanging, it will be almost completely done when we finish this refactoring. On the other hand, this increases the number of changes we have to get done for this release - by Saturday. This is a much larger snag than I expected to hit at this point, considering that we'd already gotten through a lot of the "hard stuff" - figuring out how to make multi-process communication both transactional and observable from the user interface. (Luckily, much of that work won't be wasted, since we will be using it to manage transactions across multiple machines.)