10k of RSS in 1M of Screenshots

Today Exarkun and I added some features to Divmod that I am really looking forward to actually
using.  They haven't made it to production yet, but I don't write
about our product nearly often enough, so I think a bit of an
introduction is in order.



The first feature we added was the bookmark manager:



src="http://www.divmod.org/users/glyph/RSSShots/Screenshot-Bookmarklet%20Manager%20-%20Mozilla%20Firefox.png">



By itself this wouldn't be a particularly interesting screenshot, but I
wanted to stress how it kind of looks like the "bookmark" window in
Mozilla.  After adding the "bookmark to Divmod" button to your
browser, you can bookmark stuff similarly to del.icio.us.



Similarly, your list of bookmarks becomes a feed.  Unlike del.icio.us, this feed is
private: the "live bookmark" only works when you're logged in, or if
you have explicitly given a client program your password as part of the
URL.  One of the next things I'll be working on is finishing up
our sharing code and integrating it with this, so that you can share
certain portions of your feed with different groups of people.



src="http://www.divmod.org/users/glyph/RSSShots/Screenshot-Divmod.Org%20%3A%3A%20Home%20-%20Mozilla%20Firefox.png">



We hit a website in the browser, click on "bookmark to Divmod", and
voila: it has been bookmarked (note that it's been bookmarked securely, even...)



src="http://www.divmod.org/users/glyph/RSSShots/Screenshot-Divmod.Org%20%3A%3A%20Home%20-%20Mozilla%20Firefox-1.png">

The bookmark completes, we update our RSS feed, and the new bookmark has appeared.



I like that feature since it means that I can bookmark whatever I want
and store it centrally in the same place I keep my email, but an even
better feature is that Divmod can now turn any Pool (the equivalent of a "folder"
or "label", depending on your point of view) into a private
feed.  Here we can see the test inbox with the "live
bookmark" icon, and all the same subjects showing up over RSS and
HTTP.



My favorite part of this was that it only took about 4 hours to do, and
although it's for a demo I feel like the code is pretty
production-worthy: it's a very simple re-application of our existing
back-end.

I rarely know what I'm talking about...

... but I know somebody who does.

A question worth asking: why does Python have an "integer" type, but no "character" type? Especially since, when the serious users of numbers in Python enhanced it, they did everything in terms of arrays.

I Can Draw Graphs Too

So, I set out to provide a slightly more cogent debunking, since various members of the audience were unsatisfied with my previous, dismissive response. I haven't gotten terribly far, since I'm pretty busy, but I did cook up some rudimentary results for you.

My goal was to show that threads don't scale terribly well in terms of processing concurrent I/O, and in particular that Twisted scales better, even in its slowest, most naive mode, than a simple threaded implementation of a similar concept.

Unfortunately I was unable to do this easily, because Debian GNU/Linux (in its default configuration, using a 2.6.7 kernel) is not able to even spawn enough threads to generate a fair test.

My test program was a Twisted protocol handler that would take some simple input, perform a trivial computation, generate some output, and close the connection. I then wrote a Twisted client for it, and wrote a simple threaded reactor which would handle each connection in a separate thread, to compare the amount of time each multiplexor took to finish connecting vs. the amount of time it took to complete the entire test. At around 256 threads, though, I could no longer spawn more threads to handle simultaneous connections, a problem which others have experienced.
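
For the curious, here is a rough sketch of the two shapes being compared. None of this is the actual benchmark code: the port number, the "trivial computation", and the protocol details are all invented for illustration, and it is written against current Python and Twisted rather than the 2004-era versions used in the test.

# Hypothetical sketch, not the benchmark itself: a server that reads one
# line, replies, and closes the connection, written both ways.
import sys
import socketserver

from twisted.internet import protocol, reactor
from twisted.protocols import basic

# Event-driven version: every connection is multiplexed in a single thread.
class Trivial(basic.LineReceiver):
    def lineReceived(self, line):
        self.sendLine(line.upper())      # the "trivial computation"
        self.transport.loseConnection()  # generate output, then hang up

class TrivialFactory(protocol.Factory):
    protocol = Trivial

# Thread-per-connection version: the OS must supply one thread per client,
# which is exactly the resource that ran out at around 256 connections.
class TrivialHandler(socketserver.StreamRequestHandler):
    def handle(self):
        line = self.rfile.readline().rstrip(b"\r\n")
        self.wfile.write(line.upper() + b"\r\n")

if __name__ == "__main__":
    if "--threads" in sys.argv:
        socketserver.ThreadingTCPServer(("", 8007), TrivialHandler).serve_forever()
    else:
        reactor.listenTCP(8007, TrivialFactory())
        reactor.run()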

This indicates in part the cultural issues surrounding the performance of threads, to wit, nobody really cares enough about good threading performance to even bother making it possible to spawn a lot of them.

This test was carried out on an otherwise quiescent Linux machine, an Athlon XP 2000+ with a gig of RAM, using Python 2.3. Without further ado:




I don't expect these graphs to be taken as serious performance measurements - after all, I'm not even providing source code yet. Still, the fact that the threading connection-start performance is so abysmally bad (even compared to select() - remember, this is using Twisted's default configuration, with no tuning or optimization) and the fact that with an otherwise excellent development machine I don't even have the software to test larger numbers of concurrent connections might give you an idea of why I find it so hard to take this kind of comparison seriously.

It has brought to light the fact that although Twisted would be an excellent test platform for different multiplexing mechanisms, nobody has bothered to put together a survey of those yet. I will do a more rigorous benchmark of threads vs. Twisted later this week, release the code, and hopefully some useful work will come of this if it can be used as a simple reactor benchmark.

A grotesque parody of science

Unbelievably naive mistake, or politically motivated lies? You be the judge.

I have already spent way too much time talking about this piece of garbage, so I am putting up a little persistent comment about it.

This so-called "benchmark" is nothing more than an insult. Yes, he has graphs: but these graphs are not actually measuring anything. The page claims to present performance metrics for an MDI application doing image transformations.

The advocacy position which he is so anxious to misinterpret is this: Twisted is an alternative to threads for multiplexed I/O. Many, many applications multiplex I/O, and they do it in a variety of ways which are inefficient and bug-prone, the most popular (and most dangerous) approach being thread-per-connection. There is an explanation of the problem (as well as a link to Twisted) at threading.2038bug.com.

He accuses Twisted developers of letting emotions get in the way. Am I emotional about this? Yes, I am. I am angry about it for good reason. The Twisted team has spent almost a hundred man-years of effort producing something that we could be proud of, to provide our users with a better, more test-friendly, less error-prone way of programming. It's not a new idea, but I like to think we've brought something useful to the table by providing an integrated platform that supports it.

I don't think that my efforts, or those of my colleagues, deserve to be misrepresented in this way. I am upset and writing about this because I am afraid that programmers who don't know any better will see graphs and think that he's proposing a legitimate approach to solve some problem, and I will one day be called upon to help them with their code. Code which will, thanks to "benchmarks" like this one, inevitably, perform extremely poorly and be a total mess of race conditions. I am trying to improve the state of the art in the industry, and it is a lot of work. Widespread circulation of an afternoon's dalliance with a plotting program like the above can undo years of work trying to educate people.

I thought that I would write some benchmarks where the graphs went in the other direction, but the internet is already brimming with graphs that demonstrate the superior scalability of multiplexing with events rather than threads. Dignifying this sophomoric potshot with actual data would be a lot more than it deserves. If you are truly skeptical, I would recommend attending Itamar Shtull-Trauring's Twisted talk at PyCon 2005, "Fast Networking with Python". This will give you a much clearer view of Twisted's current performance problems than the graphs drawn by some script-kiddie who has a grudge because he got banned from an IRC channel for spreading lies.

Just as refuting the numbers would be a waste of effort, refuting every one of the lies and/or misunderstandings in each paragraph could take all day. Instead, I'll just debunk a sample of the most egregious stuff, and hopefully the pattern won't be too hard to extrapolate, for those of you unfamiliar with the subject matter.

Threads will work on multiprocessor and hyperthreading machines automatically. On similar hardware Twisted will use only 1/n of the available processing power, where n is the number or virtual or physicial [sic] processors.

What he means is that Python will only use 1/n of the available processing power. This is thanks to the global interpreter lock: for code that actually makes use of SMP, you need to relinquish the GIL and write all of the parallelized code in C. That is a problem which is hard to solve, since making Python multi-CPU friendly actually makes it slower. Note that you don't need to do anything special to take advantage of SMP if you use Twisted's recommended, non-threaded way of parallelizing things, which is spawning multiple worker subprocesses.

So, Twisted can use exactly as much processing power as Python can, not least because Twisted supports threads.

Twisted people think you should spawn a separate process for intensive work. If you do this you need to synchronize resource access as you would with threads. That is probably the most mentioned problem with threads.

You don't need to synchronize resource access when you use subprocesses. You can copy data to multiple subprocesses and serialize resource access, without doing any extra work, since Twisted will deliver attempts at resource access through normal I/O delivery channels, which are processed by the main loop. Also, you can run your subprocesses in isolation, without concern for synchronized access to shared data structures. In the case which he seems to be talking about, i.e. that of large shared-memory objects which need to be mutated and then operated upon by multiple cooperating processes, you can still avoid locking by using a tuple-space model of interaction and delivering work to subprocesses through pipes rather than delivering data. This still only has one simple program managing the interactions of many, taking advantage of the OS's much-vaunted thread-switching abilities, and doesn't require mutexes on every operation.
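
To make that concrete, here is a minimal, hypothetical sketch of the pipe-based pattern, written against current Twisted rather than anything described above: the parent writes jobs down a worker's stdin, reads results back from its stdout through the reactor's ordinary I/O handling, and never touches a lock. The worker script name ("worker.py") and the one-line-per-job protocol are invented for illustration.

# Hypothetical sketch of lock-free work delivery over pipes; "worker.py"
# and the line-per-job protocol are made up, not part of Twisted itself.
import sys

from twisted.internet import protocol, reactor

class WorkerProtocol(protocol.ProcessProtocol):
    def __init__(self, jobs):
        self.jobs = jobs

    def connectionMade(self):
        # Deliver work down the child's stdin pipe, one job per line,
        # then close it so the worker knows there is no more work.
        for job in self.jobs:
            self.transport.write(job + b"\n")
        self.transport.closeStdin()

    def outReceived(self, data):
        # Results arrive through ordinary I/O handled by the main loop;
        # there is no shared data structure, so there is nothing to lock.
        sys.stdout.write("worker said: %r\n" % (data,))

    def processEnded(self, reason):
        reactor.stop()

if __name__ == "__main__":
    jobs = [b"job-1", b"job-2", b"job-3"]
    reactor.spawnProcess(WorkerProtocol(jobs), sys.executable,
                         [sys.executable, "worker.py"])
    reactor.run()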

You also need to find an effective portable method of inter-process communication. This is not much of a problem, but it is something you wouldn't have to do with threads.

You certainly have to do this with threads. It might appear as though you do not, and in some cases it may be slightly easier to implement, but if you don't track which objects are in use by which threads, mutex overhead will quickly cripple any performance gains that you'd see in a threaded application. So your IPC mechanism with threads is queues, or condition variables, rather than pipes or shared memory, but it's still there and it still requires a lot of maintenance.
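
And for contrast, here is roughly what that inter-thread communication looks like in Python. The names are invented, but the shape is the point: the pipe has simply been replaced by Queue objects, which are themselves built out of a mutex and condition variables.

# Hypothetical sketch: with threads, work and results still travel through
# an explicit channel (Queue objects); it just lives in-process.
import threading
import queue

work_q = queue.Queue()
result_q = queue.Queue()

def worker():
    while True:
        job = work_q.get()         # blocks until a job arrives
        if job is None:            # sentinel: no more work
            break
        result_q.put(job.upper())  # the same "trivial computation"

if __name__ == "__main__":
    t = threading.Thread(target=worker)
    t.start()
    for job in ["job-1", "job-2", "job-3"]:
        work_q.put(job)
    work_q.put(None)
    t.join()
    while not result_q.empty():
        print(result_q.get())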

In conclusion, I stand by one of arensito's last claims:

When I approached Twisted people with questions about these results I was told I was not worth listening to. Followers stated bluntly they were smarter than me.

In fact, he isn't worth listening to, and I am proud to say that the Twisted team is smarter than he is. More than that: he's worth not listening to. The "benchmark" that he has proposed does not test anything about Twisted, and does not test anything meaningful about his hypothesis.

There's lots of work to be done on Twisted, and it certainly has its share of performance problems. It's by no means the fastest system of its kind. I am always excited to hear about ways it can be improved, but don't just make up a bunch of lies, write a while loop, slap a graph on it, and claim you've discovered something better.

every program does this, here is how

There is a pretty common thing that various applications do now, generally touted as a major feature, such as "tabbed browsing" or "single-window chat" or whatever. Basically applications start out like this:

[Screenshot: MDI Windows]

and then at some point later they do this:

[Screenshot: MDI Notebook]

I have been spending some of my spare time messing around with PyGTK, so here is some sample code in Python which does this thing.




import gtk

from zope.interface import Interface, implements, Attribute

class IMultiPlug(Interface):
    """Implements multi-plug stuff.
    """

    def addPlug(plug, position=0):
        """
        Add a plug.

        @param plug: implementor of IPlug

        @param position: int - useful when using tabs or other visibly ordered
            MDI interfaces
        """

    def removePlug(plug=None):
        """
        Remove a plug.

        @param plug: implementor of IPlug
        """

    def iterPlugs():
        """Return an iterator of all plugs.
        """

class IPlug(Interface):
    gtkwidget = Attribute('a gtk.Widget')
    name = Attribute('a unicode name')
    icon = Attribute('a gdk.Pixbuf')

class Plug:
    implements(IPlug)

    def __init__(self, name, gtkwidget, icon=None):
        self.name = name
        self.gtkwidget = gtkwidget
        if icon is None:
            # Fall back to a stock icon from the default theme.
            icon = gtk.icon_theme_get_default().load_icon(gtk.STOCK_YES,
                                                          64, 0)
        self.icon = icon

class MultiWindow:
    implements(IMultiPlug)

    def __init__(self, prefix=''):
        self.windows = {}
        self.plugs = []
        self.prefix = prefix

    def addPlug(self, plug, position=0):
        win = gtk.Window()
        win.set_title(self.prefix + plug.name)
        self.windows[plug] = win
        win.add(plug.gtkwidget)
        win.show_all()
        self.plugs.insert(position, plug)
        # "delete_event" handlers receive (window, event).
        def killit(window, event):
            self.removePlug(plug)
        win.connect("delete_event", killit)
        win.set_icon(plug.icon)

    def iterPlugs(self):
        return self.plugs[:]

    def removePlug(self, plug):
        for enum, oplug in enumerate(self.plugs):
            if plug is oplug:
                # Detach the widget before destroying its window so it can
                # be re-parented elsewhere.
                self.windows[plug].remove(plug.gtkwidget)
                self.windows[plug].destroy()
                self.plugs.pop(enum)
                self.windows.pop(plug)
                return

class MultiTab:
    implements(IMultiPlug)

    def __init__(self, notebook, window):
        self.notebook = notebook
        self.window = window
        self.plugs = []

    def labelFactory(self, plug):
        hb = gtk.HBox()

        # Icon box
        icbox = gtk.Image()
        scaled = plug.icon.scale_simple(16, 16, gtk.gdk.INTERP_BILINEAR)
        icbox.set_from_pixbuf(scaled)
        icbox.set_padding(2, 0)
        hb.add(icbox)

        # Label
        hb.add(gtk.Label(plug.name))

        # Close Button
        cb = gtk.Button()
        cb.set_relief(gtk.RELIEF_NONE)
        im = gtk.Image()
        im.set_from_pixbuf(gtk.icon_theme_get_default().load_icon(gtk.STOCK_CLOSE,
                                                                  12, 0))
        im.show_all()
        cb.add(im)
        hb.add(cb)
        # Keep the close button from expanding to fill the tab label.
        hb.set_child_packing(cb, False, False, 0, gtk.PACK_START)

        def killit(ev):
            self.removePlug(plug)
        cb.connect('clicked', killit)

        return hb

    def addPlug(self, plug, position=0):
        lab = self.labelFactory(plug)
        self.notebook.insert_page(plug.gtkwidget, lab, position)
        self.plugs.insert(position, plug)
        plug.gtkwidget.show_all()
        lab.show_all()
        self.window.show_all()

    def removePlug(self, plug):
        for enum, oplug in enumerate(self.plugs):
            if oplug is plug:
                self.plugs.pop(enum)
                self.notebook.remove_page(enum)
                break
        # Hide the notebook window once its last tab is gone.
        if not self.plugs:
            self.window.hide()

    def iterPlugs(self):
        return self.plugs[:]


class Cycler:

    def __init__(self, mdi1, mdi2):
        self.mdi1 = mdi1
        self.mdi2 = mdi2
        win = gtk.Window()
        win.set_title("Cycler")
        but = gtk.Button("cycle MDI")
        but.set_size_request(200, 200)
        win.add(but)
        win.connect('destroy', gtk.main_quit)
        but.connect('clicked', self.swap)
        win.show_all()

    def swap(self, ev):
        for plug in self.mdi1.iterPlugs():
            self.mdi1.removePlug(plug)
            self.mdi2.addPlug(plug)
        self.mdi1, self.mdi2 = self.mdi2, self.mdi1

def test():
    mdi1 = MultiWindow('MDI Demo: ')
    n = gtk.Notebook()
    nw = gtk.Window()
    nw.set_title("MDI Notebook")
    nw.add(n)
    mdi2 = MultiTab(n, nw)

    widgets = [gtk.Button('Button %d' % x) for x in range(4)]
    import random
    itheme = gtk.icon_theme_get_default()
    for e, widget in enumerate(widgets):
        widget.set_size_request(300, 200)
        mdi1.addPlug(Plug('Plug #%d' % e, widget,
                          itheme.load_icon(random.choice(itheme.list_icons()),
                                           64, 0)))
    Cycler(mdi1, mdi2)
    gtk.main()


if __name__ == '__main__':
    test()