Unbelievably naive mistake, or politically motivated lies? You be the judge.
I have already spent way too much time talking about this piece of garbage,
so I am putting up a little persistent comment about it.
This so-called "benchmark" is nothing more than an insult. Yes, he has
graphs, but these graphs do not actually measure anything. They claim to
be performance metrics for an MDI application doing image
transformations.
The advocacy position which he is so anxious to misinterpret is this:
Twisted is an alternative to threads for multiplexed I/O. Many, many
applications multiplex I/O, and they do it in a variety of ways which are
inefficient and bug-prone, the most popular (and most dangerous) approach
being thread-per-connection. There is an explanation of the problem (as well
as a link to Twisted) at threading.2038bug.com.
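For readers unfamiliar with the term, here is a minimal sketch of multiplexed I/O: one thread serving several connections, with no locks anywhere. It assumes the standard library's selectors module rather than Twisted itself, and the function name echo_upper_multiplexed and the socketpair-based "connections" are hypothetical stand-ins for a real server.

```python
import selectors
import socket


def echo_upper_multiplexed(messages):
    """Serve several connections from one thread; return the replies in order."""
    sel = selectors.DefaultSelector()
    clients = []
    # One socketpair per "connection": a client end and a server end.
    for _ in messages:
        client, server = socket.socketpair()
        server.setblocking(False)
        sel.register(server, selectors.EVENT_READ)
        clients.append(client)

    # Every client sends its request before the server does anything.
    for client, message in zip(clients, messages):
        client.sendall(message)

    pending = len(messages)
    while pending:
        # Block until at least one server-side socket is readable.
        for key, _events in sel.select():
            conn = key.fileobj
            data = conn.recv(1024)
            conn.sendall(data.upper())  # tiny reply, fits the socket buffer
            sel.unregister(conn)
            conn.close()
            pending -= 1

    replies = []
    for client in clients:
        replies.append(client.recv(1024))
        client.close()
    sel.close()
    return replies
```

The loop never starts a thread per connection; it asks the operating system which sockets are ready and handles only those, which is roughly the job an event loop does on a program's behalf.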
He accuses Twisted developers of letting emotions get in the way. Am I
emotional about this? Yes, I am. I am angry about it for good reason. The
Twisted team has spent almost a hundred man-years of effort producing
something that we could be proud of, to provide our users with a better,
more test-friendly, less error-prone way of programming. It's not
a new idea, but I like to think we've brought something useful to the
table by providing an integrated platform that supports it.
I don't think that my efforts, or those of my colleagues, deserve to be
misrepresented in this way. I am upset and writing about this because I am
afraid that programmers who don't know any better will see graphs and think
that he's proposing a legitimate approach to solve some problem, and I will
one day be called upon to help them with their code. Code which will, thanks
to "benchmarks" like this one, inevitably perform extremely poorly and be a
total mess of race conditions. I am trying to improve the state of the art
in the industry, and it is a lot of work. Widespread circulation of an
afternoon's dalliance with a plotting program like the above can undo years
of work trying to educate people.
I thought that I would write some benchmarks where the graphs went in the
other direction, but the internet is already brimming with graphs that
demonstrate the superior scalability of multiplexing with events rather than
threads. Dignifying this sophomoric potshot with actual data would be a lot
more than it deserves. If you are truly skeptical, I would recommend
attending Itamar Shtull-Trauring's Twisted talk at PyCon 2005, "Fast
Networking with Python". This will give you a much clearer view of Twisted's
current performance problems than the graphs drawn by some script-kiddie who
has a grudge because he got banned from an IRC channel for spreading
lies.
Just as refuting the numbers would be a waste of effort, refuting every one
of the lies and/or misunderstandings in each paragraph could take all day.
Instead, I'll just debunk a sample of the most egregious stuff, and
hopefully the pattern won't be too hard to extrapolate, for those of you
unfamiliar with the subject matter.
"Threads will work on multiprocessor and hyperthreading machines
automatically. On similar hardware Twisted will use only 1/n of the
available processing power, where n is the number or virtual or physicial
[sic] processors."
What he means is, Python will only use 1/n of the available
processing power. For threaded code to actually make use of SMP, you need to
relinquish the global interpreter lock and write all parallelized code in
C. This is thanks to that lock, a problem which is hard to solve, since
making Python multi-CPU friendly actually makes it slower. Note that you
don't need to do anything special to take advantage of SMP if you use
Twisted's recommended, non-threaded way of parallelizing things, which is
spawning multiple worker subprocesses.
So, Twisted can use exactly as much processing power as Python, especially
because Twisted supports threads.
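As a rough illustration of the worker-subprocess approach, the sketch below starts one Python process per CPU-bound job and collects the results. It is an assumption-laden stand-in: it uses the standard library's subprocess module, not Twisted's own process support, and WORKER_SOURCE and run_workers are hypothetical names invented for the example.

```python
import subprocess
import sys

# Hypothetical CPU-bound job: sum the squares of the integers below n.
WORKER_SOURCE = "import sys; n = int(sys.argv[1]); print(sum(i * i for i in range(n)))"


def run_workers(sizes):
    """Start one subprocess per job, then collect every result."""
    # Start them all first so the OS is free to schedule them on separate CPUs.
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE, str(n)],
            stdout=subprocess.PIPE,
        )
        for n in sizes
    ]
    results = []
    for proc in procs:
        out, _ = proc.communicate()  # wait for this worker and read its answer
        results.append(int(out))
    return results
```

Each worker has its own interpreter and its own global interpreter lock, so the processes do not contend with each other the way Python threads do.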
"Twisted people think you should spawn a separate process for intensive
work. If you do this you need to synchronize resource access as you would
with threads. That is probably the most mentioned problem with
threads."
You don't need to synchronize resource access when you use subprocesses. You
can copy data to multiple subprocesses and serialize resource
access, without doing any extra work, since Twisted will deliver attempts at
resource access through normal I/O delivery channels, which are processed by
the main-loop. Also, you can run your subprocesses in isolation, without
concern for synchronized access to shared data structures. In the case which
he seems to be talking about, namely that of large shared-memory objects which
need to be mutated and then operated upon by multiple cooperating processes,
you can still avoid locking by using a tuple-space model of interaction and
delivering work to subprocesses through pipes rather than
delivering data. This still only has one simple program managing
the interactions of many, taking advantage of the OS's much-vaunted
thread-switching abilities, and doesn't require mutexes on every
operation.
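The idea of one program managing the interactions can be sketched as follows, assuming plain standard-library pipes rather than Twisted's machinery. The names WORKER_SOURCE and square_all are hypothetical, and the parent here reads and writes in a blocking fashion for brevity, where an event-driven program would receive the same traffic as events from its main loop. The point to notice is that the results dictionary is touched only by the parent, so it needs no mutex.

```python
import subprocess
import sys

# Hypothetical worker: read one number per line, write its square back.
WORKER_SOURCE = """
import sys
for line in sys.stdin:
    print(int(line) ** 2, flush=True)
"""


def square_all(numbers, worker_count=2):
    """Hand work to subprocesses over pipes; only the parent touches `results`."""
    workers = [
        subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        for _ in range(worker_count)
    ]
    results = {}  # the shared resource: owned by the parent alone
    # Deal the work out round-robin, one line per item.
    assignments = [(n, workers[i % worker_count]) for i, n in enumerate(numbers)]
    for n, worker in assignments:
        worker.stdin.write(f"{n}\n")
        worker.stdin.flush()
    # Each worker answers in the order it was asked, so reads line up with writes.
    for n, worker in assignments:
        results[n] = int(worker.stdout.readline())
    for worker in workers:
        worker.stdin.close()  # end of input lets the worker's loop finish
        worker.wait()
        worker.stdout.close()
    return results
```

Writing every request before reading any reply is only safe for small batches that fit in the pipe buffers; a real main loop would interleave the two.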
"You also need to find an effective portable method of inter-process
communication. This is not much of a problem, but it is something you
wouldn't have to do with threads."
You certainly have to do this with threads. It might appear as
though you do not, and in some cases it may be slightly easier to implement,
but if you don't track which objects are in use by which threads, mutex
overhead will quickly cripple any performance gains that you'd see in a
threaded application. So your IPC mechanism with threads is queues, or
condition variables, rather than pipes or shared memory, but it's still
there and it still requires a lot of maintenance.
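To make the point about queues concrete, here is a small sketch using only the standard library's threading and queue modules. The function name double_in_threads is hypothetical. Notice how much of the code is plumbing: two queues, a sentinel to shut the workers down, and a join, which together are the threaded program's communication mechanism.

```python
import queue
import threading


def double_in_threads(items, worker_count=2):
    """Double each item in worker threads that talk only through queues."""
    jobs = queue.Queue()
    done = queue.Queue()

    def worker():
        while True:
            item = jobs.get()
            if item is None:  # sentinel: no more work for this thread
                break
            done.put((item, item * 2))

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for item in items:
        jobs.put(item)
    for _ in threads:
        jobs.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    # Every result is already queued once the workers have exited.
    return dict(done.get() for _ in items)
```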
In conclusion, I stand by one of arensito's last claims:
"When I approached Twisted people with questions about these results I
was told I was not worth listening to. Followers stated bluntly they were
smarter than me."
In fact, he isn't worth listening to, and I am proud to say that the Twisted
team is smarter than he is. More than that: he's worth not
listening to. The "benchmark" that he has proposed does not test anything
about Twisted, and does not test anything meaningful about his
hypothesis.
There's lots of work to be done on Twisted, and it certainly has its share
of performance problems. It's by no means the fastest system of its kind. I
am always excited to hear about ways it can be improved, but don't just make
up a bunch of lies, write a while loop, slap a graph on it, and claim you've
discovered something better.