Let Me Tell You A Secret

In which I provide you with hundreds of dollars worth of software consulting, for free, in a single blog post.

I do consulting[1] on software architecture, network protocol development, Python software infrastructure, streamlined cloud deployment, and open source strategy, among other nerdy things. I enjoy solving challenging, complex technical problems or contributing to the open source commons. On the best jobs, I get to do both.

Today I would like to share with you a secret of the software technology consulting trade.

I should note that this secret is not specific to me. I have several colleagues who have also done software consulting and have reflected versions of this experience back to me.[2]

We’ll get to the secret itself in a moment, but first, some background.


Companies do not go looking for consulting when things are going great. This is particularly true when looking for high-level consulting on things like system architecture or strategy. Almost by definition, there’s a problem that I have been brought in to solve. Ideally, that problem is a technical challenge.

In the software industry, your team probably already has some software professionals with a variety of technical skills, and thus they know what to do with technical challenges. Which means that, as often as not, the problem is to do with people rather than technology, even if it appears otherwise.

When you hire a staff-level professional like myself to address your software team’s general problems, that consultant will need to gather some information. If I am that consultant and I start to suspect that the purported technology problem that you’ve got is in fact a people problem, here is the secret technique that I am going to use:

I am going to go get a pen and a pad of paper, then schedule a 90-minute meeting with the most senior IC[3] engineer that you have on your team. I will bring that pen and paper to the meeting. I will then ask one question:

What is fucked up about this place?

I will then write down their response in as much detail as I can manage. If I have begun to suspect that this meeting is necessary, 90 minutes is typically not enough time, and I will struggle to keep up. Even so, I will usually manage to capture the highlights.

One week later, I will schedule a meeting with executive leadership, and during that meeting, I will read back a very lightly edited[4] version of the transcript of the previous meeting. This is then routinely praised as a keen strategic insight.


I should pause here to explicitly note that (obviously, I hope) this is not an oblique reference to any current or even recent client; if I’d had this meeting recently it would be pretty awkward to answer that “so, I read your blog…” email.[5] But talking about clients in this way, no matter how obfuscated and vague the description, is always a bit professionally risky. So why risk it?

The thing is, I’m not a people manager. While I can do this kind of work, and I do not begrudge doing it if it is the thing that needs doing, I find it stressful and unfulfilling. I am a technology guy, not a people person. This is generally true of people who elect to go into technology consulting; we know where the management track is, and we didn’t pick it.

If you are going to hire me for my highly specialized technical expertise, I want you to get the maximum value out of it. I know my value; my rates are not low, and I do not want clients to come away with the sense that I only had a couple of “obvious” meetings.

So the intended audience for this piece is potential clients, leaders of teams (or organizations, or companies) who have a general technology problem and are wondering if they need a consultant with my skill-set to help them fix it. Before you decide that your issue is the need to implement a complex distributed system consensus algorithm, check if that is really what’s at issue. Talk to your ICs, and — taking care to make sure they understand that you want honest feedback and that they are safe to offer it — ask them what problems your organization has.

During this meeting it is important to only listen. Especially if you’re at a small company and you are regularly involved in the day-to-day operations, you might feel immediately defensive. Sit with that feeling, and process it later. Don’t unload your emotional state on an employee you have power over.[6]

“Only listening” doesn’t exclusively mean “don’t push back”. You also shouldn’t be committing to fixing anything. While the information you are gathering in these meetings is extremely valuable, and you should probably act on more of it than you will initially want to, your ICs won’t have the full picture. They really may not understand why certain priorities are set the way they are. You’ll need to take that as feedback for improving internal comms rather than “fixing” the perceived problem, and you certainly don’t want to make empty promises.

If you have these conversations directly, you can get something from it that no consultant can offer you: credibility. If you can actively listen, the conversation alone can improve morale. People like having their concerns heard. If, better still, you manage to make meaningful changes to address the concerns you’ve heard about, you can inspire true respect.

As a consultant, I’m going to be seen as some random guy wasting their time with a meeting. Even if you make the changes I recommend, it won’t resonate the same way as someone remembering that they personally told you what was wrong, and you took it seriously and fixed it.

Once you know what the problems are with your organization, and you’ve got solid technical understanding that you really do need that event-driven distributed systems consensus algorithm implemented using Twisted, I’m absolutely your guy. Feel free to get in touch.


  1. While I immensely value my patrons’ support and am eternally grateful for it, at less than $100 per month (as of this writing) it doesn’t exactly pay the SF Bay Area cost-of-living bill. 

  2. When I reached out for feedback on a draft of this essay, every other consultant I showed it to said that something similar had happened to them within the last month, all with different clients in different sectors of the industry. I really cannot stress enough how common it is. 

  3. “individual contributor”, if this bit of jargon isn’t universal in your corner of the world; i.e.: “not a manager”. 

  4. Mostly, I need to remove a bunch of profanity, but sometimes I will also need to have another interview, usually with a more junior person on the team, to confirm that I’m not relaying only a single person’s perspective. It is pretty rare that the top-of-mind problems are specific to one individual, though. 

  5. To the extent that this is about anything saliently recent, I am perhaps grumbling about how tech CEOs aren’t taking morale problems generated by the constant drumbeat of layoffs seriously enough.

  6. I am not always in the role of a consultant. At various points in my career, I have also been a leader needing to sit in this particular chair, and believe me, I know it sucks. This would not be a common problem if there weren’t a common reason that leaders tend to avoid this kind of meeting. 

Sourceforge Update

Authenticate downloaded binaries from sourceforge a little more.

When I wrote my previous post about Sourceforge, things were looking pretty grim for the site; I (rightly, I think) slammed them for some pretty atrocious security practices.

I invited the SourceForge ops team to get in touch about it, and, to their credit, they did. Even better, they didn't ask for me to take down the article, or post some excuse; they said that they knew there were problems and they were working on a long-term plan to address them.

This week I received an update from said ops, saying:

We have converted many of our mirrors over to HTTPS and are actively working on the rest + gathering new ones. The converted ones happen to be our larger mirrors and are prioritized.

We have added support for HTTPS on the project web. New projects will automatically start using it. Old projects can switch over at their convenience as some of them may need to adjust it to properly work. More info here:

https://sourceforge.net/blog/introducing-https-for-project-websites/

Coincidentally, right after I received this email, I installed a macOS update, which means I needed to go back to Sourceforge to grab an update to my boot manager. This time, I didn't have to do any weird tricks to authenticate my download: the HTTPS project page took me to an HTTPS download page, which redirected me to an HTTPS mirror. Success!

(It sounds like there might still be some non-HTTPS mirrors in rotation right now, but I haven't seen one yet in my testing; for now, keep an eye out for that, just in case.)
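
If you want to check a download for this yourself, the rule is simple: every hop in the redirect chain has to be HTTPS, because a single plain-HTTP hop is enough for someone on the network to swap the file. Here is a minimal sketch of that check, assuming you have already collected the chain of URLs (from your browser's developer tools, for instance). The function name and the mirror hostname are made up for illustration.

```python
from urllib.parse import urlsplit


def insecure_hops(redirect_chain):
    """Return every URL in the chain that is not served over HTTPS."""
    return [
        url for url in redirect_chain
        if urlsplit(url).scheme.lower() != "https"
    ]


# Hypothetical chain: an HTTPS download page that bounces to an HTTP mirror.
chain = [
    "https://sourceforge.net/projects/refind/files/0.9.2/refind-bin-0.9.2.zip/download",
    "http://mirror.example.invalid/refind-bin-0.9.2.zip",
]

for url in insecure_hops(chain):
    print("not authenticated:", url)
```

An empty result means the whole chain was authenticated; anything else means the download should not be trusted as-is.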

If you host a project on Sourceforge, please go push that big "Switch to HTTPS" button. And thanks very much to the ops team at Sourceforge for taking these problems seriously and doing the hard work of upgrading their users' security.

Don’t Trust Sourceforge, Ever

Authenticate downloaded binaries from sourceforge. A little.

Update: please see my more recent post about updates in the interim.

If you use a computer and you use the Internet, chances are you’ll eventually find some software that, for whatever reason, is still hosted on Sourceforge. In case you’re not familiar with it, Sourceforge is a publicly-available malware vector that also sometimes contains useful open source binary downloads, especially for Windows.


In addition to injecting malware into their downloads (a practice they claim, hopefully truthfully, to have stopped), Sourceforge also presents an initial download page over HTTPS, then redirects the user to HTTP for the download itself, snatching defeat from the jaws of victory. This is fantastically irresponsible, especially for a site offering un-sandboxed binaries for download, especially in the era of Let’s Encrypt where getting a TLS certificate takes approximately thirty seconds and exactly zero dollars.

So: if you can possibly find your downloads anywhere else, go there.


But, rarely, you will find yourself at the mercy of whatever responsible stewards[1] are still operating Sourceforge if you want to get access to some useful software. As it happens, there is a loophole that will let you authenticate the binaries that you download from them so you won’t be left vulnerable to an evil barista: their “file release system”, the thing you use to upload your projects, will allow you to download other projects as well.

To use it, first, make yourself a sourceforge account. You may need to create a dummy project as well. Sourceforge maintains an HTTPS-accessible list of key fingerprints for all the SSH servers that they operate, so you can verify the public key below.
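
Verifying a key against a published fingerprint means computing the fingerprint of the key you have and comparing the two by eye. The sketch below does that for a `known_hosts`-style line in the two formats OpenSSH commonly prints (unpadded base64 SHA-256, and colon-separated MD5 hex). Which of those formats the published list uses is something I am assuming you will check; the function names here are hypothetical, and the sample line is a toy value, not a real key.

```python
import base64
import hashlib


def key_blob(public_key_line):
    """Decode the base64 key field of a 'host keytype base64key' line."""
    fields = public_key_line.split()
    return base64.b64decode(fields[2])


def sha256_fingerprint(public_key_line):
    """Fingerprint in the 'SHA256:...' style (base64 without padding)."""
    digest = hashlib.sha256(key_blob(public_key_line)).digest()
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def md5_fingerprint(public_key_line):
    """Fingerprint in the older colon-separated MD5 hex style."""
    hexdigest = hashlib.md5(key_blob(public_key_line)).hexdigest()
    return ":".join(hexdigest[i:i + 2] for i in range(0, len(hexdigest), 2))


# Toy input for illustration only; substitute the real known_hosts line.
sample_line = "host.example.invalid ssh-rsa aGVsbG8="
print(sha256_fingerprint(sample_line))
print(md5_fingerprint(sample_line))
```

If neither output matches what the HTTPS page lists for the host, do not add the key to your `known_hosts`.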

Then you’ll need to connect to their upload server over SFTP, and go to the path /home/frs/project/<the project’s name>/.../ to get the file.

I have written a little Python script[2] that translates a Sourceforge file-browser download URL (one that you can get if you right-click on a download in the “files” section of a project’s website) into the corresponding path on the upload server, and runs the relevant scp command to retrieve the file for you. This isn’t on PyPI or anything, and I’m not putting any effort into polishing it further; the best possible outcome of this blog post is that it immediately stops being necessary.


  1. Are you one of those people? I would prefer to be lauding your legacy of decades of valuable contributions to the open source community instead of ridiculing your dangerous incompetence, but repeated bug reports and support emails have gone unanswered. Please get in touch so we can discuss this. 

  2. Code:

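    #!/usr/bin/env python3
    """Fetch a SourceForge file-browser download over scp instead of HTTP."""
    
    import os
    import re
    import subprocess
    import sys
    
    KNOWN_HOSTS = os.path.expanduser("~/.ssh/known_hosts")
    
    # Host key for web.sourceforge.net. Check it against SourceForge's
    # HTTPS-accessible list of SSH key fingerprints before trusting it.
    sfkey = """
    web.sourceforge.net ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA2uifHZbNexw6cXbyg1JnzDitL5VhYs0E65Hk/tLAPmcmm5GuiGeUoI/B0eUSNFsbqzwgwrttjnzKMKiGLN5CWVmlN1IXGGAfLYsQwK6wAu7kYFzkqP4jcwc5Jr9UPRpJdYIK733tSEmzab4qc5Oq8izKQKIaxXNe7FgmL15HjSpatFt9w/ot/CHS78FUAr3j3RwekHCm/jhPeqhlMAgC+jUgNJbFt3DlhDaRMa0NYamVzmX8D47rtmBbEDU3ld6AezWBPUR5Lh7ODOwlfVI58NAf/aYNlmvl2TZiauBCTa7OPYSyXJnIPbQXg6YQlDknNCr0K769EjeIlAfY87Z4tw==
    """
    
    
    def knows_about_web_sf_net():
        """Return True if known_hosts already has an entry for the host."""
        if not os.path.exists(KNOWN_HOSTS):
            return False
        with open(KNOWN_HOSTS) as read_known_hosts:
            for line in read_known_hosts:
                fields = line.split()
                # Blank lines have no fields; skip them instead of crashing.
                if fields and "web.sourceforge.net" in fields[0]:
                    return True
        return False
    
    
    def main():
        if len(sys.argv) != 2:
            sys.stderr.write("Usage: {} <download-url>\n".format(sys.argv[0]))
            sys.exit(1)
        sfuri = sys.argv[1]
    
        # for example,
        # http://sourceforge.net/projects/refind/files/0.9.2/refind-bin-0.9.2.zip/download
        # Accept both http and https links; the dots are escaped so that
        # only the real hostname matches.
        matched = re.match(
            r"https?://sourceforge\.net/projects/([^/]+)/files/(.+)/download$",
            sfuri,
        )
        if not matched:
            sys.stderr.write("Not a SourceForge download link.\n")
            sys.exit(1)
    
        project, path = matched.groups()
        sftppath = "/home/frs/project/{project}/{path}".format(
            project=project, path=path
        )
    
        if not knows_about_web_sf_net():
            with open(KNOWN_HOSTS, "a") as append_known_hosts:
                append_known_hosts.write(sfkey)
    
        # Pass an argument list so the local shell never parses the path.
        cmd = ["scp", "web.sourceforge.net:" + sftppath, "."]
        print(" ".join(cmd))
        sys.exit(subprocess.call(cmd))
    
    
    if __name__ == "__main__":
        main()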
    

Far too many things can stop the BLOB

Local mutable filesystem usage is a scalability problem.

It occurs to me that the lack of a standard, well-supported, memory-efficient interface for BLOBs in multiple programming languages is one of the primary driving factors of poor scalability characteristics of open source SaaS applications.

Applications like Gitlab, Redmine, Trac, Wordpress, and so on, all need to store potentially large files (“attachments”). Frequently, they elect to store these attachments (at least by default) in a dedicated filesystem directory. This leads to a number of tricky concurrency issues, as the filesystem has different (and divorced) concurrency semantics from the backend database, and resides only on the individual API nodes, rather than in the shared namespace of the attached database.

Some databases do support writing to BLOBs like files. Postgres, SQLite, and Oracle do, though it seems MySQL lags behind in this area (I’d love to be corrected on this front). But many higher-level API bindings for these databases don’t expose support for BLOBs in an efficient way.

Directly using the filesystem, as opposed to a backing service, breaks the “expected” scaling behavior of the front-end portion of a web application. Using an object store, like Cloud Files or S3, is a good option to achieve high scalability for public-facing applications, but that creates additional deployment complexity.

So, as both a plea to others and a note to myself: if you’re writing a database-backed application that needs to store some data, please consider making “store it in the database as BLOBs” an option. And if your particular database client library doesn’t support it, consider filing a bug.
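
To make the plea a little more concrete, here is a minimal sketch of what a “store it in the database” option could look like, using Python's built-in `sqlite3` module. Note that this deliberately swaps in a simpler technique than true file-like BLOB I/O: it splits each attachment into fixed-size chunk rows so that neither writing nor reading ever holds the whole file in memory. The table name, function names, and chunk size are all assumptions made up for this example.

```python
import io
import sqlite3

CHUNK_SIZE = 64 * 1024  # hypothetical; tune for your workload


def init_schema(conn):
    """Create the (hypothetical) table that holds attachment chunks."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS attachment_chunk ("
        " name TEXT NOT NULL,"
        " seq INTEGER NOT NULL,"
        " data BLOB NOT NULL,"
        " PRIMARY KEY (name, seq))"
    )


def store_attachment(conn, name, fileobj, chunk_size=CHUNK_SIZE):
    """Stream fileobj into the database chunk by chunk; return bytes stored."""
    total = 0
    seq = 0
    with conn:  # commits on success, rolls back if anything raises
        conn.execute("DELETE FROM attachment_chunk WHERE name = ?", (name,))
        while True:
            chunk = fileobj.read(chunk_size)
            if not chunk:
                break
            conn.execute(
                "INSERT INTO attachment_chunk (name, seq, data) VALUES (?, ?, ?)",
                (name, seq, chunk),
            )
            total += len(chunk)
            seq += 1
    return total


def iter_attachment(conn, name):
    """Yield the stored chunks in order, one at a time."""
    cursor = conn.execute(
        "SELECT data FROM attachment_chunk WHERE name = ? ORDER BY seq",
        (name,),
    )
    for (data,) in cursor:
        yield data


conn = sqlite3.connect(":memory:")
init_schema(conn)
payload = b"x" * 200_000
store_attachment(conn, "report.pdf", io.BytesIO(payload))
assert b"".join(iter_attachment(conn, "report.pdf")) == payload
```

Because the chunks are written in one transaction, the attachment shares the database's concurrency semantics instead of the filesystem's, which is the property the filesystem approach gives up.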

Ungineering

Don’t use the word “engineering” to refer to the process of creating software.

Update 2021: While I still stand by many of the ideas expressed in this essay — particularly “software is made out of feelings” — my views have been significantly changed by two follow-ups. If you're interested in this topic, you should read them both; they’ll teach you more than this will.

The first, “Reverse Ungineering”, by LVH, was a direct response to my post, based on personal experience being trained as a civil engineer and working as a software engineer. Reverse Ungineering changed my opinion almost immediately, so I actually held the view expressed in the summary for a very short period of time after publishing.

The second, “Are We Really Engineers?”, by Hillel Wayne, is a small but comprehensive ethnographic study of people who have done both jobs. It’s extremely eye-opening, and made me realize just how much of my idea of “engineering” was derived from a mixture of fiction and popular culture, and not based on any reality at all.

Both of these posts bring to bear informative facts based on direct personal experience, as opposed to my unsubstantiated hypothesizing. While I often still call myself a software “developer” or “author”, and I think that comparisons to fields like writing and research can also be illuminating, I do now call myself an engineer as well. The experience of writing this post and reading its rebuttals taught me an important lesson about not drawing conclusions from an imagined experience that some unfamiliar category of person — in this case, civil engineers — might have.

I am not an engineer.

I am a computer programmer. I am a software developer. I am a software author. I am a coder.

I program computers. I develop software. I write software. I code.

I’d prefer that you not refer to me as an engineer, but this is not an essay about how I’m going to heap scorn upon you if you do so. Sometimes, I myself slip and use the word “engineering” to refer to this activity that I perform. Sometimes I use the word “engineer” to refer to myself or my peers. It is, sadly, fairly conventional to refer to us as “engineers”, and avoiding this term in a context where it’s what everyone else uses is a constant challenge.

Nevertheless, I do not “engineer” software. Neither do you, because nobody has ever known enough about the process of creating software to “engineer” it.

According to dictionary.com, “engineering” is:

“the art or science of making practical application of the knowledge of pure sciences, as physics or chemistry, as in the construction of engines, bridges, buildings, mines, ships, and chemical plants.”

When writing software, we typically do not apply “knowledge of pure sciences”. Very little science is germane to the practical creation of software, and the places where it is relevant (firmware for hard disks, for example, or analytics for physical sensors) are highly rarified. The one thing that we might sometimes use called “science”, i.e. computer science, is a subdiscipline of mathematics, and not a science at all. Even computer science, though, is hardly ever brought to bear - if you’re a working programmer, what was the last project where you had to submit formal algorithmic analysis for any component of your system?

Wikipedia has a heaping helping of criticism of the terminology behind software engineering, but rather than focusing on that, let's see where Wikipedia tells us software engineering comes from in the first place:

The discipline of software engineering was created to address poor quality of software, get projects exceeding time and budget under control, and ensure that software is built systematically, rigorously, measurably, on time, on budget, and within specification. Engineering already addresses all these issues, hence the same principles used in engineering can be applied to software.

Most software projects fail; as of 2009, 44% are late, over budget, or out of specification, and an additional 24% are cancelled entirely. Only a third of projects succeed according to those criteria of being under budget, within specification, and complete.

What would that look like if another engineering discipline had that sort of hit rate? Consider civil engineering. Would you want to live in a city where almost a quarter of all the buildings were simply abandoned half-constructed, or fell down during construction? Where almost half of the buildings were missing floors, had rents in the millions of dollars, or both?

My point is not that the software industry is awful. It certainly can be, at times, but it’s not nearly as grim as the metaphor of civil engineering might suggest. Consider this: despite the statistics above, is using a computer today really like wandering through a crumbling city where a collapsing building might kill you at any moment? No! The social and economic costs of these “failures” are far lower than most process consultants would have you believe. In fact, the cause of many such “failures” is a clumsy, ham-fisted attempt to apply engineering-style budgetary and schedule constraints to a process that looks nothing whatsoever like engineering. I have to use scare quotes around “failure” because many of these projects classified as failed have actually delivered significant value. For example, if the initial specification for a project is overambitious due to lack of information about the difficulty of the tasks involved (an extremely common problem at the beginning of a software project), that would still be a failure according to the metric of “within specification”, but it’s a problem with the specification and not the software.

Certain missteps notwithstanding, most of the progress in software development process improvement in the last couple of decades has been in acknowledging that it can’t really be planned very far in advance. Software vendors now have to constantly present works in progress to their customers, because the longer they go without doing that, the greater the risk that the software will not meet the somewhat arbitrary goals for being “finished”, and may never be presented to customers at all.

The idea that we should not call ourselves “engineers” is not a new one. It is a minority view, but I’m in good company in that minority. Edsger W. Dijkstra points out that software presents what he calls “radical novelty” - it is too different from all the other types of things that have come before to try to construct it by analogy to those things.

One of the ways in which writing software is different from engineering is the matter of raw materials. Skyscrapers and bridges are made of steel and concrete, but software is made out of feelings. Physical construction projects can be made predictable because the part where creative people are creating the designs - the part of that process most analogous to software - is a small fraction of the time required to create the artifact itself.

Therefore, in order to create software you have to have an “engineering” process that puts its focus primarily upon the psychological issue of making your raw materials - the brains inside the human beings you have acquired for the purpose of software manufacturing - happy, so that they may be efficiently utilized. This is not a common feature of other engineering disciplines.

The process of managing the author’s feelings is a lot more like what an editor does when “constructing” a novel than what a foreperson does when constructing a bridge. In my mind, that is what we should be studying, and modeling, when trying to construct large and complex software systems.

Consequently, not only am I not an engineer, I do not aspire to be an engineer, either. I do not think that it is worthwhile to aspire to the standards of another entirely disparate profession.

This doesn’t mean we shouldn’t measure things, or have quality standards, or try to agree on best practices. We should, by all means, have these things, but we authors of software should construct them in ways that make sense for the specific details of the software development process.

While we are on the subject of things that we are not, I’m also not a maker. I don’t make things. We don’t talk about “building” novels, or “constructing” music, nor should we talk about “building” and “assembling” software. I like software specifically because of all the ways in which it is not like “making” stuff. Making stuff is messy, and hard, and involves making lots of mistakes.

I love how software is ethereal, and mistakes are cheap and reversible, and I don’t have any desire to make it more physical and permanent. When I hear other developers use this language to talk about software, it makes me think that they envy something about physical stuff, and wish that they were doing some kind of construction or factory-design project instead of making an application.

The way we use language affects the way we think. When we use terms like “engineer” and “builder” to describe ourselves as creators, developers, maintainers, and writers of software, we are defining our role by analogy and in reference to other, dissimilar fields.

Right now, I think I prefer the term “developer”, since the verb develop captures both the incremental creation and ongoing maintenance of software, which is so much a part of any long-term work in the field. The only disadvantage of this term seems to be that people occasionally think I do something with apartment buildings, so I am careful to always put the word “software” first.

If you work on software, whichever particular phrasing you prefer, pick one that really calls to mind what software means to you, and don’t get stuck in a tedious metaphor about building bridges or cars or factories or whatever.

To paraphrase a wise man:

I am developer, and so can you.