Trusting Trust is a Problem for Formal Methods—and That Should Scare Us

In 1984, Ken Thompson gave a Turing award lecture. You would think it'd be about his systems programming accomplishments: Multics, UNIX, the B language, etc. Instead, he gave Reflections on Trusting Trust a now-classic lecture outlining a particularly devious class of software bug that can be, in its most insidious form, self-replicating, undetectable, and absolutely compromising. His critical takeaway? "You can't trust code that you did not totally create yourself."

Well, what's one to do, then? Is every single person required to write everyting from scratch all the time? Are our computer science students to write their programs by hand, and only graduate to the privelege of running them once they've built their own hardware? It certainly seems a stretch. But Thompson's security concerns are valid; he demonstrated a real bug that real software certainly could have, and that's worrisome, even if it's unlikely any actual system contains a bug. Still more worrisome: it points out that it's not actually possible to know what assumptions are being made when evaluating security. "The compiler implements its specification." Who, actually, are you trusting? Since virtually all modern compilers are self-hosting, or are written in another language whose compiler is self-hosting (and always, invariably, run on some hardware), the answer, very often, amounts to "every single person who has worked on every single piece of software and hardware since 'memory' was cycling electric currents in mercury." For, if any of these people at any point along the chain decided to inject a Trojan horse, and had sufficient intellect to implement one sufficiently sophisticated (who are you to say they aren't? You're the one depending on them long-dead, after all) that one can propagate through subsequent recompilations, system wipes, novel processor implementations, and so on, all the way to a CVE 10 in your Debian credential database today. Yeah, maybe they didn't. These people are some of my heroes too. But, shouldn't it bother us that we can't rule out such a ridiculous possibility, deductively, once and for all?

Now more than ever, as we're allegedly on the dawn of AGI, this should be front-of-mind. While the greats of the past were smart, we will meet a whole different kind of smart of our own making. Which makes an AGI's inability to pull this cleverness off much more suspect than the men of the past. Would you trust silicon designed by a machine? You may have to (and probably, if Intel would tell us, you already are). Worse still, would you trust that these unknown men of the past were in the magic Goldilocks zone, being too dumb to implement a sustaining Trojan, and simultaneously smart enough to not have accidentally implemented a self-replicating bug of a kind unknown that an AGI can exploit?

This is a problem people are thinking about. In the open-source world, the Bootstrappable project is intent on making software cleanly source-auditable. But, as Thompson tells us, this is insufficient. Even if you did, through great effort, conduct a full source audit of every step along this audit process, you're still trusting your processor, your display, your bootloader, your firmware, your disk controller, Southbridge, microcode, kernel, shell, and every piece of low-level hardware and software you need to make running the 357-byte bootstrap seed possible at all. Not to mention the entire external system you use to get the sources you're auditing onto the boot disk in the first place.

In academia and embedded programming, there's renewed emphasis on formal verificion of critical programs. Modern CPU manufacturers, in order to avoid a repeat of the FDIV debacle. Even whole kernels have been formally verified at this point. These prevent needing to "totally create yourself" the entire software stack—once proven, common wisdom goes, you don't have to audit or even maintain the program! It's Bug Free™! But, these verifications have the same problem: who watches the watcher? They all presume correctness of the proof-checker, which is usually a programming language, and usually compiles to C. And, of course, they all run on hardware whose correctness is anyone's guess.

We're taking great pains to build out a chain of trust, through formal methods and full-source bootstraps. However, we've neglected its first link. There's no choice: you have to develop your own, visually-auditable, correct hardware. From the ground up. Johnny-von-Neumann style.

Through the Heart of Every Man

Trusting Trust is a Problem for Formal Methods—and That Should Scare Us

Table of Contents

1. The Problem…

2. And a Solution