In May Luciano Bello discovered a flaw in the OpenSSL package shipped with the Debian distribution of Linux which, as described in CVE-2008-0166, yielded "a random number generator that generates predictable numbers, which makes it easier for remote attackers to conduct brute force guessing attacks against cryptographic keys". While this sentence is unlikely to win any awards for clarity, the gravity of the situation should nonetheless be apparent.
Since September 2006 the Debian Etch distribution has been producing weak (public) keys for well-known protocols such as SSL and SSH, as well as certificates for personal encryption and signing. A few days after the announcement, SANS raised its INFOCon level to yellow, indicating an increased threat against the general internet infrastructure (the INFOCon level has since returned to green). While the offending Debian code is easily patched (only two lines need to be uncommented), it will take much longer to assess the impact of over 18 months of weak keys being handed out by Debian.
The Best Laid Plans
The root cause of the incident centres on memory allocation. Programs frequently request additional memory while they are running. The operating system (OS) satisfies such requests by finding an available block of memory, marking the block as allocated, and returning the starting address of the block to the requesting program.
The memory allocated in response to a request will not be empty - the block will always contain some pattern of zeros and ones. These values are typically whatever was left behind by the last program that used the allocated locations.
When a program is allocated memory it is good programming style for the programmer to assign known values to the locations so that subsequent computations involving these locations begin from a known state. For example, if one of the allocated locations were to be used for a variable hits that counts the number of visitors to your web site, then hits should be initialised to zero. If the last value assigned to the memory location of hits was 192 or -517, then simply incrementing hits for each new web site visitor will not achieve the desired result.
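The effect of leftover values can be imitated with a toy model. The sketch below is only an illustration, not how a real OS or C library manages memory: the ToyHeap class and its alloc and free methods are hypothetical names invented here, and the "memory" is just a Python bytearray. It shows a freed byte holding 192 being handed to a later request for the hits counter.

```python
class ToyHeap:
    """Toy fixed-size heap (hypothetical) that does not clear freed blocks."""

    def __init__(self, size: int) -> None:
        self.memory = bytearray(size)
        self.free_blocks = [(0, size)]  # list of (start, length) pairs

    def alloc(self, length: int) -> int:
        """Return the start address of a free block; contents are left as-is."""
        for i, (start, avail) in enumerate(self.free_blocks):
            if avail >= length:
                # Carve the request off the front of this free block.
                remainder = (start + length, avail - length)
                self.free_blocks[i:i + 1] = [remainder] if remainder[1] else []
                return start
        raise MemoryError("toy heap exhausted")

    def free(self, start: int, length: int) -> None:
        """Mark the block as available again without erasing its contents."""
        self.free_blocks.insert(0, (start, length))


heap = ToyHeap(16)

# An earlier "program" stores the value 192 in one byte and releases it.
old = heap.alloc(1)
heap.memory[old] = 192
heap.free(old, 1)

# A later "program" is handed the same byte for its visitor counter.
hits = heap.alloc(1)
stale_count = heap.memory[hits] + 1   # incremented without initialising

heap.memory[hits] = 0                 # initialise to a known state first
heap.memory[hits] += 1
clean_count = heap.memory[hits]

print(stale_count, clean_count)
```

With the stale value in place the first visitor is counted as number 193; after initialising to zero the count starts at 1 as intended.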
Code scanning tools are able to detect instances of memory that are allocated to a program but are not initialised before being used. The scanning tools will return warnings and advise that initial values be assigned. In general this is good advice unless the programmer had an unconventional reason for using uninitialised memory.
But the programmers of the OpenSSL PRNG apparently had such a reason. They actually wanted to have memory allocated for the seed to the PRNG and to use the previous values contained in these locations as part of the seed. Later, a Debian programmer, seeing that code scanning tools were complaining about uninitialised memory in the OpenSSL PRNG, took the well-intentioned step of commenting out the offending code, thereby removing it from the resulting executable program.
The change amounted to just two lines.
Commenting out these two lines (unfortunately) did not stop the PRNG from working but had the effect of reducing its seed to just the process identifier (PID) of the program. The PID is a value used by the OS to distinguish one process from another for administration purposes. The PID is typically a 15-bit value that is able to represent numbers in the range 0 to 32,767, or fewer than 33,000 distinct values in total. Cryptographic keys generated from a PRNG seeded in this way are selected from only a small number of possible values and can therefore be easily broken. Instead of being based on hundreds or thousands of bits of entropy, keys were based on at most 15 bits of entropy.
So in summary: the seed generation process for the OpenSSL PRNG was designed to rely on values previously assigned to dynamically allocated memory. This unusual programming practice was not well communicated, and a Debian programmer subsequently removed the two lines of code that seeded the PRNG through memory allocation. The effect was a PRNG that could start from fewer than 33,000 different seeds, which leads to a correspondingly small and predictable set of cryptographic keys.
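To make the 15-bit figure concrete, here is a minimal sketch of why a PID-only seed is fatal. It rests on loud assumptions: Python's random.Random stands in for the OpenSSL PRNG (it is not a cryptographic generator and is not what OpenSSL uses), and toy_keygen, recover_pid and the example PID 12345 are hypothetical names and values chosen for illustration.

```python
import math
import random
from typing import Optional

PID_SPACE = 2 ** 15  # PIDs 0..32,767, as described in the text


def toy_keygen(pid: int) -> int:
    """Hypothetical stand-in for key generation: the PID is the only seed."""
    rng = random.Random(pid)      # illustration only, NOT a cryptographic PRNG
    return rng.getrandbits(128)   # a 128-bit "key"


def recover_pid(key: int) -> Optional[int]:
    """Brute force: try every possible PID until the generated key matches."""
    for pid in range(PID_SPACE):
        if toy_keygen(pid) == key:
            return pid
    return None


victim_pid = 12345                    # arbitrary example value
victim_key = toy_keygen(victim_pid)

# The key is 128 bits long, but only 15 bits of it are unpredictable.
entropy_bits = math.log2(PID_SPACE)
found = recover_pid(victim_key)

print(entropy_bits, found is not None and toy_keygen(found) == victim_key)
```

However long the key is, an attacker needs at most 32,768 guesses, which takes a fraction of a second on ordinary hardware in this toy setting.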
The Impact of Changing Two Lines of Code
The scope of the impact of Debian creating weak keys is multidimensional: the time scales, other code distributions, security protocols, security credentials and finally data. With respect to time, the PRNG programming flaw has been present in Debian since September 2006 (the Etch release), and was not publicly detected until May this year, meaning that weak keys have been generated on the Debian distribution for over 18 months now. Further, other Linux distributions derived from Debian, including versions of Knoppix and Ubuntu, have also inherited the flaw over this time period.
Security protocols whose underlying cryptography was weakened include VPN, SSH, SSL and its derivatives such as FTPS, as well as all X.509 key material (general public/private key pairs). The full list of impacted security protocols and applications is quite extensive. Further, weak SSL certificates on internet-facing servers that are bound to a domain name, such as www.mycompany.com, will need to be revoked and reissued by the company administering that domain. An article in The Register, called Debian's Epic SSL Blunder, states that the number of SSL certificates that may need replacing could be in the hundreds of thousands or even millions. So while the OpenSSL PRNG code can be easily patched, identifying and replacing all the weak keys generated by the flawed code is a big operational headache. It may be months or years before all the weak keys and their corresponding certificates are tracked down and replaced.
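One consequence of so small a seed space is that the weak keys can, in principle, be enumerated in advance and compared against the keys an organisation has deployed. The sketch below illustrates only that idea and is not an actual audit tool: toy_weak_key again uses Python's random.Random as a hypothetical stand-in for the flawed generator, fingerprints are plain SHA-256 hashes of the toy key bytes, and the inventory (including the host name mail.mycompany.com and the PID 4242) is invented for the example.

```python
import hashlib
import random

PID_SPACE = 2 ** 15  # PIDs 0..32,767


def toy_weak_key(pid: int) -> bytes:
    """Hypothetical stand-in for a key from the flawed, PID-seeded generator."""
    rng = random.Random(pid)  # illustration only, NOT a cryptographic PRNG
    return rng.getrandbits(128).to_bytes(16, "big")


def fingerprint(key: bytes) -> str:
    """Identify a key by the SHA-256 hash of its bytes."""
    return hashlib.sha256(key).hexdigest()


# Precompute the fingerprint of every key the toy generator can produce.
weak_fingerprints = {fingerprint(toy_weak_key(pid)) for pid in range(PID_SPACE)}

# Invented inventory: one key from the toy generator, one placeholder key
# that did not come from it.
inventory = {
    "www.mycompany.com": toy_weak_key(4242),
    "mail.mycompany.com": bytes(range(16)),
}

to_reissue = sorted(
    host for host, key in inventory.items()
    if fingerprint(key) in weak_fingerprints
)
print(to_reissue)
```

The lookup itself is cheap. The operational burden described above lies in assembling the inventory and in revoking and reissuing every certificate that matches.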
Another serious threat is the potential exposure of data, including security credentials such as passwords, which were transported over public networks under the false assumption that they were protected by strong cryptography. It has already been suggested in a Heise Security article that secret service organisations have been exploiting the weaknesses in communication.
A Blackish Swan?
According to Wikipedia, a black swan "is a large-impact, hard-to-predict, and rare event beyond the realm of normal expectations". Are we dealing with a Debian Black Swan? It is too early to judge this incident as an outright Black Swan, or the catalyst for an imminent Black Swan, since the consequences have not fully played out. Certainly the incident has highlighted potential weaknesses in the open source programming model, if not in general, then at least in the interaction between open source developers and vendor developers. Ben Laurie of OpenSSL originally blasted Debian for changing code that they "did not understand", but in a subsequent post his tone was more conciliatory as evidence of the change being discussed on OpenSSL mailing lists was presented. His final remark was "I welcome any suggestions to improve this situation".
Flaws related to encryption always make good copy, and on occasion, strike at the heart of our fundamental beliefs in security. When encryption falters the whole edifice of security seems shaken.
You can find the research used to produce this post as a FreeMind mindmap rendered into Flash here.