Tuesday, August 19, 2008

Quantum Computing: are you Shor?

Quantum computing ringing the imminent death knell of encryption is a familiar refrain in the popular press. The end of a dynasty is close at hand. A recent article in IT News is typical when it states that advances in quantum computing are likely to render digital encryption techniques "all but obsolete". Not quite.

Quantum computers are very different from the ones we have on our desktops now, usually referred to as classical computing devices. A potential misconception is to think that quantum computers will teleport us to a future where the benefits of Moore's Law compounded over several centuries can be reaped. Intel's leading chip line, the x86, has steadily progressed from 286, 386, 486, Pentium and so on, but quantum computers will not be "1000-86" devices - unimaginably faster versions of what we have today.

Quantum computers are built from different computational primitives than our present devices. It is difficult to formally explain these differences but the main idea is that quantum computers can operate in many states simultaneously whereas classical computers are built from (bipolar) components that are interpreted to operate in one of two states - 0 or 1, on or off.

The massive parallelism furnished by quantum logic cannot as yet be used to solve general computational problems. A specific quantum algorithm must be tailored for each problem class of interest - such algorithms are more bespoke than off-the-shelf technology. The quantum algorithm that has received the most attention with respect to security was devised by Peter Shor, and proves that factoring is efficient on a quantum computer. When you read that current encryption methods will be rendered useless by quantum computation, the authors are usually referring to Shor's algorithm.

Shor's Algorithm

How does Shor's algorithm work? Well, it uses a quantum subroutine to solve a particular equation whose solution is likely to yield a factor of a number. For the last 30 years or so, most modern factoring algorithms have been converging on a single idea, which is the following. Say that N = P*Q is the number that you wish to factor. Assume that you can find two numbers X and Y such that

X^2 = Y^2 (mod N)

which means that the squares of X and Y leave the same remainder when divided by N. It then follows that

X^2 - Y^2 = 0 (mod N)

(X - Y)(X + Y) = 0 (mod N)

so that N divides (X - Y)(X + Y). It can be shown that there is at least a 50% chance that either (X - Y) or (X + Y) will be divisible by exactly one of P and Q, and therefore share a nontrivial factor with N. If this is the case, then N can be factored by computing gcd(X - Y, N) and gcd(X + Y, N).
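The gcd step above can be sketched in a few lines of Python (a toy illustration with made-up numbers, not a real factoring tool):

```python
from math import gcd

def try_magic_pair(x: int, y: int, n: int):
    """Given X^2 = Y^2 (mod N), try to extract a nontrivial factor of N."""
    assert (x * x - y * y) % n == 0  # the congruence of squares must hold
    for candidate in (gcd(x - y, n), gcd(x + y, n)):
        if 1 < candidate < n:
            return candidate
    return None  # unlucky pair (X = +/-Y mod N): try another

# Toy example: N = 91 = 7 * 13, and 10^2 = 100 = 9 = 3^2 (mod 91)
factor = try_magic_pair(10, 3, 91)
print(factor)  # → 7, since gcd(10 - 3, 91) = 7
```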

 

Let's call X and Y a magic pair, and if we can generate enough magic pairs for a given N then we can factor it (think of flipping a coin and waiting for it to turn up heads). So given that factoring is generally believed to be difficult, generating magic pairs must be difficult as well. Most modern factoring algorithms are in fact clever ways to generate magic pairs. While these methods are much faster than brute force approaches, they are far from being efficient.

The breakthrough from Shor was to devise an efficient quantum method for generating magic pairs, which therefore also makes factoring efficient. He did this by reducing the problem of finding magic pairs to what is called computing orders.

Let Z be a number such that 1 < Z < N and gcd(N, Z) = 1 (that is, a number less than N and relatively prime to N). When these conditions hold, there is a smallest number K < N such that

Z^K = 1 (mod N)

This value of K is called the order of Z (modulo N). If K is even, say equal to 2*D, then we have that

Z^(2*D)  =  (Z^D)^2  =  1 (mod N).

But we can take the square root of both sides of the equation

(Z^D)^2 = 1^2 (mod N)

to yield a magic pair by setting X = Z^D and Y = 1 (the pair only produces a nontrivial factor when Z^D ≠ ±1 mod N, which happens often enough). So finding a number with even order is essentially the same as finding a magic pair.

Shor invented a brilliant quantum algorithm for efficiently computing orders (the details of which are too complicated to reproduce or even summarise here). Then to factor a number N, just generate a random number Z and quantum compute its order. If the order is even, then create a magic pair and try to factor N; if N does not factor, or the order is not even, then choose another Z and repeat. You won't have to do too many iterations before N will be factored with high probability.
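The classical shell around the quantum step can be sketched as follows; here the order computation is done by brute force, which is exactly the part Shor's quantum algorithm makes efficient (a toy illustration for small N only):

```python
from math import gcd
import random

def order_mod(z: int, n: int) -> int:
    """Smallest K with Z^K = 1 (mod N) -- brute-force stand-in for the quantum step."""
    k, acc = 1, z % n
    while acc != 1:
        acc = (acc * z) % n
        k += 1
    return k

def factor_via_orders(n: int) -> int:
    """The classical wrapper described above: pick Z, compute its order, try the pair."""
    while True:
        z = random.randrange(2, n)
        if gcd(z, n) > 1:
            return gcd(z, n)       # lucky: Z already shares a factor with N
        k = order_mod(z, n)
        if k % 2 == 0:
            x = pow(z, k // 2, n)  # X = Z^D with D = K/2, giving X^2 = 1^2 (mod N)
            for f in (gcd(x - 1, n), gcd(x + 1, n)):
                if 1 < f < n:
                    return f
        # odd order, or a trivial pair: choose another Z and repeat

f = factor_via_orders(15)
print(f)  # → 3 or 5
```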

So it is not so much that factoring can be solved easily on a quantum computer, but rather that computing orders is easily solved on a quantum computer and factoring can be reduced to computing orders. Any cryptosystem whose security can be reduced to computing orders will also be insecure. Unfortunately this is the case for most well-known public key cryptosystems, including RSA, discrete logarithm based systems like Diffie-Hellman and DSA, as well as elliptic curve systems. It is easy to see how Shor's Algorithm quickly becomes a the-sky-is-soon-to-be-falling story line in the press.

However, there are public key systems that are not impacted by Shor's algorithm. One such scheme is the McEliece cryptosystem whose security is derived from intractable problems in coding theory, which has nothing to do with computing orders. The McEliece cryptosystem is not commonly deployed but perhaps its day will come.

Quantum Speculation 

We remarked at the beginning of this post that quantum computing is not a technology for cheating Moore's Law - quantum computers are not general devices that will bring with them exponential improvements over what we have today. We need to craft individual quantum algorithms for each class of problem that we want to solve rapidly on a quantum platform. Computing orders is one such general problem whose quantum solution has dire implications for most of today's common public key systems.

The quantum future that brings the end to the RSA dynasty is quite a few decades away, if not more. Your great grandchildren's RSA keys may be under threat, if we are still using RSA then. You may also have some reason for concern if you are encrypting or signing data with public keys that you are assuming will be secure for the next 50 years or so.

Today from a risk perspective quantum computers are a low-probability high-impact event - a coming catastrophe for many extant public key systems. In 100 years quantum computers are likely to be a high-probability high-impact event - that is, a commonplace tool being used for better or worse.

Our RSA keys are safe for some time yet, until quantum computation matures, and it may even turn out that large scale implementations with many quantum logic gates are difficult to engineer. Certain public key systems will survive even sophisticated quantum computer deployments in any case, unless as yet undiscovered specific quantum algorithms can be tailored to these systems as well. One can speculate endlessly.

Friday, August 8, 2008

Long Tail of Vulnerability for A5/1

Stream ciphers are a special class of cipher, often used for fast encryption of data streams such as dedicated network links or fax lines. Stream ciphers produce a key stream that is simply added (XOR-ed) with the data stream to produce ciphertext, and as such can achieve high encryption rates especially if implemented in hardware.
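The XOR construction is simple enough to sketch. The keystream generator below is a stand-in (a hash in counter mode) rather than any real stream cipher design, but the encrypt/decrypt symmetry it shows is exactly how A5/1 is used:

```python
import hashlib
from itertools import count

def keystream(key: bytes, nbytes: int) -> bytes:
    """Toy keystream: hash a counter under the key (not a real cipher design)."""
    out = bytearray()
    for block in count():
        out += hashlib.sha256(key + block.to_bytes(8, "big")).digest()
        if len(out) >= nbytes:
            return bytes(out[:nbytes])

def xor_stream(key: bytes, data: bytes) -> bytes:
    """Encryption and decryption are the same operation: XOR with the keystream."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))

msg = b"a 114-bit GSM frame, say"
ct = xor_stream(b"64-bit key", msg)
assert xor_stream(b"64-bit key", ct) == msg  # XOR-ing twice recovers the frame
```

Because XOR is its own inverse, the receiver simply regenerates the same keystream from the shared key and XORs again.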

Below is a diagram of the A5/1 stream cipher, invented in 1987 to protect GSM voice and data traffic. A5/1 loads a 64-bit secret key into its three shift registers, which are then clocked and combined with a frame vector to produce an initial state. A5/1 is then clocked as required to produce keystream bits to encrypt the next 114-bit GSM frame.

For over 10 years theoretical attacks on A5/1 have been postulated that lower the cost of attack significantly over searching the 64-bit key space directly. Amongst cryptographers and mobile security people, A5/1 is considered to be insecure and its 64-bit key size woefully inadequate by today's standards.

In February David Hulton, director of applications for the high-performance computing company Pico, and Steve Muller, a researcher for mobile security firm CellCrypt, announced new results which claim that A5/1 keys can be recovered in 30 minutes for $1000. The attack is both passive (no network injection or basestation masquerading) and can be executed remotely (no proximity to the target device required). Further, Hulton & Muller (H&M) are patenting their optimizations and intend to sell an A5/1 key recovery tool commercially. Cell call snooping may soon be affordable by many groups and people, not just well-resourced intelligence agencies.

The attack is based on pre-computing rainbow tables that enable the direct "lookup" of A5/1 keys when indexed by a small amount of observed traffic (3 to 4 GSM frames). While it remains impractical to create a rainbow table that has one entry for each of the possible 2^(64) A5/1 keys, H&M have devised a method that only requires a table of size 2^(58). Dedicated FPGA devices are currently being deployed to speed up the required rainbow table computations, which are expected to be completed by the end of March. By way of comparison, it is estimated that attempting the same computation on a lone PC would take 33 thousand years.
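The time-memory trade-off behind such tables can be illustrated with a toy rainbow-style construction over a 16-bit "key space". The cipher here is just a truncated hash, and none of H&M's actual optimisations are reproduced; the point is only how storing chain endpoints lets an attacker trade table size for online search time:

```python
import hashlib

KEY_BITS = 16                  # toy key space; real A5/1 keys are 64 bits
SPACE = 1 << KEY_BITS
CHAIN_LEN = 256

def cipher_output(key: int) -> int:
    """Stand-in for 'observed keystream under this key': a truncated hash."""
    h = hashlib.sha256(key.to_bytes(2, "big")).digest()
    return int.from_bytes(h[:2], "big")

def reduce_to_key(output: int, step: int) -> int:
    """Map an output back into the key space; varying with step limits chain merges."""
    return (output + step) % SPACE

# Precompute: store only (endpoint -> start) for each chain of CHAIN_LEN keys
table = {}
for start in range(0, SPACE, CHAIN_LEN):
    k = start
    for step in range(CHAIN_LEN):
        k = reduce_to_key(cipher_output(k), step)
    table[k] = start

def lookup(observed: int):
    """Try each position the observed output could occupy in some chain."""
    for pos in range(CHAIN_LEN - 1, -1, -1):
        k = reduce_to_key(observed, pos)
        for step in range(pos + 1, CHAIN_LEN):
            k = reduce_to_key(cipher_output(k), step)
        if k in table:                      # possible hit: replay the chain
            cand = table[k]
            for step in range(CHAIN_LEN):
                if cipher_output(cand) == observed:
                    return cand             # found a key producing the output
                cand = reduce_to_key(cipher_output(cand), step)
    return None  # coverage is probabilistic; some keys are simply missed

out = cipher_output(12345)
recovered = lookup(out)
assert recovered is None or cipher_output(recovered) == out
```

The table stores only one pair per chain of 256 keys, trading a longer online search for a 256-fold smaller table; scaled up, this is the style of trade-off that lets a 2^(58)-entry table attack a 2^(64) key space.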

The resulting rainbow tables will be several terabytes in size, and will be made available to the public. H&M also intend to sell a fast cracking version of the software that can break GSM encryption in just 30 seconds, charging between $200,000 and $500,000 for this privilege. Perhaps they will also offer a service recovering GSM keys from submitted traffic samples, giving rise to an instance of cryptanalysis-as-a-service (CaaS).

While researchers and security professionals view the H&M approach as cheaper and more efficient than previous attacks, the attack itself comes as no surprise given the known weaknesses of A5/1. The practicality of the H&M approach should be the final piece of evidence to convince operators to move off A5/1 as a basis for secure and private GSM communication. David Pringle, a spokesperson for the GSM Association (GSMA), has declined to comment on the H&M attack until further details are known.

A5/1 has operated unchanged for the last 21 years but it has now reached its cryptographic end-of-life, engulfed by the march of Moore's Law. However, the operational end-of-life of A5/1 may still be decades away as there are approximately 2 billion GSM subscribers, commanding about 80% of the global mobile market. This would be a tough product recall indeed. A5/1 is well-positioned to become the NT of the mobile crypto world, and I see the makings of a long tail of GSM vulnerability.

You can find the research used to produce this post as a FreeMind mindmap rendered into Flash here.



Tuesday, July 22, 2008

The Cold Boot Attack

In February, a team led by researchers at Princeton announced how to recover laptop encryption keys directly from the DRAM (dynamic RAM) of the device. This attack, dubbed "Cold Boot", set off a flurry of posts and articles expressing dismay, seeking clarification and listing defences. The contents of DRAM take several minutes to clear after power is shut off, and an attacker with physical access to the machine has an opportunity to recover the contents of memory over those few minutes. Contrary to popular belief, dynamic RAM is only somewhat dynamic.

Reproduced below are a series of images from the Princeton team depicting the DRAM decay of a Mona Lisa image over a period of 5 minutes. The striped image on the far right shows the ground states for the DRAM, which are the 0 or 1 states that the memory elements will eventually return to after power is removed. By the time DRAM reaches the striped ground state all information on encryption keys (and any other parameters) in memory is lost.


A common reaction from experienced security professionals was to state that Cold Boot attacks are well-known to exist in principle, and that the Princeton work has filled in the practical details. “[The cold boot attack has] been in the toolbox of forensics examiners for some time,” said Murugiah Souppaya, a researcher at the National Institute of Standards and Technology. The attack is more of a wake-up call than a bolt of lightning.

Hard-disk encryption vendors were also quick to point out that Cold Boot attacks exploit hardware vulnerabilities and not encryption weaknesses. The true risk comes from software that relies on power loss to purge keys from memory rather than explicitly clearing memory itself. The lesson for users is to ensure that power is shut off to DRAM when they are not using their laptops (either by shutting down the laptop or by putting it into hibernate mode).

When the key bits are extracted from memory, some will already have false values due to ground state decay. Pictured left is a decayed Mona Lisa image which still contains sufficient information to recognize the original image. The Princeton researchers have devised algorithms for detecting and recovering key bits corrupted by ground state decay, which, after stripping away all the fanfare of the attack, is the most significant contribution of the work. How do we recover the true key from a corrupted copy?

Coding theory is the study of exactly this problem, in the context of sending and receiving messages where messages bits can be corrupted (flipped) during transmission. Additional information, known as redundancy, must be added to each message so that errors can be detected and corrected. Without some redundancy there is little hope to find a handle on what errors have occurred.

Luckily the keys do in fact come with their own redundancy in DRAM. A user-supplied key to a block cipher is converted into an internal format that matches the operation of the cipher. For example, in the case of DES the user-supplied 56-bit key is converted into a 768-bit extended key, corresponding to the 48-bit sub-keys used in each of the 16 rounds of DES. Now the sub-keys themselves consist of bits that are just copies of bits from the user-supplied key. Each bit of the user-supplied key is guaranteed to occur at least 14 times in the extended key.

For efficiency reasons, the extended key is stored in DRAM rather than the user-supplied key. So for DES, and its triple-DES variants, each key bit will be stored at least 14 times in DRAM. Thus there will be (at least) 14 copies of each key bit to work with, which we can think of as 13 bits of redundancy for each key bit. PGP CTO, John Callas, likened this key decoding process to completing the remaining squares of a Sudoku game. While the analogy is not perfect, it is suggestive of the type of work that must be performed to decode partially decayed keys from memory.
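A minimal sketch of the idea is simple majority voting over the stored copies. This is a simplification: it assumes symmetric bit flips, whereas the Princeton algorithms are more sophisticated and exploit the fact that decay is biased toward the ground state.

```python
import random

random.seed(1)
key = [random.randint(0, 1) for _ in range(56)]  # a DES-sized key

# Simulate the >= 14 stored copies of each key bit, with each copy
# independently flipped by decay with probability 10% (symmetric, for simplicity).
decayed = [[b ^ (random.random() < 0.10) for _ in range(14)] for b in key]

def recover_bit(copies) -> int:
    """Majority vote over the noisy copies of one key bit."""
    return 1 if sum(copies) > len(copies) / 2 else 0

recovered = [recover_bit(copies) for copies in decayed]
print(sum(r != k for r, k in zip(recovered, key)))  # count of wrongly decoded bits
```

At a 10% decay rate, 8 or more of a bit's 14 copies must flip before the vote goes wrong, which has probability around 10^(-5) per bit, so the 56-bit key is almost always recovered exactly.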

For me, the Cold Boot incident is reminiscent of Richard Feynman's involvement in the 1986 Challenger disaster inquiry. There was a long debate on whether the O-rings would or would not lose their resilience under cold conditions, and Feynman demonstrably settled the question by placing a clamped O-ring into a glass of ice water and showing that it failed to spring back to shape when removed. At the time Freeman Dyson noted that this was an example of Nature providing a simple answer when asked a simple question.

The Princeton team asked Nature the simple question of whether DRAM is cleared on power loss, and the simple answer is no.

You can find the research used to produce this post as a FreeMind mindmap rendered into Flash here.


Sunday, July 20, 2008

Are AES 256-bit keys too large?

This may seem an odd question given that since the mid 70's discussions about cryptographic keys have mainly been concerned with their potential shortness. While the risks of using short keys are apparent, there are risks lurking at the other end of the scale as well. We can have too much of a good thing, and also think we have a good thing when in fact we don't.

The adoption of the AES cipher has progressed quite rapidly, greatly assisted by being mandated as a US government standard. As a technology AES has nonchalantly skipped along the Gartner hype cycle and nestled into the Plateau of Productivity. AES supports key lengths of 128, 192 and 256 bits, and many commercial offerings, to encrypt laptops or USB sticks for example, supply AES at the maximum 256-bit key length. Will attackers be able to bypass the high level of security implied by AES-256? In at least two significant cases, the answer is likely to be yes. But we require some background first.

The economics of AES keys

Modern block ciphers encrypt by mixing data over a series of rounds to produce a complex nonlinear relationship between the plaintext (message), ciphertext and key. For AES-128, AES-192 and AES-256 the number of rounds is 10, 12 and 14 respectively. So AES-256 requires a further 4 rounds over AES-128 to mix the additional key bits, and is consequently only 40% slower. Thus a 128-bit key length can be doubled (squaring the effort of exhaustive attack) for a relatively modest cost in performance. Therefore, unless performance is critical, maximal length 256-bit keys are a natural choice for AES deployments.

So given that AES-256 is likely to be deployed widely, what type of security can we expect? Well, when a secret key cipher has a key length of n bits this is interpreted to mean that, in the absence of any weaknesses, the security of the cipher is proportional to the cost of a brute force attack on a key space of size 2^n. That is, a needle-in-a-haystack search problem where there are 2^n pieces of hay. On average the correct key will be recovered after 2^(n-1) guesses, or searching half the key space. So AES-256 should be secure against an attacker who has materially less than 2^(255) ≈ 10^(76) resources at their disposal. Well, unless you have a need to approximate the number of hydrogen atoms in the observable universe, you will never encounter a number as large as 10^(76). So there is little hope in attacking AES-256 by direct brute force methods, but is this the only way to get at the data protected by such large keys?

AES and Passwords

It is common for laptop and USB encryption solutions to use one of the following key management systems:

  • Derive the device-encryption key directly from a user-supplied password, using a method such as the Password-Based Key Derivation Function (PBKDF2).
  • Randomly generate the device-encryption key, and then encrypt this key with a key derived from a user-supplied password.

The latter approach is preferable since it actually protects data with a full length key, and further does not require data re-encryption when the user changes their password. It is possible to create more intricate chains of key encrypting keys, similar to those encapsulating Russian dolls, but ultimately the security of the device-encryption key reduces to the strength of user passwords.

How large a password space do we need to provide the same security as a 256-bit key space? Assuming that the password consists of characters drawn from the set of 94 printable ASCII keyboard characters, the password would have to be 40 characters in length. That is, a user would have to select a random 40 character password to achieve the same security against a brute force guessing attack on an AES-256 key. Such passwords would probably take about 20 seconds to be entered by a typical person.
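The arithmetic behind the 40-character figure is a one-liner:

```python
import math

bits_per_char = math.log2(94)           # ≈ 6.55 bits per random printable character
print(math.ceil(256 / bits_per_char))   # → 40 characters to match a 256-bit key

# By contrast, a truly random 15-character password carries only about
print(round(15 * bits_per_char))        # → 98 bits of entropy
```

And 98 bits is the best case for 15 characters; real user-chosen passwords are far from uniformly random, so their effective entropy is lower still.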

In practice we can expect user passwords to be less than 15 characters in length, and certainly not chosen uniformly. So we have an example of strong cryptography being bypassed via far weaker passwords. Some laptop encryption vendors now offer two-factor authentication to protect their keys, but such solutions are not widely deployed as yet.

AES in Security Protocols

Another place where we need to re-evaluate the security of AES-256 is in protocols like S/MIME, SSL and TLS where data is protected by a bulk encryption method, but the key of the bulk encryption method is protected by a public key. For example, TLS now supports AES-256 as a ciphersuite, in particular in combination with RSA for key establishment. Since the AES-256 key is derived deterministically from the keying material protected by RSA encryption, the strength of the RSA key should be comparable to the strength of the AES-256 key. But how large does an RSA key need to be so that it provides similar security to an AES-256 key?

In 2001 this problem was considered by the well-known and respected researcher Arjen Lenstra, and his results indicate that the equivalent RSA key length to match AES-256 security would be more than 13,000 bits! Since RSA is commonly deployed in the range of 1024 to 4096 bits, we therefore have a huge discrepancy as compared to AES-256 security. The size of RSA keys can in theory be increased indefinitely, but computational efficiency becomes the limiting factor. For RSA and discrete logarithm systems that use n-bit keys, the computational cost of using these systems (particularly for decryption) is proportional to n^3 operations. So if the key size is doubled from n bits to 2n bits, then performance will decrease by a factor of approximately 8. Increasing a 1024-bit key 13-fold will reduce performance by a factor of over 2000.
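Using the rough n^3 cost model quoted above, the slowdowns are easy to reproduce (a back-of-the-envelope estimate, not a benchmark):

```python
def relative_cost(n_bits: int, base_bits: int = 1024) -> float:
    """Cost of n-bit RSA decryption relative to a 1024-bit key, assuming n^3 scaling."""
    return (n_bits / base_bits) ** 3

print(relative_cost(2048))          # → 8.0: doubling the key size costs about 8x
print(round(relative_cost(13000)))  # → 2046: over 2000x slower than 1024-bit RSA
```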

For protocols such as TLS and S/MIME, the security of AES-256 will be bypassed by attacking the public key system protecting the AES keys. In this case, without consideration for other contextual security parameters, AES-256 provides a false sense of security.

Length is not always Strength

AES 256-bit keys will certainly thwart direct brute force attacks far beyond our lifetimes. Adi Shamir, one of the co-inventors of RSA, has stated that "cryptography is typically bypassed, not penetrated". In this post we have considered two cases where this can happen: long keys being protected by weaker passwords, and long keys being protected by much shorter (weaker) keys. The moral seems clear: don't use very long keys in a given environment unless the security of the other parameters involved is similar, since the weakest link will be the target. Unfortunately you just won't find any other practical security parameter up to par with an AES-256 key.


Sunday, July 13, 2008

A Blackish Swan for Debian Crypto

In May Luciano Bello discovered a flaw in the Debian version of Linux which, as described in CVE-2008-0166, yielded "a random number generator that generates predictable numbers, which makes it easier for remote attackers to conduct brute force guessing attacks against cryptographic keys". While this sentence is unlikely to win any awards for clarity, the gravity of the situation should nonetheless be apparent.

Since September 2006 the Debian Etch distribution has been producing weak (public) keys for well-known protocols such as SSL and SSH, as well as certificates for personal encryption and signing. A few days after the announcement SANS raised its INFOCon level to yellow, indicating an increased threat against the general internet infrastructure (the INFOCon level has since returned to green). While the offending Debian code is easily patched (only two lines need to be uncommented) it will take much longer to assess the impact of over 18 months of weak keys being handed out by Debian.

The Best Laid Plans

The root cause of the incident centres around memory allocation. Programs frequently request additional memory when they are running. The operating system (OS) satisfies such requests by finding an available block of memory, marking the block as allocated, and returning the starting address of the block to the requesting program for access.

The memory allocated to a program request will not be empty - the block will always contain some values in terms of zeros and ones. These values are typically what was left behind by the last program that used the allocated locations.

When a program is allocated memory it is good programming style for the programmer to assign known values to the locations so that subsequent computations involving these locations begin from a known state. For example, if one of the allocated locations was to be used for a variable hits to count the number of visitors to your web site, then hits should be initialised to zero. If the last value assigned to the memory location of hits was 192 or -517, then simply incrementing hits for each new web site visitor will not achieve the desired result.

Code scanning tools are able to detect instances of memory that are allocated to a program but are not initialised before being used. The scanning tools will return warnings and advise that initial values be assigned. In general this is good advice unless the programmer had an unconventional reason for using uninitialised memory.

But the programmers of the OpenSSL PRNG apparently had such a reason. They actually wanted to have memory allocated for the seed to the PRNG and use the previous values contained in these locations as part of the seed. Later, another Debian programmer, seeing that code scanning tools were complaining about uninitialised memory in the OpenSSL PRNG, took the well-intentioned step of commenting out the offending code, removing it from the resulting executable program.

The offending code consisted of changes to just two lines.

Commenting out these two lines (unfortunately) did not stop the PRNG from working but had the effect of reducing its seed to be just the process identifier (PID) for the program. The PID is a value used by the OS to distinguish one process from another for administration purposes. The PID is typically a 15-bit value that is able to represent numbers in the range 0 to 32,767, or less than 33,000 distinct values in total. Cryptographic keys that were then generated from such a seeded PRNG are selected from only a small number of possible values and can therefore be easily broken. Instead of keys being based on hundreds or thousands of bits of entropy, keys were based on at most 15 bits of entropy.
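The effect is easy to simulate. The key derivation below is a stand-in (hashing the PID) rather than the actual OpenSSL code path, but it shows why 15 bits of seed entropy is fatal:

```python
import hashlib

def weak_key(pid: int) -> bytes:
    """Hypothetical key derived from a PRNG seeded only with the 15-bit PID."""
    return hashlib.sha256(pid.to_bytes(2, "little")).digest()[:16]

victim_key = weak_key(4321)  # generated on a system whose process had PID 4321

# An attacker needs at most 32,768 guesses to recover it
recovered = next(pid for pid in range(1 << 15) if weak_key(pid) == victim_key)
print(recovered)  # → 4321
```

A "key space" of 32,768 values falls to exhaustive search in well under a second on any modern machine, no matter how many bits the final key nominally contains.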

So in summary: the seed generation process for OpenSSL PRNG was designed to rely on using previously assigned values to dynamically allocated memory. This unusual programming practice was not well-communicated, and a subsequent Debian programmer removed the two lines of code that seeded the PRNG through memory allocation. The effect was to produce a PRNG that was only able to produce 33,000 different values, which leads to a predictably small number of cryptographic keys.

The impact of changing two lines of code


The scope of the impact of Debian creating weak keys is multidimensional: the time scales, other code distributions, security protocols, security credentials and finally data. With respect to time, the PRNG programming flaw has been present in Debian since September 2006 (the Etch release), and was not publicly detected until May this year, meaning that weak keys have been generated on the Debian distribution for over 18 months now. Further, other Linux distributions derived from Debian, including versions of Knoppix and Ubuntu, have also inherited the flaw over this time period.

Security protocols whose underlying cryptography was weakened include VPN, SSH, SSL and its derivatives such as FTPS, as well as all X.509 key material (general public/private key pairs). The full list of impacted security protocols and applications is quite extensive. Further, internet-facing servers with weak SSL certificates bound to a domain name, such as www.mycompany.com, will need to be revoked and reissued by the company administering that domain. An article in the Register, called Debian's Epic SSL Blunder, states that the number of SSL certificates that may need replacing could be in the hundreds of thousands or even millions. So while the OpenSSL PRNG code can be easily patched, identifying and replacing all the weak keys generated by the flawed code is a big operational headache. It may be months or years before all the weak keys and their corresponding certificates are tracked down and upgraded.

Another serious threat is the potential exposure of data, including security credentials such as passwords, which were transported over public networks under the false assumption that they were protected by strong cryptography. It has already been suggested in a Heise Security article that secret service organisations have been exploiting these weaknesses in communications.

A Blackish Swan?

According to Wikipedia, a black swan "is a large-impact, hard-to-predict, and rare event beyond the realm of normal expectations". Are we dealing with a Debian Black Swan? It is too early to judge this incident as an outright Black Swan, or the catalyst for an imminent Black Swan, since the consequences have not fully played out. Certainly the incident has highlighted potential weaknesses in the open source programming model, if not in general, then at least in the interaction between open source developers and vendor developers. Ben Laurie of OpenSSL originally blasted Debian for changing code that they "did not understand", but in a subsequent post his tone was more conciliatory as evidence of the change being discussed on OpenSSL mailing lists was presented. His final remark was "I welcome any suggestions to improve this situation".

Flaws related to encryption always make good copy, and on occasion, strike at the heart of our fundamental beliefs in security. When encryption falters the whole edifice of security seems shaken.

You can find the research used to produce this post as a FreeMind mindmap rendered into Flash here.


Sunday, June 22, 2008

RSA is 30, and counting


This year is the 30th anniversary of the journal publication for the RSA cryptosystem:

R. Rivest, A. Shamir, L. Adleman. A Method for Obtaining Digital Signatures and Public-Key Cryptosystems. Communications of the ACM, Vol. 21 (2), pp.120–126. 1978.

This is one of the most cited papers in all of computer science. The authors are pictured above around the time their RSA paper was published (Shamir, Rivest, Adleman - left to right). I learned about the mathematics of RSA in 1985 as part of a cryptography course based on Dorothy Denning's excellent introductory text Cryptography and Data Security. Over the years quite a few security professionals have confided in me that they do not really understand the basic mathematics of how RSA works. In this post I hope that I can tell the RSA story one more time, in a way that makes it easier to understand.

Background

Let E be a generic encryption operation, and let D be its corresponding decryption operation. For any message M, a basic property of any cryptosystem is that

M = D(E(M))

or in words, that decrypting an encrypted message yields the original message. We can then think of D(E( )) as a self-mapping, taking messages M to themselves under the action of some key. To explain RSA we are going to look at several well-known self-mapping functions from number theory, and then show how we can decompose these self-maps into two parts, which can be thought of as encryption and decryption.

All references below to numbers will mean non-negative integers: 0, 1, 2, 3 and so on. Number theory is mainly concerned with these numbers and their properties. Also, we will use the terms number and message interchangeably, since any message has a binary representation that can be interpreted as a number (see the standard PKCS #1 on a specific method for doing this).

Modular Exponentiation

Our source of self-mappings will be derived from modular exponentiation, the fundamental operation in many public key systems including RSA, DSA, Diffie-Hellman, and El Gamal systems.

We start with the simpler notion of exponentiation. Exponentiation is the operation of raising a base M to an exponent e, written as M^e, which simply means to multiply M by itself e times. When e = 2 exponentiation is called squaring, and called cubing when e = 3. For larger values of e we also describe M^e as "raising M to the e-th power".

There are two rules which can be used to simplify exponentiations, both of which are used in RSA: the product rule

M^(e*d) = (M^e)^d

and the sum rule

M^(e + d) = (M^e) * (M^d).

In modular exponentiation we are given an additional parameter m and we need to compute M^e mod m. For any value x, x mod m means to return the remainder of dividing x by m. So 17 mod 5 = 2 since 17 = 3*5 + 2, or in words, 5 goes into 17 three times with 2 left over (the remainder). Here m is called the modulus. So M^e mod m means to first compute M^e and then return the remainder upon dividing by m.
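As a quick illustration, here is the operation in Python (a sketch of mine, not from the original post). The built-in three-argument pow() computes M^e mod m efficiently by repeated squaring, without ever forming the possibly huge intermediate value M^e:

```python
# Modular arithmetic and modular exponentiation with Python built-ins.

print(17 % 5)           # 2: the remainder example from the text (17 = 3*5 + 2)
print(pow(2, 10, 100))  # 2^10 mod 100 = 1024 mod 100 = 24

# Same answer as computing the full power first, but pow(b, e, m)
# never builds the intermediate b^e.
print(pow(2, 10, 100) == (2 ** 10) % 100)  # True
```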

Flawed RSA v1

After that introduction, we need one simple result to take us forward, called Fermat's Little Theorem: let p be a prime and let M be any value in the range 0 < M < p, then

M^(p - 1) = 1 mod p

Since M^p = M^(p-1) * M by the exponentiation sum rule, and M^(p-1) = 1 mod p it then follows that

M^p = M mod p.
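This self-mapping is easy to spot-check in Python (my own sketch) for a small prime such as p = 7:

```python
# Spot-check Fermat's Little Theorem for the prime p = 7:
# M^(p-1) mod p should be 1, and therefore M^p mod p should be M,
# for every M in the range 0 < M < p.

p = 7
for M in range(1, p):
    assert pow(M, p - 1, p) == 1   # M^(p-1) = 1 mod p
    assert pow(M, p, p) == M       # M^p = M mod p
print("self-mapping holds for all M with 0 < M <", p)
```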

This is good news since we have now found a self-mapping function. Can we somehow decompose this self-map into two parts, one that we can call encryption and the other decryption?

Well imagine that Alice can find two values e and d such that e*d = p, and let Alice publish the pair (e, p) as her public exponent and her public modulus, respectively. If Bob wants to send her a message M she instructs him to encrypt it using the following modular exponentiation

C = M^e mod p

and Bob then sends C to Alice. When Alice receives C she recovers M by the following modular exponentiation

M = C^d mod p.

But why does this operation return M to Alice? Well, it's because

C^d = (M^e)^d = M^(e*d) = M^p = M.

Here we have used the exponentiation product rule: the exponentiation applied by Bob is itself exponentiated by Alice to return the original message.

But we have a snag. We said that Alice found e and d such that e*d = p, but since p is prime then the only possible solutions for e and d are 1 and p. But if e = 1 or e = p then the encryption operation M^e mod p applied by Bob has no effect on M since

M^1 = M^p = M mod p.

Encryption would be doing nothing.

But here is the key observation.

The property that permits Alice to recover M is not e*d = p but rather that

e*d = (p - 1) + 1 = 1*(p-1) + 1

and Alice can still recover M if e*d is equal to 2*(p-1) + 1, or 3*(p-1) + 1, or k*(p-1) + 1 for any k > 0. Why? Well using the product rule we have that

M^(k*(p-1)) = M^((p-1)*k) = (M^(p-1))^k = 1^k = 1 mod p,

and then using the sum rule

M^(k*(p-1) + 1) = M^(k*(p-1)) * M = M.

So we are back in business if we can find e and d such that e*d = k*(p-1) + 1, for some k > 0. But using modular arithmetic, this is the same as saying that

e*d = 1 mod (p - 1).

As it turns out there is something called the Euclidean algorithm (in its extended form) which is a method for efficiently solving for x in modular equations of the form a*x = b mod m. So we can just pick a random value for e in the range 1 < e < p - 1 that shares no factors with p - 1, and then use the Euclidean algorithm to find the corresponding value of d for which e*d = 1 mod (p - 1).

Let's see an example of this. Let p = 61 and then select e as e = 17. Then we have that d = 53 since 17 * 53 = 1 mod 60, or more specifically, 17 * 53 = 15 * 60 + 1. So encryption would then be defined as C = M^17 mod 61 and decryption defined as M = C^53 mod 61. Here our messages would need to be values less than 61, and hence at most 6 bits in length.
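The worked example takes only a few lines of Python (a sketch of mine, not from the post). Note that pow(e, -1, m), available from Python 3.8, computes the modular inverse that the text obtains via the Euclidean algorithm:

```python
# The deliberately flawed "RSA v1" example above: p = 61, e = 17.
# Illustrative only -- anyone can recover d from the public (e, p).

p, e = 61, 17
d = pow(e, -1, p - 1)     # solve e*d = 1 mod (p - 1); Euclidean-algorithm step
print(d)                  # 53, as in the text (17 * 53 = 15 * 60 + 1)

M = 42                    # any message with 0 < M < 61
C = pow(M, e, p)          # Bob: C = M^17 mod 61
print(pow(C, d, p) == M)  # Alice: C^53 mod 61 recovers M -> True
```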

But now we have another snag - quite a serious one this time. When Alice publishes her public key (e, p), what distinguishes her from other people is her knowledge of her decryption key d. But since e*d = 1 mod (p - 1) then another person Carol can easily determine d by simply using the Euclidean algorithm to solve e*d = 1 mod (p - 1) since e and p - 1 can be directly derived from the public (e, p).

To fix this snag we need to change our tactics and move from a public modulus that is a prime p to a value n = p*q that is the product of two primes p and q.

Correct RSA v2

The importance of Fermat's Little Theorem in the last section is that it gave us a self-mapping modulo a prime p. There is a generalisation of this result to composite numbers known as Euler's Theorem: for any n, let a be any value in the range 0 < a < n that shares no factors with n; then

a^phi(n) = 1 mod n

where phi(n) is the number of values in the range 1 to n that share no factors with n. Note that for a prime p we have phi(p) = p - 1, so Euler's Theorem includes Fermat's Little Theorem as a special case.

If we let n = p*q for two primes p and q then it is easy to show that phi(n) = (p-1)*(q-1). Now we can repeat the construction in the previous section by replacing computations with p - 1 (derived from Fermat's Little Theorem) with computations based on phi(n). To this end we select a random e in the range 1 < e < phi(n) that shares no factors with phi(n), and use the Euclidean algorithm to solve for d such that

e*d = 1 mod phi(n) or e*d = 1 mod (p-1)(q-1).

Alice's public key becomes (e, n) and Bob can encrypt a message M for her as C = M^e mod n. Alice decrypts the message as M = C^d mod n, which works because

C^d = (M^e)^d = M^(e*d) = M^(k*phi(n) + 1) = 1^k * M = M mod n.

Let's illustrate these principles with the RSA example from Wikipedia. Let Alice choose as her two primes p = 61 and q = 53. Then n = 61 * 53 = 3233 and phi(n) = (61 - 1)*(53 - 1) = 3120. If Alice selects her public exponent to be e = 17 then d = 2753 since

17 * 2753 = 46801 = 15 * 3120 + 1.

So the public key of Alice is then (17, 3233) and anyone can encrypt a message for Alice by computing C = M^17 mod 3233. Alice recovers the message by computing M = C^2753 mod 3233.
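The full toy system can be sketched in Python (my own illustration of the text's numbers; again pow(e, -1, phi) stands in for the Euclidean algorithm, and real RSA uses far larger primes plus a padding scheme such as PKCS #1):

```python
# The Wikipedia-derived RSA example from the text, end to end.

p, q, e = 61, 53, 17
n = p * q
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)       # solve e*d = 1 mod phi(n)
print(n, phi, d)          # 3233 3120 2753

M = 65                    # a message with 0 < M < n
C = pow(M, e, n)          # anyone: C = M^17 mod 3233
print(pow(C, d, n) == M)  # Alice: M = C^2753 mod 3233 -> True
```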

By now you should be able to follow this T-shirt description of RSA.

rsa

Let's think about Carol again. Can she easily determine Alice's private key d from her public parameters (e, n)? Well if she could determine phi(n) from n then she could use the Euclidean algorithm to solve for d in the equation e*d = 1 mod phi(n). But as was shown in the original RSA paper, deriving phi(n) from n is equivalent to having the power to factor n, which is widely believed to be a computationally hard problem.

We have left out many technicalities in this description of RSA, and to implement RSA in practice you need to solve many other problems, such as how to find large primes. For RSA to be secure, n must be sufficiently large to defeat the best known factoring algorithms. Using current estimates p and q should be at least 500 bits in length, and often over 1000 bits for additional security. The construction for RSA as outlined above is completely general - you can pick the primes as small or large as you like and the math still goes through. However, larger keys take longer to process.


Wednesday, June 18, 2008

Anonymity at the Edge

dan_egerstad_3 Pictured left is Dan Egerstad, a 21-year-old Swedish computer security consultant (at times also called a hacker, researcher and other colourful titles) who, nearly a year ago now, ran afoul of various authorities for his role in exposing account details harvested from the Internet-based anonymity service Tor. He obtained password information for 1,000 e-mail accounts belonging to foreign embassies, corporations and human rights organizations, and posted some details to his web site. For his trouble he was taken to the local police headquarters for questioning, and various computing equipment and written material were seized from his home. Egerstad claims he only wanted to draw attention to inherent weaknesses in Tor that he expects are not well known to its users. Another case of irresponsible disclosure, or of over-zealous awareness-raising? Before examining the details of the Egerstad case we need some background on anonymity (the research material used to write this article was collected in Freemind and is available as a flash web application here).

When we speak of anonymity here we mean anonymity with respect to a digital action, such as sending an email, paying a bill or browsing a site. Such actions are considered to be anonymous if they can be attributed to nobody or attributed to everybody. In practice it is very difficult to ensure that digital actions are attributed to nobody since Internet protocols are very generous in disclosing information as part of their operation. On the other hand, building systems that attribute actions to everyone is also impractical. Deployed internet anonymity solutions take the approach of making digital actions attributable to a sufficiently large set of users. So while it would be nice to have an anonymous email system whose messages could be attributed to anyone on the planet (once they all have email), the sender may feel that their anonymity is sufficiently protected if the system renders their email potentially attributable to a million people who were online when the email was sent. Anonymity loves a crowd, as the saying goes.

We will now give a short overview of how two well-known anonymity services work, and then get back to the transgressions of Mr. Egerstad.

Dealer-Based Systems

dogs-playing-poker In dealer-based systems, we imagine a large number of people sitting around a big table, with one person, Dixie, distinguished as the dealer. Each person at the table can send or receive a message from someone else at the table. When Alice wants to send a message to Bob, she puts the message in an envelope, writes Bob's name on the front, and passes the addressed envelope face down to Dixie. Dixie accepts the envelope from Alice, and continues collecting envelopes from people until she collects a pre-determined number of messages, let's say 100. Dixie then shuffles the 100 envelopes until their order is random as compared to the order in which they were received, and she then deals the envelopes out as they are addressed.

Someone at the table watching all this, say Eve, learns that Alice gave a message to Dixie, and that one of the people who was dealt an envelope by Dixie was her intended recipient (which must include Bob). So the anonymity of Alice's receiver might be said to be 1-in-100 assuming that the 100 messages collected by Dixie were delivered to 100 different people. If Bob is a popular fellow then he may have 10 messages dealt to him by Dixie, and the anonymity of Bob as Alice's receiver is reduced (there are fewer possible receivers to consider).

Here the value of 100 is called the threshold, and reaching the threshold number of messages triggers Dixie to shuffle and deliver. Alice may search for another table where the dealer Debbie has a threshold of, say, 1000, because she wants to hide her recipient amongst a larger set of people. Alice will wait longer for her messages to be delivered since it will take Debbie more time to collect her threshold of 1000 envelopes as compared to Dixie's 100. There is a tradeoff here between anonymity and latency: that is, how many other messages your message is shuffled with (the size of the threshold) versus the delay in delivering your message (waiting for the dealer to gather the threshold).

The dealer can pull other tricks. A dealer Dylan may have a threshold of say 500 messages but on the first collection he actually accepts 600 envelopes. The 600 envelopes are shuffled, but he puts aside 100 (called the pool) and then delivers the other 500. The next 500 received messages are shuffled with the current pool, 100 set aside for the next pool, and the remaining 500 messages delivered. The pool breaks the direct correlation between when Alice submits her message and when it is dealt by Dylan. Eve will need to consider more potential receivers to determine who Alice might be communicating with.
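The threshold-and-pool behaviour described above can be sketched as a toy simulation (entirely my own illustration: envelopes here are just (sender, recipient) tuples, whereas real MIX systems of course operate on encrypted messages):

```python
# Toy simulation of a threshold-and-pool dealer like "Dylan": collect
# envelopes, shuffle them with the held-back pool, keep a new pool of 100,
# and deliver the remaining 500 in scrambled order.

import random

THRESHOLD, POOL = 500, 100

def dealer_round(incoming, pool):
    """Mix incoming envelopes with the current pool, hold back POOL of them,
    and deliver the rest in shuffled order."""
    batch = incoming + pool
    random.shuffle(batch)
    return batch[:POOL], batch[POOL:]   # (new pool, delivered envelopes)

# First round: Dylan accepts THRESHOLD + POOL envelopes before dealing.
round1 = [(f"sender{i}", f"recipient{i}") for i in range(THRESHOLD + POOL)]
pool, delivered = dealer_round(round1, [])
print(len(delivered), len(pool))        # 500 100

# Later rounds: THRESHOLD new envelopes are mixed with the existing pool,
# breaking the correlation between submission time and delivery time.
round2 = [(f"sender{i}", f"recipient{i}") for i in range(600, 600 + THRESHOLD)]
pool, delivered = dealer_round(round2, pool)
print(len(delivered), len(pool))        # 500 100
```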

What we have described here as dealer-based systems gives the simple intuition behind MIX anonymity systems, originally proposed by David Chaum, a prolific and influential cryptographic researcher. Chaum's seminal ideas have been greatly elaborated on in the last 20 years or so, and for a practical internet service providing anonymity based on the MIX principle, please see JonDonym.

Lotto-Based Systems

lottery_balls_tall The other class of anonymity systems we will consider shall be referred to as lotto-based systems (these are the ones that got Mr. Egerstad in hot water). We are proceeding by analogy as before with dealer-based systems. In the game of lotto players are attempting to predict which balls will be selected by a mechanical process that seems very random. A collection of numbered balls (say 40) are released into a clear and hard plastic sphere or cylinder, which when rotated, causes the balls to bounce around in an unpredictable manner. After a "sufficiently large" number of revolutions, an arm reaches into the sphere and catches one of the balls as it is bouncing around. The selected ball becomes the first number. The arm then reaches in again, catches another ball, and so on, until the required number of balls to complete the game are selected (say 6). All the time the sphere is rotating, and the remaining balls bounce off each other and the hard surface of the sphere.

Imagine now that instead of adding 40 balls at the beginning to the sphere, we keep on adding new balls, for example at the rate of a few per second. Also imagine that the arm does not just pull out 6 balls but rather it continuously extracts one ball at a time. We might think of this setup as the perpetual lotto system with balls continuously entering and leaving the system. If we put a ball into the sphere, how long would it take to come out? If we took the numbers off the balls (let them all just be white balls), when would we know that a given ball has come out?

These principles underpin several anonymity systems, such as Crowds. We replace the balls with messages and the rotating sphere becomes a collection of servers that randomly route (bounce) the messages amongst themselves before delivering each message to its intended recipient. In dealer-based systems the anonymity of Alice's receiver was tied to the size of the threshold used by the dealer. In lotto-based systems Alice's message is potentially mixed with all messages in the system at the same time, which may be a large number depending on the traffic rate into the system.

Tor - The onion router

Tor is a particular, and seemingly popular, implementation of an anonymity network that falls into our simple lotto-based classification. Tor is short for "The onion router", and is considered to be a second generation version of onion routing for anonymous communication (onion routing 2.0 in common parlance). Here an onion refers to a message that is encrypted multiple times, or in layers. When Alice wants to send a message she first chooses a path through the Tor network based on a list of available servers. So instead of her message bouncing around randomly in the system (as in Crowds), she actually chooses the route before submitting the message. To protect her selected path she encrypts the message with the public key of the last server in the path, then encrypts that encrypted message with the key of the second last server in the path, and so on, for each server in her chosen path. The following diagram shows the process (see this wikipedia article for better resolution and more explanation).

Onion_diagram

Once all the layered encryption is completed, Alice sends her encrypted message (now called an onion) to the first server she selected in the path. The server decrypts one layer of the onion to reveal yet another onion and the next server in the path. The first server sends the onion to the second server in the path who decrypts the next layer and forwards the remaining onion to the next named server in the path, and so on. The final server in the path decrypts the onion one final time to reveal the original message and the intended recipient. The final server delivers the message to Alice's intended recipient.
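The layered construction can be sketched in a few lines (my own illustration: a toy XOR cipher stands in for each server's public-key encryption, so this shows only the wrapping and peeling of layers, not real security):

```python
# Onion layering with a toy XOR "cipher" per server -- illustrative only;
# XOR with a fixed key is NOT secure.

def toy_encrypt(key: int, data: bytes) -> bytes:
    return bytes(b ^ key for b in data)

toy_decrypt = toy_encrypt  # XOR is its own inverse

# Alice's chosen path, in order, with each server's (toy, hypothetical) key.
path_keys = {"server1": 0x1F, "server2": 0x2B, "server3": 0x3C}

# Build the onion: encrypt for the last server first, then wrap outward.
message = b"hello Bob"
onion = message
for server in reversed(list(path_keys)):   # server3, server2, server1
    onion = toy_encrypt(path_keys[server], onion)

# Each server in path order peels exactly one layer.
for server in path_keys:                   # server1, server2, server3
    onion = toy_decrypt(path_keys[server], onion)

print(onion)   # b'hello Bob': the exit node ends up holding the plaintext
```

Note how the exit node's final decryption yields the plaintext message, which is exactly the exposure that Egerstad exploited.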

Back to Dan

In Tor there are three roles for server nodes: entry nodes that accept messages into the network, routing nodes that decrypt one onion layer and then forward the remaining onion, and finally exit nodes that deliver the message out of the network to the intended recipient. Any physical server in the Tor network can act in any of these node roles. What is revealed about Alice? Well the entry node sees her IP address, while the routing nodes learn the previous and next nodes in the path. However the exit node learns the message itself and the address of the recipient. If Alice wants this last hop to be secure then she must explicitly use a protocol like SSL to protect her message. But apparently this is a little-known fact, as Dan Egerstad demonstrated. This caveat is (now) clearly indicated on the Tor project site, along with a general warning: Be smart and learn more. Understand what Tor does and does not offer. This list of pitfalls isn't complete, and we need your help identifying and documenting all the issues.

Servers can be donated to the Tor network to act as nodes, and currently there are about 1500 nodes. So Dan Egerstad created/donated 5 exit nodes and sniffed the traffic contents of these nodes as they forwarded traffic out of the Tor network. He was not targeting any specific users of Tor, only looking at the traffic that was routed to his exit node as part of the overall Tor protocol.

Surprisingly Egerstad obtained login and password information for over 1,000 e-mail accounts belonging to foreign embassies, corporations and human rights organizations. More specifically he harvested information on users from embassies belonging to Australia, Japan, Iran, India and Russia, and also account information belonging to the foreign ministry of Iran, the United Kingdom's visa office in Nepal and the Defence Research and Development Organization in India's Ministry of Defence. Egerstad ensured controversy (and more than that actually) when he published the login and password details for 100 select email accounts, using the justification that he felt it would be the most effective way to make the account owners aware that their communication had been compromised. Mr. Egerstad has stated that there is no security flaw with Tor - the real threat comes from user expectations that their message contents are being protected end-to-end by Tor, when in fact encryption is only applied to internal Tor network communication.

Lessons Learned

Mr. Egerstad has highlighted that Tor obfuscates the path that a message follows through the Tor network (protecting the sender), but confidentiality is not provided at the exit point to the network. Of course the Tor people have documented this property, but there is nothing like an incident to drive the message home. Users should also understand that the role of the exit nodes is still based on the honour system, for both delivery and respecting the privacy of message content.

The main threat addressed by the design of Tor is protecting the identity of Alice from eavesdroppers - that is, it should be difficult to correlate the identity of a sender through the routing of her messages in the Tor network. So if an attacker is targeting Alice they may have to compromise a large number of nodes (or introduce a large number of their own nodes) into the Tor network to trace a path back to Alice. This sounds like hard work. Mr. Egerstad has shown how to harvest account details by merely attaching a new exit node and eavesdropping on the Tor business-as-usual traffic that is routed through the node by user path choices. So while passive eavesdropping may not compromise a specific person like Alice, it can still reveal valuable security information for the attacker.

Providing open internet-based anonymity services is technically difficult, and the current set of services being offered approximate true anonymity. True anonymity, or anything approaching it, can only be provided in closed systems where all participants work synchronously, sending messages of a predetermined size at predetermined times. It would take a long time to justify this statement. I refer the interested reader to the excellent set of lectures on anonymity primitives and communication by George Danezis. Research in providing better anonymity systems is still an active area.


Monday, June 2, 2008

Goodbye Yellow Brick Road


In 2003 the Computer Research Association sponsored a workshop on the Grand Research Challenges in Information Security & Assurance. The by-invitation-only event brought together 50 scientists, educators, business people, futurists, and others who have some vision and understanding of the big challenges (and accompanying advances) that should shape the research agenda in this field over the next few decades. The final report listed 4 main challenges worthy of sustained resourcing and effort:
  1. Eliminate epidemic-style attacks within the next decade
  2. Develop tools and principles that allow construction of secure large-scale systems
  3. Give end-users security controls they can understand and privacy they can control for the dynamic, pervasive computing environments of the future
  4. Develop quantitative information-systems risk management to be at least as good as quantitative financial risk management within the next decade.

In the 4th challenge security (risk) professionals are being asked to follow the yellow brick road to the emerald city of quantitative financial risk management (QFRM) and the wizards therein. A recent article from a May issue of the Economist examines the state of QFRM in light of the subprime debacle, highlighting the $30 billion write down of UBS (Used to Be Smart) as the (sub)prime example of flawed risk management. The outlook in the emerald city is professionally gloomy.

One of the main quantitative culprits identified by the Economist is Value-at-Risk, usually written as the contraction VaR (presumably to distinguish it from VAR and Var, long-standing contractions for the variance). VaR is the centrepiece in many QFRM toolboxes, being honoured with inclusion in the Basel II guidelines for calculating reserve capital. But the subprime debacle has highlighted one of the well-known weaknesses of VaR - that it is poor at predicting the low-probability/high-impact events that attend catastrophes.

VaR is essentially a simple concept supported by arbitrarily complex modelling (see this paper from the upcoming WEIS 2008 conference for VaR applied to information security). Let A be an asset for which we may realise a loss over a defined time period T. Given a level of significance a, the VaR of A is the maximum loss L that will occur over the time period T with probability 1 - a. So if A is a stock of interest over the next T = 100 days, and we fix a to be 0.01, then the VaR of A is the maximum loss L that will occur over the next 100 days with probability 0.99 (or 99% of the time). The interpretation here is that the losses from A will be at most L on 99 out of 100 days of trading.
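The definition can be made concrete with a small numeric sketch (my own illustration, not from the Economist or the WEIS paper): with a = 0.01, the VaR estimate is simply the 99th percentile of the loss distribution, here taken over simulated daily losses.

```python
# Historical-simulation VaR at the 99% level on toy data.

import random
random.seed(1)

# Simulate 10,000 daily losses (positive = money lost), normally distributed.
losses = [random.gauss(0, 1000) for _ in range(10_000)]

a = 0.01
sorted_losses = sorted(losses)
var_99 = sorted_losses[int((1 - a) * len(sorted_losses))]  # 99th percentile

# Check the interpretation: losses should exceed the VaR figure on
# roughly 1 day in 100 -- and VaR says nothing about how bad those days are.
worse_days = sum(1 for x in losses if x > var_99)
print(worse_days / len(losses))   # close to 0.01 by construction
```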

But what about that other one day of trading not covered? The magnitude of the loss on that rogue day is outside the scope of the VaR model, and need not be proportional to the loss bound predicted by VaR. In fact, VaR is designed to answer questions of the form "Within reason, how bad can things get?", which seem very sensible until we acknowledge that the subprime debacle was not "within reason". As the Economist observes, after a few years of bootstrapping and estimation, VaR models are transferred onto historical data and settle down to predicting a stable future from a stable past, leading to the conceit that risks can be quantified and regulated.

One pyrrhic benefit from the subprime debacle is that VaR models can now be recalibrated with catastrophic data sets, and should therefore produce better predictions. The Economist notes a new interest in non-statistical models based on enumerating risk scenarios that describe what could go wrong, and then thinking through the consequences of the scenario crystallizing. Scenario generation is typically a people-intensive activity, facilitated through brainstorming and workshops - not the forte of quants. Nonetheless scenario-driven risk analysis (SDRA) has the ability to uncover root causes and dependencies that may be absent or insufficiently weighted in quantitative models. On the other hand, the SDRA may fail to generate an exhaustive set of material scenarios, and more mundanely, poorly facilitated sessions can lead to futile bickering over rating the significance of scenarios.

tangle Regardless of the model used, the Economist notes that risk management is becoming less tractable due to complexities and dependencies. Partitioning risks into those related to credit, markets and liquidity is no longer sufficient since the risk inherent in some of the subprime financial products did not respect these organisational boundaries. In short, we have entanglement. Further, obtaining aggregate risk positions is becoming more difficult since some departments still maintain their risk exposure on desktop Excel models, and over-the-counter dealings that are not formally traded also contribute to uncertain aggregate positions. For shareholders, and even regulators, it is very difficult to unwind and assess the risk exposure of a company.

What conclusions might we have for IT (Security) risk management? Rich Bejtlich has commented on the Economist article, and made direct comparisons to the difficulties of risk in financial environments to those in IT environments. The good news is that we in IT Risk should no longer feel compelled to wed our futures to the QFRM yellow brick road, and perhaps we are better served by SDRA. We can also stop beating ourselves up on the point that the weakness of IT Risk is the absence of data - the real weakness is poor modelling, and the decisions based on the output of such models. The Computer Research Association grand challenges of 2003 may be just too grand, and in fact unnecessary.

Saturday, May 24, 2008

More on Counting Restrictive Password Spaces

In a recent post I discussed how to use the Inclusion-Exclusion Principle (IEP) to count restrictive password spaces. In particular, the post considered how the number of passwords satisfying a restricted composition policy (at least one character that is lowercase L, uppercase U, numeric N, and a symbol S) reduced the size of the unrestricted password space (any combination of those character sets). In this post we will refer to this policy as ALO (at least one), and we examine the impact of two more restrictive policies using the same methods.

The first additional policy we consider is the Windows Server 2000/2003 (Win2KSvr) policy which requires at least 3 characters from the sets L, U, N and S. The second additional policy is from Sun and states that passwords must have at least 2 characters from L and U, as well as at least one character from N and S. The Win2KSvr policy is less restrictive than the ALO policy, while the Sun policy is substantially more restrictive than ALO.

For the impatient, the counting results are presented in the table below. For each password length n, 100% corresponds to (94)^n, the number of all possible passwords over the 94 printable ASCII characters (26 uppercase letters, 26 lowercase letters, 10 digits and 32 symbols). Each entry gives the percentage of that full password space which satisfies the stated policy, rounded to the nearest whole number.

Length   Win2KSvr   ALO   Sun
6        85         27    1
7        91         37    6
8        95         46    14
9        97         54    23
10       98         60    32
11       99         65    41
12       99         70    49
13       100        74    56
14       100        77    63
15       100        80    68
16       100        82    73
17       100        84    77
18       100        86    80
19       100        88    83
20       100        89    85
30       100        97    96
40       100        99    99
50       100        100   100

We see from the results that the Win2KSvr policy is not very restrictive when passwords are at least 6 characters (85% satisfy the policy), and at 8 characters, a common password length, 95% of passwords satisfy the policy. So this policy should not be very burdensome for users to follow. The Sun policy is very restrictive and is only satisfied by a mere 1% of length 6 passwords, and just 14% of length 8 passwords. Clearly users and administrators will need to make deliberate password selections to satisfy this policy.

These results may seem alarming, in that the restrictive policies are severely reducing the set of possible passwords. However we cannot assess the nature of the threat here without the full context. I would recommend reading Password Exhaustion: Predicting the End of Password Usefulness, a technical report from researchers at Pennsylvania State University, which provides an excellent review of password strength in the near future.

Read on to see how the numbers in the above table were calculated.


Using the IEP for counting

We are interested in counting the set of passwords that satisfy a given restrictive policy (filter). We will determine the size of the restrictive password set as follows:

  1. Define a collection of sets whose union fully characterizes the set of passwords that do not satisfy the restrictive policy.

  2. Use the IEP to compute the size of the union of the sets defined in Step 1.

  3. Subtract the result from Step 2 away from the number of unrestricted passwords (the full password space).

Windows Server 2000/2003 Policy

As expected, we start by defining the sets for Step 1. If a password does not satisfy the Win2KSvr policy then characters from at least two of L, U, N or S must be missing (not selected by the user). Let LU be the set of passwords that do not contain lowercase or uppercase letters. Similarly define LN, LS, UN, US and NS as sets of passwords that do not contain characters from the two stated sets from L, U, N and S. It should be clear that if a password fails to meet the Win2KSvr policy then it must be a member of at least one of the 6 sets LU, LN, LS, UN, US or NS. Equivalently, the union of these 6 sets characterizes passwords that are constructed from at most two of the sets L, U, N and S. Thus we have completed Step 1.

To apply the IEP we must determine the sizes of the intersections of these 6 sets taken individually, 2 at a time, 3 at a time, and so on, up to 6 at a time. All in all there are 2^6 - 1 = 63 intersections to consider, but thankfully most of them will have a size of zero.

To simplify notation, the intersection of LU and LN will be written as LUN, which denotes the set of passwords that do not contain a character from L, U or N (equivalently, the passwords must be all symbols). Similarly intersecting LU with NS will be denoted as LUNS and interpreted as a password that has no characters from L, U, N or S.

The LUNS set is empty as it is not possible to avoid all character sets, and therefore has size 0. It is easy to see that if we intersect 4 or more of LU, LN, LS, UN, US and NS then the intersection must be LUNS. Thus to compute the IEP we need only consider intersections of 3 or less sets. In the table below we show only the non-empty intersections.

Sets Intersected   Result   Sign   Size
LU                 LU       +      (94 - 26 - 26)^n
LN                 LN       +      (94 - 26 - 10)^n
LS                 LS       +      (94 - 26 - 32)^n
UN                 UN       +      (94 - 26 - 10)^n
US                 US       +      (94 - 26 - 32)^n
NS                 NS       +      (94 - 10 - 32)^n
LU, LN             LUN      -      (94 - 26 - 26 - 10)^n
LU, UN             LUN      -      (94 - 26 - 26 - 10)^n
LU, LS             LUS      -      (94 - 26 - 26 - 32)^n
LU, US             LUS      -      (94 - 26 - 26 - 32)^n
LN, UN             LUN      -      (94 - 26 - 26 - 10)^n
LN, LS             LNS      -      (94 - 26 - 10 - 32)^n
LN, NS             LNS      -      (94 - 26 - 10 - 32)^n
UN, US             UNS      -      (94 - 26 - 10 - 32)^n
UN, NS             UNS      -      (94 - 26 - 10 - 32)^n
LS, US             LUS      -      (94 - 26 - 26 - 32)^n
LS, NS             LNS      -      (94 - 26 - 10 - 32)^n
US, NS             UNS      -      (94 - 26 - 10 - 32)^n
LU, LN, UN         LUN      +      (94 - 26 - 26 - 10)^n
LU, LS, US         LUS      +      (94 - 26 - 26 - 32)^n
LN, LS, NS         LNS      +      (94 - 26 - 10 - 32)^n
UN, US, NS         UNS      +      (94 - 26 - 10 - 32)^n

We see from the table above that when LU, LN, LS, UN, US and NS are intersected in pairs, at least 3 of the underlying character sets (L, U, N, S) are covered by the intersection. Thus LUN, LUS, LNS and UNS are the only non-empty intersections produced by the IEP. For example, LUN is generated in the table 4 times, from the following intersections

  • Either intersecting pairs from LU, LN, UN, which can be done in the following 3 ways: (LU, LN), (LU, UN) or (LN, UN), or
  • Intersecting all three of LU, LN, UN, which can only be done in one way

So the net contribution of LUN to the count is 3*Size(LUN) subtracted (from the three pairwise intersections) and Size(LUN) added back (from the triple intersection), for 2*Size(LUN) subtracted overall. The same observation holds for LUS, LNS and UNS. Thus the computations in the previous table can be simplified as follows

Intersection   Sign   Size
LU             +      (94 - 26 - 26)^n
LN             +      (94 - 26 - 10)^n
LS             +      (94 - 26 - 32)^n
UN             +      (94 - 26 - 10)^n
US             +      (94 - 26 - 32)^n
NS             +      (94 - 10 - 32)^n
2*LUN          -      2*(94 - 26 - 26 - 10)^n
2*LUS          -      2*(94 - 26 - 26 - 32)^n
2*LNS          -      2*(94 - 26 - 10 - 32)^n
2*UNS          -      2*(94 - 26 - 10 - 32)^n

The table describes the computations that complete Step 2 as outlined above, and to complete Step 3 we need only subtract this value from the total number of passwords, which is 94^n.
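Under the policy these six sets encode (a password must draw characters from at least three of the four classes), the whole computation fits in a few lines. A sketch in Python, with function names of my own choosing, cross-checked by a weighted enumeration over class patterns rather than all 94^n strings:

```python
from itertools import product

L, U, N, S = 26, 26, 10, 32  # class sizes: 26 + 26 + 10 + 32 = 94

def bad_count(n):
    """|LU u LN u LS u UN u US u NS| via the simplified table above."""
    singles = ((94-L-U)**n + (94-L-N)**n + (94-L-S)**n +
               (94-U-N)**n + (94-U-S)**n + (94-N-S)**n)
    deeper = 2 * ((94-L-U-N)**n + (94-L-U-S)**n +
                  (94-L-N-S)**n + (94-U-N-S)**n)
    return singles - deeper

def good_count(n):
    # Step 3: subtract the bad passwords from the 94^n total.
    return 94**n - bad_count(n)

def brute_force(n):
    # Cross-check: enumerate the 4^n class patterns instead of all 94^n
    # strings; each pattern contributes the product of its class sizes.
    sizes = {"L": L, "U": U, "N": N, "S": S}
    total = 0
    for pattern in product("LUNS", repeat=n):
        if len(set(pattern)) >= 3:  # draws on at least 3 of the 4 classes
            weight = 1
            for c in pattern:
                weight *= sizes[c]
            total += weight
    return total
```

Both routes agree; for example, good_count(3) and brute_force(3) both give 270192.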

A Sun Policy

The Sun policy, as referenced by researchers at Pennsylvania State University, states that passwords must contain at least two lowercase and at least two uppercase letters, as well as at least one number and at least one symbol. In this case the judicious choice of sets for Step 1 is

  • L - the set of passwords that contain at most one lowercase letter

  • U - the set of passwords that contain at most one uppercase letter

  • N - the set of passwords that do not contain a number

  • S - the set of passwords that do not contain a symbol

The union of L, U, N and S is the set of passwords that fail at least one of the four requirements, and this represents all passwords that do not satisfy the stated Sun policy. The tricky counting comes with L and U. L denotes passwords that have at most one occurrence of a lowercase letter, which we decompose into the two cases of zero occurrences or exactly one occurrence. Counting zero occurrences is just (94 - 26)^n. To count exactly one occurrence in a length-n password we first select the position where the lowercase letter will occur (there are n such choices), then choose amongst 26 possible letters to assign to that position, and fill the remaining n-1 positions while excluding lowercase letters, which can be done in (94 - 26)^(n-1) ways. Thus there are

n*26*(94 - 26)^(n-1)

ways to have exactly one lowercase letter, and in total there are then

(94 - 26)^n + n*26*(94 - 26)^(n-1)

passwords with at most one lowercase letter. The same calculation applies to counting U. A trickier intersection to count is LU: passwords with at most one lowercase and at most one uppercase letter. The subcases are considered in the table below.

Occurrence of L?   Occurrence of U?   Size
No                 No                 (94 - 26 - 26)^n
Yes                No                 n*26*(94 - 26 - 26)^(n-1)
No                 Yes                n*26*(94 - 26 - 26)^(n-1)
Yes                Yes                n*(n-1)*(26^2)*(94 - 26 - 26)^(n-2)
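As a sanity check on the four subcases, here is a sketch that sums them and verifies the total by enumerating per-position class patterns (function and variable names are mine):

```python
from itertools import product

def lu_size(n):
    # |LU| for the Sun policy: at most one lowercase AND at most one
    # uppercase letter, summing the four subcases in the table above.
    other = 94 - 26 - 26  # the 42 characters that are not letters
    return (other**n                             # no lowercase, no uppercase
            + 2 * n * 26 * other**(n-1)          # exactly one of either case
            + n * (n-1) * 26**2 * other**(n-2))  # one lowercase and one uppercase

def lu_brute(n):
    # Weighted enumeration over class patterns: lower (26), upper (26), other (42).
    sizes = {"l": 26, "u": 26, "o": 42}
    total = 0
    for pattern in product("luo", repeat=n):
        if pattern.count("l") <= 1 and pattern.count("u") <= 1:
            weight = 1
            for c in pattern:
                weight *= sizes[c]
            total += weight
    return total
```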

Given these calculations, it is not difficult to create the full intersection table for the IEP, as shown below.

Intersection   Sign   Size
L              +      (94 - 26)^n + n*26*(94 - 26)^(n-1)
U              +      (94 - 26)^n + n*26*(94 - 26)^(n-1)
N              +      (94 - 10)^n
S              +      (94 - 32)^n
LU             -      (94 - 26 - 26)^n + 2*n*26*(94 - 26 - 26)^(n-1)
                        + n*(n-1)*(26^2)*(94 - 26 - 26)^(n-2)
LN             -      (94 - 26 - 10)^n + n*26*(94 - 26 - 10)^(n-1)
LS             -      (94 - 26 - 32)^n + n*26*(94 - 26 - 32)^(n-1)
UN             -      (94 - 26 - 10)^n + n*26*(94 - 26 - 10)^(n-1)
US             -      (94 - 26 - 32)^n + n*26*(94 - 26 - 32)^(n-1)
NS             -      (94 - 10 - 32)^n
LUN            +      (94 - 26 - 26 - 10)^n + 2*n*26*(94 - 26 - 26 - 10)^(n-1)
                        + n*(n-1)*(26^2)*(94 - 26 - 26 - 10)^(n-2)
LUS            +      (94 - 26 - 26 - 32)^n + 2*n*26*(94 - 26 - 26 - 32)^(n-1)
                        + n*(n-1)*(26^2)*(94 - 26 - 26 - 32)^(n-2)
LNS            +      (94 - 26 - 10 - 32)^n + n*26*(94 - 26 - 10 - 32)^(n-1)
UNS            +      (94 - 26 - 10 - 32)^n + n*26*(94 - 26 - 10 - 32)^(n-1)
LUNS           -      0   (empty for n >= 3: an all-letter password with at
                        most one lowercase and one uppercase has length at most 2)

Once again the table describes the computations that complete Step 2, and to complete Step 3 we need only subtract this value from the total number of passwords, which is 94^n.
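The whole table condenses into two reusable shapes: "at most one character from a class of size k, the rest drawn from the remaining characters", and the two-class LU shape. A sketch of the full Sun-policy count built on that observation (helper names are mine), verified against a weighted enumeration of class patterns, and valid for n >= 3 where LUNS is empty:

```python
from itertools import product

def atmost1(k, rest, n):
    # Length-n passwords over (k + rest) characters with at most one
    # character from the k-sized class: zero occurrences plus exactly one.
    return rest**n + n * k * rest**(n-1)

def atmost1_pair(k, rest, n):
    # At most one from each of two k-sized classes (the LU/LUN/LUS shape).
    return rest**n + 2*n*k*rest**(n-1) + n*(n-1)*k*k*rest**(n-2)

def sun_bad(n):
    # |L u U u N u S| from the intersection table, signs included.
    return (2 * atmost1(26, 68, n)      # L and U
            + 84**n + 62**n             # N and S
            - atmost1_pair(26, 42, n)   # LU
            - 2 * atmost1(26, 58, n)    # LN and UN
            - 2 * atmost1(26, 36, n)    # LS and US
            - 52**n                     # NS
            + atmost1_pair(26, 32, n)   # LUN
            + atmost1_pair(26, 10, n)   # LUS
            + 2 * atmost1(26, 26, n))   # LNS and UNS; LUNS contributes 0

def sun_good(n):
    # Passwords with >= 2 lowercase, >= 2 uppercase, >= 1 number, >= 1 symbol.
    return 94**n - sun_bad(n)

def sun_brute(n):
    # Cross-check via the 4^n weighted class patterns.
    sizes = {"l": 26, "u": 26, "n": 10, "s": 32}
    total = 0
    for p in product("luns", repeat=n):
        if (p.count("l") >= 2 and p.count("u") >= 2
                and p.count("n") >= 1 and p.count("s") >= 1):
            weight = 1
            for c in p:
                weight *= sizes[c]
            total += weight
    return total
```

Any conforming password needs at least 2 + 2 + 1 + 1 = 6 characters, so sun_good(n) is 0 for n < 6.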