Sunday, August 24, 2008

Entropy Bounds for Traffic Confirmation

After quite a delay (since the initial idea) I have finally completed a good draft of some research on traffic confirmation attacks. Traffic analysis is a collection of techniques for inferring communication relationships without having access to the content that is being communicated. Traffic confirmation is a special subcase that attempts to infer communication relationships amongst users whose communications are mediated through an anonymity system.

We assume that the underlying anonymity system is a threshold MIX, with batch size B, supporting N participants, each of whom may act as both a sender and a receiver. Typical MIX system parameters are N = 20,000 participants (all potentially sending and receiving), and a batch of size B = 50 (the MIX accepts 50 messages, randomises their order, then delivers them). The MIX makes it difficult to correlate the 50 senders to the 50 receivers.

The traffic confirmation attack targets a specific user Alice and attempts to infer her M communication partners (whom she is sending to) by observing her communication patterns through the MIX. For example, Alice might be sending messages to M = 20 people on a regular basis.

The attacker observes when Alice sends a message in a batch, and can also observe who receives a message from that batch. This will include at least one of Alice's communication partners, but which one? The uncertainty that the attacker has with respect to the M communication partners of Alice is called the entropy of her partners.

It is well known that, given the properties of the MIX and the power of the adversary (what she can see), the entropy of Alice's partners decreases as more of the message batches that Alice participates in are observed (a higher entropy means more uncertainty). Eventually the entropy of her partners will go to zero and their identities will be known to the attacker. But how many such observations need to be made before the anonymity of Alice's communication partners is effectively zero?

My paper shows that after the attacker has observed approximately

M * log N

messages, subsequent messages drive the entropy to zero at a geometric rate. That is, the MIX system protects Alice's partners to a point but then essentially collapses from that point forward. So if the attacker is prepared to wait for (and observe) those first M * log N messages from Alice, then the uncertainty of her partners reduces rapidly as Alice sends additional messages. One reason that the attacker has to wait for that first group of messages to be sent by Alice is that she won't be able to uniquely determine Alice's communication partners until Alice has communicated with all of them at least once.

By way of an example, consider a system with N = 20,000 participants, a batch size of B = 50, and Alice having 20 communication partners. In this case, after observing 292 initial messages, the attacker can identify Alice's partners with 99% certainty after observing an additional 110 messages (that's 402 messages in total). So well before she has sent 400 messages, Alice should either change her partners or change her MIX system (probably the latter).
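As a sanity check on the example, the bound can be evaluated directly. Note the base of the logarithm is an assumption here (the paper's exact constant may differ); taking log base 2 lands in the same ballpark as the 292 initial messages quoted above:

```python
import math

def threshold(m, n):
    """Approximate number of observed messages (M * log2 N) after which
    the partner entropy starts collapsing at a geometric rate.
    Assumption: log base 2; the paper's precise constant may differ."""
    return m * math.log2(n)

# The example from this post: N = 20,000 participants, M = 20 partners.
t = threshold(20, 20_000)   # roughly 286, close to the quoted 292
```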

You can find the draft paper here. Comments welcome!

Tuesday, August 19, 2008

Quantum Computing: are you Shor?

Quantum computing ringing the imminent death knell of encryption is a familiar refrain in the popular press. The end of a dynasty is close at hand. A recent article in IT News is typical when it states that advances in quantum computing are likely to render "digital encryption techniques all but obsolete". Not quite.

Quantum computers are very different from the ones we have on our desktops now, usually referred to as classical computing devices. A potential misconception is to think that quantum computers will teleport us to a future where the benefits of Moore's Law compounded over several centuries can be reaped. Intel's leading chip line, the x86, has steadily progressed from 286, 386, 486, Pentium and so on, but quantum computers will not be "1000-86" devices - unimaginably faster versions of what we have today.

Quantum computers are built from different computational primitives than our present devices. It is difficult to explain these differences informally, but the main idea is that quantum computers can operate in many states simultaneously, whereas classical computers are built from components that are interpreted to operate in one of two states - 0 or 1, on or off.

The massive parallelism furnished by quantum logic cannot as yet be used to solve general computational problems. A specific quantum algorithm must be tailored for each problem class of interest - such algorithms are more a bespoke than an off-the-shelf technology. The quantum algorithm that has received the most attention with respect to security was devised by Peter Shor, and shows that factoring can be done efficiently on a quantum computer. When you read that current encryption methods will be rendered useless by quantum computation, the authors are usually referring to Shor's algorithm.

Shor's Algorithm

How does Shor's algorithm work? Well, it uses a quantum subroutine to solve a particular equation whose solution is likely to factor a number. For the last 30 years or so, most modern factoring algorithms have been converging on a single idea, which is the following. Say that N = P*Q is the number that you wish to factor. Assume that you can find two numbers X and Y such that

X^2 = Y^2 (mod N)

which means that the squares of X and Y leave the same remainder when divided by N. It then follows that

X^2 - Y^2 = 0 (mod N)

(X - Y)(X + Y) = 0 (mod N)

so that N divides (X - Y)(X + Y). It can be shown that, for a randomly generated such pair, there is at least a 50% chance that X ≠ ±Y (mod N), in which case P and Q divide different factors, so each of (X - Y) and (X + Y) shares a nontrivial factor with N. If this is the case, then N can be factored by computing gcd(X - Y, N) and gcd(X + Y, N).


Let's call X and Y a magic pair, and if we can generate enough magic pairs for a given N then we can factor it (think of flipping a coin and waiting for it to turn up heads). So given that factoring is generally believed to be difficult then generating magic pairs must be difficult as well. Most modern factoring algorithms are in fact clever ways to generate magic pairs. While these methods are much faster than brute force approaches, they are far from being efficient.
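The identity is easy to check numerically. The sketch below brute-forces a magic pair for a tiny N, which is only feasible for toy numbers (real factoring algorithms generate magic pairs far more cleverly); the function name is my own, not from the paper or any library:

```python
from math import gcd

def find_magic_pair(n):
    """Brute-force search for X, Y with X^2 = Y^2 (mod N) and
    X != +/-Y (mod N); illustration only, infeasible for large N."""
    for x in range(2, n):
        for y in range(1, x):
            if (x * x - y * y) % n == 0 and (x - y) % n and (x + y) % n:
                return x, y
    return None

n = 91                       # 7 * 13
x, y = find_magic_pair(n)    # finds X = 10, Y = 3: 100 = 9 (mod 91)
factors = sorted((gcd(x - y, n), gcd(x + y, n)))   # [7, 13]
```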

The breakthrough from Shor was to devise an efficient quantum method for generating magic pairs, which therefore also makes factoring efficient. He did this by reducing the problem of finding magic pairs to what is called computing orders.

Let Z be a number such that 1 < Z < N, and gcd(N, Z) = 1 (that is, a number less than N and relatively prime to N). When these conditions hold, there is a smallest number K < N such that

Z^K = 1 (mod N)

This value of K is called the order of Z (modulo N). If K is even, say equal to 2*D, then we have that

Z^(2*D)  =  (Z^D)^2  =  1 (mod N).

But we can take the square root of both sides of the equation

(Z^D)^2 = 1^2 (mod N)

to yield a magic pair by setting X = Z^D and Y = 1 (provided Z^D ≠ ±1 mod N, so that the pair is nontrivial). So finding a number with even order is essentially the same as finding a magic pair.

Shor invented a brilliant quantum algorithm for efficiently computing orders (the details of which are too complicated to reproduce or even summarise here). Then to factor a number N, just generate a random number Z and quantum compute its order. If the order is even, then create a magic pair and try to factor N; if N does not factor, or the order is not even, then choose another Z and repeat. You won't have to do too many iterations before N will be factored with high probability.
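The overall loop can be sketched classically, substituting brute-force order finding for the quantum subroutine (that substitution is the whole point - it is the only step a quantum computer speeds up). Function names are mine, for illustration:

```python
import random
from math import gcd

def find_order(z, n):
    """Classical brute-force order finding: the step Shor's algorithm
    replaces with an efficient quantum subroutine. Assumes gcd(z, n) = 1."""
    k, acc = 1, z % n
    while acc != 1:
        acc = acc * z % n
        k += 1
    return k

def shor_style_factor(n, attempts=50, seed=0):
    """Shor's reduction, with the quantum part faked classically."""
    rng = random.Random(seed)
    for _ in range(attempts):
        z = rng.randrange(2, n - 1)
        g = gcd(z, n)
        if g > 1:
            return g            # lucky: z already shares a factor with n
        k = find_order(z, n)
        if k % 2:
            continue            # odd order: pick another z
        x = pow(z, k // 2, n)   # X = Z^D, Y = 1
        if x == n - 1:
            continue            # Z^D = -1 (mod N): trivial magic pair
        g = gcd(x - 1, n)
        if 1 < g < n:
            return g
    return None

print(shor_style_factor(91))    # a nontrivial factor: 7 or 13
```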

So it is not so much that factoring itself is solved easily on a quantum computer, but rather that computing orders is easily solved on a quantum computer and factoring can be reduced to computing orders. Any cryptosystem whose security can be reduced to computing orders will also be insecure. Unfortunately this is the case for most well-known public key cryptosystems, including RSA, discrete logarithm based systems like Diffie-Hellman and DSA, as well as elliptic curve systems. It is easy to see how Shor's algorithm quickly becomes a the-sky-is-soon-to-be-falling story line in the press.

However, there are public key systems that are not impacted by Shor's algorithm. One such scheme is the McEliece cryptosystem whose security is derived from intractable problems in coding theory, which has nothing to do with computing orders. The McEliece cryptosystem is not commonly deployed but perhaps its day will come.

Quantum Speculation 

We remarked at the beginning of this post that quantum computing is not a technology for cheating Moore's Law - quantum computers are not general devices that will bring with them exponential improvements over what we have today. We need to craft individual quantum algorithms for each class of problem that we want to solve rapidly on a quantum platform. Computing orders is one such general problem whose quantum solution has dire implications for most of today's common public key systems.

The quantum future that brings the end of the RSA dynasty is quite a few decades away, if not more. Your great-grandchildren's RSA keys may be under threat, if we are still using RSA then. You may also have some reason for concern if you are encrypting or signing data with public keys that you are assuming will be secure for the next 50 years or so.

Today from a risk perspective quantum computers are a low-probability high-impact event - a coming catastrophe for many extant public key systems. In 100 years quantum computers are likely to be a high-probability high-impact event - that is, a commonplace tool being used for better or worse.

Our RSA keys are safe for some time yet, until quantum computation matures, and it may even turn out that large scale implementations with many quantum logic gates are difficult to engineer. Certain public key systems will survive even sophisticated quantum computer deployments in any case, unless as yet undiscovered specific quantum algorithms can be tailored to these systems as well. One can speculate endlessly.

Friday, August 8, 2008

Long Tail of Vulnerability for A5/1

Stream ciphers are a special class of cipher, often used for fast encryption of data streams such as dedicated network links or fax lines. Stream ciphers produce a key stream that is simply added (XOR-ed) with the data stream to produce ciphertext, and as such can achieve high encryption rates especially if implemented in hardware.
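The XOR construction has the nice property that encryption and decryption are the same operation. A minimal sketch, using a seeded PRNG as a stand-in keystream (a real stream cipher such as A5/1 derives the keystream from a secret key instead):

```python
import random

def keystream(seed, length):
    """Toy keystream from a seeded PRNG; stand-in for a real
    stream cipher's key-dependent output."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(length))

def xor_bytes(stream, data):
    """XOR-ing the keystream encrypts; XOR-ing again decrypts."""
    return bytes(s ^ d for s, d in zip(stream, data))

plaintext = b"GSM frame payload"
ks = keystream(42, len(plaintext))
ciphertext = xor_bytes(ks, plaintext)
recovered = xor_bytes(ks, ciphertext)   # same keystream recovers the data
```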

Below is a diagram of the A5/1 stream cipher, invented in 1987 to protect GSM voice and data traffic. A5/1 loads a 64-bit secret key into its 3 shift registers, which are then clocked and combined with a frame vector to produce an initial state. A5/1 is then clocked as required to produce keystream bits to encrypt the next 114 bit GSM frame.
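The distinctive feature in that diagram is A5/1's irregular "majority" clocking of the three registers. The sketch below models it; the register lengths, feedback taps and clocking-bit positions follow the commonly published description of A5/1, but bit ordering and the key/frame loading procedure are simplified away, so this is an illustration rather than a drop-in implementation:

```python
class LFSR:
    """Simplified model of one A5/1 linear feedback shift register."""
    def __init__(self, length, taps, clock_tap, state=None):
        self.taps = taps
        self.clock_tap = clock_tap
        self.state = state or [0] * length

    def clock(self):
        feedback = 0
        for t in self.taps:
            feedback ^= self.state[t]
        self.state = [feedback] + self.state[:-1]

def majority(a, b, c):
    return (a & b) | (a & c) | (b & c)

def keystream_bits(r1, r2, r3, n):
    """Majority-rule clocking: each step, only the registers whose
    clocking bit agrees with the majority of the three are stepped."""
    out = []
    for _ in range(n):
        m = majority(r1.state[r1.clock_tap],
                     r2.state[r2.clock_tap],
                     r3.state[r3.clock_tap])
        for r in (r1, r2, r3):
            if r.state[r.clock_tap] == m:
                r.clock()
        out.append(r1.state[-1] ^ r2.state[-1] ^ r3.state[-1])
    return out

# Registers seeded with arbitrary patterns in place of the real
# key/frame initialisation (omitted here).
r1 = LFSR(19, (13, 16, 17, 18), 8, [i % 2 for i in range(19)])
r2 = LFSR(22, (20, 21), 10, [(i + 1) % 2 for i in range(22)])
r3 = LFSR(23, (7, 20, 21, 22), 10, [(i * i) % 2 for i in range(23)])
frame_keystream = keystream_bits(r1, r2, r3, 114)   # one GSM frame
```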

For over 10 years, theoretical attacks on A5/1 have been postulated that lower the cost of attack significantly compared with searching the 64-bit key space directly. Amongst cryptographers and mobile security people, A5/1 is considered insecure and its 64-bit key size woefully inadequate by today's standards.

In February David Hulton, director of applications for the high-performance computing company Pico, and Steve Muller, a researcher for mobile security firm CellCrypt, announced new results which claim that A5/1 keys can be recovered in 30 minutes for $1000. The attack is both passive (no network injection or basestation masquerading) and can be executed remotely (no proximity to the target device required). Further, Hulton & Muller (H&M) are patenting their optimizations and intend to sell an A5/1 key recovery tool commercially. Cell call snooping may soon be affordable by many groups and people, not just well-resourced intelligence agencies.

The attack is based on pre-computing rainbow tables that enable the direct "lookup" of A5/1 keys when indexed by a small amount of observed traffic (3 to 4 GSM frames). While it remains impractical to create a rainbow table that has one entry for each of the possible 2^(64) A5/1 keys, H&M have devised a method that only requires a table of size 2^(58). Dedicated FPGA devices are currently being deployed to speed up the required rainbow table computations, and are expected to be completed by the end of March. By way of comparison, it is estimated that attempting the same computation on a lone PC would take 33 thousand years.
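The time-memory trade-off behind a rainbow table can be shown at toy scale. The sketch below uses a hash in place of A5/1 and a tiny 2^16 key space (assumptions for illustration only; the parameters, function names and the simple additive reduction function are mine, not H&M's):

```python
import hashlib

SPACE = 2 ** 16          # toy key space; real A5/1 keys live in 2^64
CHAIN_LEN = 64

def toy_cipher(key):
    """Stand-in for A5/1: maps a key to the observable keystream."""
    return hashlib.sha256(key.to_bytes(8, "big")).digest()

def reduce_fn(output, col):
    """Column-dependent reduction back into the key space; a different
    reduction per column is what makes the table 'rainbow'."""
    return (int.from_bytes(output[:8], "big") + col) % SPACE

def build_table(n_chains):
    """Precompute chains, storing only endpoint -> chain starts."""
    table = {}
    for start in range(n_chains):
        k = start
        for col in range(CHAIN_LEN):
            k = reduce_fn(toy_cipher(k), col)
        table.setdefault(k, []).append(start)
    return table

def lookup(table, observed):
    """Guess each column the key could occupy, walk to the chain's
    endpoint, then regenerate matching chains to recover the key."""
    for col in range(CHAIN_LEN - 1, -1, -1):
        k = reduce_fn(observed, col)
        for c in range(col + 1, CHAIN_LEN):
            k = reduce_fn(toy_cipher(k), c)
        for start in table.get(k, []):
            cand = start
            for c in range(CHAIN_LEN):
                if toy_cipher(cand) == observed:
                    return cand
                cand = reduce_fn(toy_cipher(cand), c)
    return None

table = build_table(2000)                 # small precomputation phase
secret_key = 1234
recovered = lookup(table, toy_cipher(secret_key))
```

The table stores only chain endpoints, so memory is traded for the extra online computation in lookup - the same trade H&M push to practical scale with FPGAs and terabytes of storage.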

The resulting rainbow tables will be several terabytes in size, and will be made available to the public. H&M also intend to sell a fast cracking version of the software that can break GSM encryption in just 30 seconds, charging between $200,000 and $500,000 for this privilege. Perhaps they will also offer a service that recovers GSM keys from submitted traffic samples, giving rise to an instance of cryptanalysis-as-a-service (CaaS).

While researchers and security professionals view the H&M approach as cheaper and more efficient than previous attacks, the attack itself comes as no surprise given the known weaknesses of A5/1. The practicality of the H&M approach should be the final piece of evidence to convince operators to move off A5/1 as a basis for secure and private GSM communication. David Pringle, a spokesperson for the GSM Association (GSMA), has declined to comment on the H&M attack until further details are known.

A5/1 has operated unchanged for the last 21 years but it has now reached its cryptographic end-of-life, engulfed by the march of Moore's Law. However, the operational end-of-life of A5/1 may still be decades away as there are approximately 2 billion GSM subscribers, commanding about 80% of the global mobile market. This would be a tough product recall indeed. A5/1 is well-positioned to become the NT of the mobile crypto world, and I see the makings of a long tail of GSM vulnerability.

You can find the research used to produce this post as a FreeMind mindmap rendered into Flash here.

