DNS is a fundamental Internet service to resolve symbolic names into IP addresses (for example, DNS tells us that www.google.com is currently at IP address 209.05.129.99). While people are quite good at remembering symbolic names they are lousy at remembering IP addresses, and packets can't be routed using names like www.google.com. So DNS is the bridge between what people can remember and what Internet packets require.
When you attach your laptop to the Internet you are assigned a local (default) DNS server. As you enter new symbolic references to services, your laptop contacts your local DNS server for resolution, and your laptop stores the returned result in a local cache. If your local DNS server can't resolve your request then it passes it onto a "better informed" DNS server to for resolution. A resolution that involves a chain of DNS servers is called recursive. To your laptop the final resolution still looks like it was returned by your local DNS server. The trust between your laptop and your local DNA server is implicit - there is no formal authentication for resolution requests. What is returned by the local DNS server is accepted by your laptop as a correct resolution.
In 2002 it was announced that BIND, a common implementation of the DNS protocol, permitted DNS server resolutions to be falsified, called cache poisoning. The diagram below gives an overview of the attack. The goal of the attacker is to poison the cache of DNS server with false resolutions, and then poison the client cache on the next request to the DNS server.
There are many excellent articles that explain the DNS/BIND Birthday Attack and consider the following ones here and here. Both give details on the success of the birthday attack using the following formula for a collision (match) amongst n uniform samples from t possible values
But this collision formula is not quite correct. When n > t then a collision occurs with probability one, but the collision formula above will never be one since (1-1/t)^k is never zero for any value of k. What's the glitch?
Well n*(n-1)/2 is the number of possible pairings amongst n objects, each of which could yield a match. But instead of computing the match as an n-to-n problem, the collision formula computes the probability of a match as a 1-to-(n*(n-1)/2) problem. That is, the collision formula assumes that each of the potential matches from pairing n objects are independent. The collision formula therefore provides an underestimate on the correct probability, which is given as
The graph says that there is a 50% likelihood of a match around 300 samples (or DNS queries in this case), which seems fine since we expect a repeat at slightly more than sqrt(65536) = 256 samples, and the correct value is about 280. As we mentioned above the collision formula underestimates the correct probability, but for increasing n the error as compared to the correct value is decreasing. The yellow part of the graph shows the probability of the conventional spoofing attack where the attacker has a likelihood of n/65536 to match. In a conventional spoofing attack the attacker generates false replies to a single DNS request.