Saturday, May 24, 2008

More on Counting Restrictive Password Spaces

In a recent post I discussed how to use the Inclusion-Exclusion Principle (IEP) to count restrictive password spaces. In particular, the post considered how the number of passwords satisfying a restricted composition policy (at least one character that is lowercase L, uppercase U, numeric N, and a symbol S) reduced the size of the unrestricted password space (any combination of those character sets). In this post we will refer to this policy as ALO (at least one), and we examine the impact of two more restrictive policies using the same methods.

The first additional policy we consider is the Windows Server 2000/2003 (Win2KSvr) policy which requires at least 3 characters from the sets L, U, N and S. The second additional policy is from Sun and states that passwords must have at least 2 characters from L and U, as well as at least one character from N and S. The Win2KSvr policy is less restrictive than the ALO policy, while the Sun policy is substantially more restrictive than ALO.

For the impatient, the counting results are presented in the table below. For each password length n, 100% corresponds to (94)^n, denoting all possible passwords based on the 94 printable ASCII characters (26 uppercase characters, 26 lowercase characters, 10 numbers and 32 symbols). The numbers in the table denote reduction percentages of the password space due to the stated policy restrictions, and have been rounded to the nearest whole number.

Length Win2KSvr
ALO Sun
6 85 27 1
7 91 37 6
8 95 46 14
9 97 54 23
10 98 60 32
11 99 65 41
12 99 70 49
13 100 74 56
14 100 77 63
15 100 80 68
16 100 82 73
17 100 84 77
18 100 86 80
19 100 88 83
20 100 89 85
30 100 97 96
40 100 99 99
50 100 100 100

We see from the results that the Win2KSvr policy is not very restrictive when passwords are at least 6 characters (85% satisfy the policy), and at 8 characters, a common password length, 95% of passwords satisfy the policy. So this policy should not be very burdensome for users to follow. The Sun policy is very restrictive and is only satisfied by a mere 1% of length 6 passwords, and just 14% of length 8 passwords. Clearly users and administrators will need to make deliberate password selections to satisfy this policy.

These results may seem alarming, in that the restrictive policies are severely reducing the set of possible passwords. However we cannot assess the nature of the threat here without the full context. I would recommend to read Password Exhaustion: Predicting the End of Password Usefulness, a technical report from researchers at Pennsylvania State University, which provides an excellent review of password strength in the near future.

Read on to see how the numbers in the above table were calculated.


Using the IEP for counting

We are interested in counting the set of passwords that satisfy a given restrictive policy (filter). We will determine the size of the restrictive password set as follows:

  1. Define a collection of sets whose union fully characterizes the set of passwords that do not satisfy the restrictive policy.

  2. Use the IEP to compute the size of the union of the sets defined in Step 1.

  3. Subtract the result from Step 2 away from the number of unrestricted passwords (the full password space).

Windows Server 2000/2003 Policy

As expected, we start by defining the sets for Step 1. If a password does not satisfy the Win2kSvr policy then characters from at least two of L, U, N or S must be missing (not selected by the user). Let LU be the set of passwords that do not contain lowercase or uppercase letters. Similarly define LN, LS, UN, US and NS as sets of passwords that do not contain characters from the two stated sets from L, U, N and S. It should be clear that if a password fails to meet the Win2kSvr policy then it must be a member of at least one of the 6 sets LU, LN, LS, UN, US or NS . Equivalently, the union of these 6 sets characterizes passwords that are constructed from at most two of the sets L, U, N and S. Thus we have completed Step 1.

To apply the IEP we must determine the size of intersecting these 6 sets taken individually, 2 at a time, 3 at a time and so on, up to 6 at a time. All in all there are 63 = 2^6-1 intersections to consider, but thankfully most of these intersections will have a size of zero.

To simplify notation, the intersection of LU and LN will be written as LUN, which denotes the set of passwords that do not contain a character from L, U or N (equivalently, the passwords must be all symbols). Similarly intersecting LU with NS will be denoted as LUNS and interpreted as a password that has no characters from L, U, N or S.

The LUNS set is empty as it is not possible to avoid all character sets, and therefore has size 0. It is easy to see that if we intersect 4 or more of LU, LN, LS, UN, US and NS then the intersection must be LUNS. Thus to compute the IEP we need only consider intersections of 3 or less sets. In the table below we show only the non-empty intersections.

Set Intersection Sign Size
LU LU + (94 - 26 - 26)^n
LN LN + (94 - 26 - 10)^n
LS LS + (94 - 26 - 32)^n
UN UN + (94 - 26 - 10)^n
US US + (94 - 26 - 32)^n
NS NS + (94 - 10 - 32)^n
LU, LN LUN - (94 - 10 - 32)^n
LU, UN LUN - (94 - 26 -26 - 10)^n
LU, LS LUS - (94 - 26 -26 - 32)^n
LU, US LUS - (94 - 26 -26 - 32)^n
LN, UN LUN - (94 - 26 -26 - 10)^n
LN, LS LNS - (94 - 26 -10 - 32)^n
LN, NS LNS - (94 - 26 -10 - 32)^n
UN, US UNS - (94 - 26 -10 - 32)^n
UN, NS UNS - (94 - 26 -10 - 32)^n
LS, US LUS - (94 - 26 -26 - 32)^n
LS, NS LNS - (94 - 26 -10 - 32)^n
US, NS UNS - (94 - 26 -10 - 32)^n
LU, LN, UN LUN + (94 - 26 -26 - 10)^n
LU, LS, US LUS + (94 - 26 -26 - 32)^n
LN, LS, NS LNS + (94 - 26 -10 - 32)^n
UN, US, NS UNS + (94 - 26 -10 - 32)^n

We see from the table above that when LU, LN, LS, UN, US and NS are intersected in pairs then at least 3 of the underlying character sets (L, U, N, S) are included, or covered, by the intersection. Thus LUN, LUS, LNS and UNS are the only non-zero intersections produced by the IEP. For example, LUN is generated in the table 4 times from in the following intersections

  • Either intersecting pairs from LU, LN, UN ,which can be done in the following 3 ways (LU, LN), (LU, UN) or (LN, UN), or
  • Intersecting all three of LU, LN, UN, which can only be done in one way

So the contribution of LUN to the password count is the 3*Size(LUN) - Size(LUN) = 2*Size(LUN). The same observation is true of LUS, LNS and UNS. Thus the computations in the previous table can be simplified as follows

Intersection Sign Size
LU + (94 - 26 - 26)^n
LN + (94 - 26 - 10)^n
LS + (94 - 26 - 32)^n
UN + (94 - 26 - 10)^n
US + (94 - 26 - 32)^n
NS + (94 - 10 - 32)^n
2*LUN - 2*(94 - 26 -26 - 10)^n
2*LUS - 2*(94 - 26 -26 - 32)^n
2*LNS - 2*(94 - 26 -10 - 32)^n
2*UNS - 2*(94 - 26 -10 - 32)^n

The table describes the computations to complete Step 2 as outlined above, and to complete Step 3 we need only subtract this value from the total number of passwords which is (94)^n.

A Sun Policy

The Sun policy, as referenced by researchers at Pennsylvania State University, states that passwords must have at least 2 lowercase and uppercase characters, as well as at least one character that is numeric and a symbol. In our case the judicious choice of sets for Step 1 is

  • L - the set of passwords that contain at most one lowercase letter

  • U - the set of passwords that contain at most one uppercase letter

  • N - the set of passwords that do not contain a number

  • S - as the set of passwords that do not contain a symbol

The union of L, U, N and S is the set of passwords that has at least one of the four character sets missing, and this represents all passwords that do not satisfy the stated Sun policy. The tricky counting comes for L and U. L denotes passwords that have at most one occurrence of a lower case letter, which we decompose into the two cases of zero occurrences or exactly one occurrence. Counting zero occurrences is just (94 - 26)^n. To count exactly one occurrence in a length n password we first select the position where the lower case letter will occur (there are n such choices), then choose amongst 26 possible letters to assign in the position, and select the remaining n-1 characters to exclude lower case letters, which can be done in (94 - 26)^(n-1) ways. Thus there are

n*26*(96 - 26)^(n-1)

ways to have exactly one lowercase letter, and in total there are then

(94 - 26)^n + n*26*(96 - 26)^(n-1)

passwords with at most one lowercase letter. The same calculation applies to counting U. An trickier intersection case to count is LU, or passwords that have at most one occurrence of both L and U. The subcases are considered in the table below.

Occurrence of L? Occurrence of U? Size
No No (94 - 26 -26)^n
Yes No n*26*(96 - 26 -26)^(n-1)
No Yes n*26*(96 - 26 -26)^(n-1)
Yes Yes n*(n-1)*(26^2)
*(94 - 26 - 26)^(n-2)

Give these calculations, it is not difficult to created the full intersection table for the IEP as shown below.

Intersection Sign Size
L + (94 - 26)^n
+ n*26*(96 - 26)^(n-1)
U + (94 - 26)^n
+ n*26*(96 - 26)^(n-1)
N + (94 - 10)^n
S + (94 - 32)^n
LU - (94 - 26 -26)^n
+ 2*n*26*(96 - 26 -26)^(n-1)
+ n*(n-1)*(26^2)*(94 - 26 - 26)^(n-2)
LN - (94 - 26 -10)^n
+ n*26*(96 - 26 - 10)^(n-1)
LS - (94 - 26 -32)^n
+ n*26*(96 - 26 - 32)^(n-1)
UN - (94 - 26 -10)^n
+ n*26*(96 - 26 - 10)^(n-1)
US - (94 - 26 -32)^n
+ n*26*(96 - 26 - 32)^(n-1)
NS - (94 - 10 - 32)^n
LUN + (94 - 26 - 26 - 10)^n
+ 2*n*26*(96 - 26 - 26 -10)^(n-1)
+ n*(n-1)*(26^2)*(94 - 26 - 26 - 10)^(n-2)
LUS + (94 - 26 - 26 - 32)^n
+ 2*n*26*(96 - 26 - 26 -32)^(n-1)
+ n*(n-1)*(26^2)*(94 - 26 - 26 - 32)^(n-2)
LNS + (94 - 26 - 10 -32)^n
+ n*26*(96 - 26 - 10 - 32)^(n-1)
UNS + (94 - 26 - 10 -32)^n
+ n*26*(96 - 26 - 10 - 32)^(n-1)
LUNS - 0

The table describes the computations to complete Step 2 as outlined above, and to complete Step 3 we need only subtract this value from the total number of passwords which is (94)^n.