Sunday, June 14, 2009

Enterprise Password Management Guidelines from NIST

Those industrious people over at NIST have produced another draft publication in the SP-800 series on Guidelines for Enterprise Password Management. At 38 pages it will be a slim addition to your already bulging shelf of NIST reports. The objective of the report is to provide recommendations for password management, including “the process of defining, implementing, and maintaining password policies throughout an enterprise”. The report is consistently sensible and, in places, quite sage. Overall the message is that passwords are complex to manage effectively, and you will need to spend considerable time and effort on covering all the bases. My short conclusion is that a CPO – a Chief Password Officer – is required.

Let’s begin with a definition. A password is “a secret (typically a character string) that a claimant uses to authenticate its identity”. The definition includes the shorter PIN variants, and the longer passphrase variants of passwords. Passwords are a factor of authentication, and as is well-known, not a very strong factor when used in isolation. Better management of the full password lifecycle can reduce the risks of security exposures from password failures. This NIST document will help your enterprise get there.

Storage and Transmission

NIST begins by discussing password storage and transmission, since enforcing more stringent password policies on users is counterproductive if those passwords are not adequately protected while in flight and at rest. Web browsers, email clients, and other applications may store user passwords for convenience, but this may not be done in a secure manner. There is an excellent article on Security Focus by Mikhael Felker from 2006 on password storage risks for IE and Firefox. In general, applications that store passwords and automatically enter them on behalf of a user make unattended desktops more attractive to opportunistic data thieves in the workplace, for example. Further, as noted in the recent Data Breach Investigations Report (DBIR) from Verizon, targeted malware is not just extracting passwords from disk locations but directly from RAM and other temporary storage locations. From page 22 of the DBIR: “the transient storage of information within a system’s RAM is not typically discussed. Most application vendors do not encrypt data in memory and for years have considered RAM to be safe. With the advent of malware capable of parsing a system’s RAM for sensitive information in real-time, however, this has become a soft-spot in the data security armour”.

As NIST observes, many passwords and password hashes are transmitted over internal and external networks to provide authentication capabilities between hosts, and the main threat to such transmissions is sniffing. Sniffers today are quite sophisticated, capable of extracting unencrypted usernames and passwords sent by common protocols such as Telnet, FTP, POP and HTTP. NIST states that mitigating sniffing is relatively easy, beginning with encrypting traffic at the network layer (VPN) or at the transport layer (SSL/TLS). A more advanced mitigation is to use network segregation and fully switched networks to protect passwords transmitted on internal networks. But let’s not forget that passwords can also be captured at source by key loggers and other forms of malware, as noted by NIST and the DBIR.
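As a small illustration of the transport-layer option, here is a minimal Python sketch (the host name and credentials are placeholders of mine) contrasting plaintext POP3, where a sniffer can read the USER/PASS commands off the wire, with POP3 wrapped in SSL/TLS:

```python
import poplib

HOST = "mail.example.com"   # placeholder host name

# Plaintext POP3 on port 110: the USER/PASS commands cross the network unencrypted,
# so any sniffer on the path can harvest them.
# conn = poplib.POP3(HOST, 110)

# POP3 over SSL/TLS on port 995: the whole session, credentials included, is encrypted.
conn = poplib.POP3_SSL(HOST, 995)
conn.user("alice")                          # placeholder credentials
conn.pass_("correct horse battery staple")
print(conn.stat())                          # (message count, mailbox size in bytes)
conn.quit()
```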

Guessing and Cracking

NIST then moves on to a discussion of password guessing and cracking. By password guessing NIST means online attacks on a given account, while password cracking is defined as attempting to invert an intercepted password hash in offline mode. Password guessing is further subdivided into brute force attacks and improved dictionary attacks. The main mitigation against guessing attacks is mandating appropriate password length and complexity rules, and reducing the number of possible online guessing attempts. Restricting the number of guesses to a small number like 5 or so is not a winning strategy.

The main mitigation against password cracking is to increase the effort required of the attacker by using salting and stretching. Salting increases the amount of storage needed to invert password hashes by pre-computation (for example using rainbow tables), while stretching increases the time to compute a password guess. Stretching is not a standard term as far as I know, and it is more commonly referred to as the iteration count, or more simply, as password spin.
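To make the two ideas concrete, here is a minimal Python sketch of salting and stretching using the standard library’s PBKDF2 routine; the 16-byte salt and 100,000 iterations are illustrative choices of mine, not figures from the NIST draft:

```python
import hashlib
import hmac
import os

def hash_password(password: str, iterations: int = 100_000):
    """Return (salt, derived_key) for storage alongside the account."""
    salt = os.urandom(16)   # salting: a per-user random value defeats precomputed rainbow tables
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)  # stretching: iterate the hash
    return salt, key

def verify_password(password: str, salt: bytes, stored_key: bytes,
                    iterations: int = 100_000) -> bool:
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(key, stored_key)   # constant-time comparison

salt, key = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, key))   # True
```

Each extra bit of iteration count multiplies the cracker’s per-guess cost, which is exactly the “stretching” effect described above.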

Complexity

Next is a discussion of password complexity, and the number of possible passwords under various length and composition rules, as shown in the table below. Such computations are common, but security people normally take some pleasure in seeing them recomputed. NIST observes that in general the number of possible passwords increases more rapidly with longer lengths than with additional character sets.

[Image: table of password space sizes for various length and composition rules]
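For those who want to join in the recomputation fun, here is a quick Python sketch using the usual character-set sizes (26 lower case, 52 mixed case, 62 alphanumeric, 95 printable ASCII); the particular lengths are my own choice:

```python
# Number of possible passwords of exactly length L drawn from an alphabet of C characters.
charsets = {"lower case only": 26, "mixed case": 52, "alphanumeric": 62, "printable ASCII": 95}

for name, c in charsets.items():
    for length in (8, 10, 12):
        print(f"{name:>16}, length {length}: {c ** length:.2e} possible passwords")
```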

Of course, the table above does not take into account user bias in password selection. A large fraction of a password space may effectively contain zero passwords that will be selected by a user. And password cracking tools, like the recently upgraded LC6, made their name on that distinction. NIST briefly mentions the issue of password entropy but not in any systematic manner. There is a longer discussion on password entropy in another NIST publication. NIST does suggest several heuristics for strong password selection and more stringent criteria for passwords chosen by administrators.

Get a CPO

The NIST guidelines go on to discuss some strategies for password reset and also provide an overview of existing enterprise password solutions. A useful glossary is presented as well. Overall the issues surrounding password management are complex and involved, and NIST gives good guidance on the main issues. It would appear that large companies which run big heterogeneous IT environments will require the services of a CPO – Chief Password Officer – to keep their password management under control and within tolerable risk limits.

Saturday, June 13, 2009

How to Choose a Good Chart

There is a nice 1-page guide to chart selection on Scribd as shown below. Seriously, I can't emphasize enough what a resource I find Scribd to be.

This fantastic chart, "Choosing a Good Chart", was produced by Andrew Abela. Here's a link to the original post from 2006 where he debuted his creation. The whole site is worth checking out:
http://extremepresentation.typepad.com/blog/2006/09/choosing_a_good.html

Wednesday, June 10, 2009

Paper Now Available - The Cost of SHA-1 Collisions reduced to 2^{52}

Australian researchers Cameron McDonald, Philip Hawkes and Josef Pieprzyk recently announced at the Eurocrypt 2009 conference a new attack to find collisions in SHA-1 requiring only 2^{52} operations. A preliminary version of the paper is now available here on the eprint service of the IACR.

This new result decreases the cost of a collision attack by a factor of over 2000 compared to previous methods (the best previous attacks required on the order of 2^{63} operations, and 2^{63}/2^{52} = 2^{11} = 2048). The researchers note that “practical collisions are within resources of a well funded organisation”. An article in The Register provides some more background.

Tuesday, June 9, 2009

The Long Tail of Life

In performing risk assessments we are often asked (or required) to make estimations of values. Typically once a risk is identified it needs to be rated for likelihood (how often) and severity (how bad). The ratings may be difficult to make in the absence of data or first-hand experience with the risk. We often therefore rely on “guesstimates”, calibrated against similar estimates by other colleagues or peers.

Here is a question to exercise your powers of estimation.

I recently came across an article by Carl Haub of the US Population Reference Bureau which seeks to answer the following question - How many people have ever lived? Put another way, over all time, how many people have been born? Haub says that he is asked this question frequently, and apparently there is something of an urban legend in population circles which maintains that 75% of all people who had lived were living in the 1970s. This figure sounds plausible to the lay person since we believe most aspects of the 20th century are characterised by exponential growth.

Haub sought to debunk this statement with an informed estimate. He observes that any estimate of the total number of people who have ever been born will depend basically on two factors: (1) the length of time humans are thought to have been on Earth and (2) the average size of the human population at different periods. Haub assumes that people appeared about 50,000 years ago, and from then till now he creates ten epochs (benchmarks) characterized by different birth rates.

[Image: Haub's table of population benchmarks and assumed birth rates]

The period from 50,000 B.C. till 8,000 B.C., the dawn of agriculture, is a long struggle. Life expectancy at birth probably averaged only about 10 years for this period, and therefore for most of human history. Infant mortality is thought to have been very high — perhaps 500 infant deaths per 1,000 births, or even higher. By 1 A.D. the population had risen to 300 million, which represents a meagre growth rate of only 0.0512 percent per year.
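The growth rate is easy to sanity-check. Here is a short Python sketch, assuming Haub's benchmark figure of roughly 5 million people at 8,000 B.C. (a number not quoted above, so treat it as my assumption):

```python
# Implied average annual growth rate between 8,000 B.C. and 1 A.D.
start_pop = 5_000_000      # assumed population at 8,000 B.C. (Haub's benchmark figure)
end_pop = 300_000_000      # population at 1 A.D., as quoted above
years = 8_000

rate = (end_pop / start_pop) ** (1 / years) - 1
print(f"Average annual growth rate: {rate * 100:.4f}%")   # roughly 0.05% per year
```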

By 1650, world population rose to about 500 million, not a significant increase over the 1 A.D. estimate. Numbers were kept in check by the Black Plague which potentially killed 100 million people. By 1800 however, the world population had passed the 1 billion mark, and has increased rapidly to the current 6 or so billion.

The graph below shows another analysis which corroborates the estimates of Haub. The curve exhibits a long tail or power law with a steep increase in population from about the 18th century.

[Image: world population growth curve over time]

Looking at the curve above you may be tempted to believe the urban legend that 75% of all people ever born were in fact alive in the 70s. Haub reports that in 2002 just under 6% of all people ever born were living. Or put another way, approximately 106 billion people had been born over all time, of which about 6 billion are currently living.

The key to this conclusion is the extremely high birth rate (80 per thousand) required to keep humans from becoming extinct between 50,000 B.C. and 1 A.D. According to Haub there had been about 46 billion births by 1 A.D., but only a small fraction of those people were alive at any one time. That is, the vast majority of people who have been born have also died.

How good was your estimate to the original question?

Interestingly, WolframAlpha returns the correct answer to the question. My own estimate was closer to 30%–40% of all people being alive today, certainly nowhere near 6%.

Sunday, June 7, 2009

Technical overviews at Compass Security

Compass Security is a Swiss security firm specialising in ethical hacking and penetration testing. Amongst other offerings, they have a long list of freely available quality publications that present technical overviews of security attacks and tools, as well as many other technologies from a security point of view. There are probably 100 or so articles, published between 2004 and last year. I came across the site last year when I was researching USB security and found the Compass paper on U3 Insecurity.

Please note that some of the publications are in German only.

Saturday, June 6, 2009

Great Security Mind Maps at MindCert

I often use mind maps to organise my thoughts for longer posts, with FreeMind being my tool of choice. Some of these maps I have published in the FreeMind gallery using their Flash plug-in.

I found some great security mind maps at MindCert, a company committed to “bring[ing] you the best in Mind Mapping with a view to gaining knowledge in the IT Industry and passing IT Certifications from Cisco, CISSP, CEH and Others”. The MindCert homepage currently has mind maps for Certified Ethical Hacking, Cracking WEP with Backtrack 3 and Cryptography for CISSP (plus others).

Googling “security mind maps” brings up a wealth of links. So it seems many people are trying to understand security in terms of mind maps.

Thursday, June 4, 2009

A Risk Analysis of Risk Analysis

The title of this post is taken from a sobering and sensible paper published last year by Jay Lund, a distinguished professor of civil engineering at the University of California (Davis), who specialises in water management. The paper presents a discussion of the merits of Probabilistic Risk Assessment (PRA), which is a “systematic and comprehensive methodology to evaluate risks associated with a complex engineered technological entity”. PRA is notably used by NASA (see their 320 page guide) as well as essentially being mandated for assessing the operational risks of nuclear power plants during the '80s and '90s.

Professor Lund’s views are derived from his experiences in applying PRA to decision-making and policy-setting for engineering effective water management, as well as from teaching PRA methods. His paper starts with two propositions: (1) PRA is a venerated collection of mathematically rigorous methods for performing engineering risk assessments, and (2) PRA is rarely used in practice. Given the first proposition he seeks to provide some insight into the “irrational behaviour” that has led to the second proposition. Why don’t risk assessors use the best tools available?

Discussions on the merits of using modeling and quantitative risk analysis in IT Security flare up quite regularly in the blogosphere. Most of the time the discussions are just storms in HTML teacups – the participants usually make some good points but the thread rapidly peters out since both the detractors and defenders typically have no real experience or evidence to offer either way. So you either believe quant methods would be a good idea to use or you don’t. With Lund we have a more informed subject who understands the benefits and limits of a sophisticated risk methodology, and has experience with its use in practice for both projects and policy-setting.

Know Your Decision-Makers

After a brief introduction to PRA, Lund begins by providing some anecdotal quotes and reasoning for PRA being passed over in practice.

People would rather live with a problem that they cannot solve than accept a solution that they cannot understand.

Decision-makers are more comfortable with what they are already using. As I was once told by a Corps manager, “I don’t trust anything that comes from a computer or from a Ph.D.”

“Dream on! Hardly anyone in decision-making authority will ever be able to understand this stuff.”

PRA is too hard to understand. While in theory PRA is transparent, in practical terms, PRA is not transparent at all to most people, especially lay decision makers, without considerable investments of time and effort.

So the first barrier is the lack of transparency in PRA to the untrained, who will often be the decision-makers. There is an assumption here that risk support for decisions under uncertainty can be provided in the form of concise, transparent and correct recommendations – and PRA is not giving decision makers that type of output. But I think in at least some cases this expectation is unreasonable. For some decisions there will be a certain amount of inherent complexity and uncertainty which cannot be winnowed away for the convenience of presentation. I am not sure, for example, to what extent the risks associated with a major IT infrastructure outsourcing can be made transparent to non-specialists.

The next few comments from Lund are quite telling.

People who achieve decision-making positions typically do so based on intuitive and social skills and not detailed PRA analysis skills.

Most decisions are not driven by objectives included in PRA. Decision-makers are elected or appointed. Being re-elected can be more important than being technically correct on a particular issue. Empirical demonstration of good decisions from PRA is often unavailable during a person’s career.

So decision-makers are usually not made decision-makers based on their analytical skills, and what motivates such people may well be outside of the scope of what PRA considers “useful” decision criteria. Developing a methodology tailored to solving risk problems, in isolation from the intended decision-making audience, is counter-productive.

And here is the paradox as I see it

A poorly-presented or poorly-understood PRA can raise public controversy and reduce the transparency and credibility of public decisions. These difficulties are more likely for novel and controversial decisions (the same sorts of problems where PRA should be at its most rigorous).

So for complex decisions that potentially have the greatest impact in terms of costs and/or reputation, in exactly the circumstances where a thorough risk assessment is required, transparency rather than rigour is the order of the day.

Process Reliability

Lund notes that PRA involves a sequence of steps that must each succeed to produce a reliable result. Those steps are problem formulation, accurate solution to the problem, correct interpretation of the results, and then proper communication of the results to stakeholders or decision-makers. In summary then we have four steps: formulation, solution, interpretation and communication. He asks

What is the probability that a typical consultant, agency engineer, lay decision-maker, or even a water resources engineering professor will accurately formulate, calculate, interpret, or understand a PRA problem?

He makes the simple assumption that the probability of each step succeeding is independent, which he justifies by saying that the steps are often segregated in large organizations. In any case, he presents the following graph which plots step (component) success against overall success.

[Image: plot of per-step success probability against overall PRA success]

Lund describes this as a sobering plot since it shows that even with 93% success at each step, the overall PRA succeeds only 75% of the time. When the step success is only 80%, the PRA success drops to just 41% (not worth doing). We should not take the graph as an accurate plot but rather as showing the perhaps non-intuitive relation between step (component) success and overall success.
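The relationship is easy to reproduce; a minimal sketch assuming, as Lund does, four independent steps:

```python
# Overall PRA success when each of the four steps (formulation, solution,
# interpretation, communication) succeeds independently with probability p.
for p in (0.80, 0.90, 0.93, 0.95, 0.99):
    overall = p ** 4
    print(f"per-step success {p:.2f} -> overall success {overall:.2f}")
# 0.93 per step gives roughly 0.75 overall; 0.80 per step gives roughly 0.41.
```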

A Partial PRA Example

Lund also describes an interesting example of a partial PRA, where deriving a range of solutions likely to contain the optimal solution to support decision-making is just as helpful as finding the exact optimal solution. The problem he considers is straightforward: given an area of land that has a fixed damage potential D, what is the risk-based optimal height of a levee (barrier or dyke) to protect the land which minimizes expected annual costs? The graph below plots the annual cost outcomes across a wide range of options.

[Image: plot of annual costs and overtopping recurrence period against levee height]

There are three axes to consider – one horizontal (the levee height), a left vertical (annual cost) and a right vertical (recurrence period). Considering the left vertical at a zero height levee (that is, no levee), total annual costs are about $850 million or the best part of a billion dollars damage if left unaddressed. Considering the right vertical, for a 20m levee, costs are dominated by maintaining the levee and water levels exceeding the levee height (called an overtopping event) are expected less than once per thousand years.

The recurrence period states that water levels reaching a given height H will be a 1-in-T year event, which can also be read as saying that the probability of the water level reaching H in any one year is 1/T. For a levee of less than 6m in height there is no material difference between the total cost and the cost of damage, which we can interpret as small levees being cheap and an overtopping event likely.
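To get a feel for the shape of this trade-off, here is a toy Python sketch of the optimisation; the cost function and exceedance probabilities are invented for illustration and are not Lund's data, so the numbers will not match his 14.5m optimum:

```python
import math

D = 850_000_000              # assumed damage from an overtopping flood (about the $850M quoted above)

def levee_cost(h):           # assumed annualised cost of building and maintaining an h-metre levee
    return 4_500_000 * h

def overtop_prob(h):         # assumed annual probability that water exceeds height h
    return math.exp(-h / 2.0)

def expected_annual_cost(h):
    return levee_cost(h) + overtop_prob(h) * D

best = min(range(0, 21), key=expected_annual_cost)
for h in sorted({0, 5, 10, best, 20}):
    print(f"height {h:2d} m: ${expected_annual_cost(h) / 1e6:7.1f}M per year, "
          f"overtopping roughly 1-in-{1 / overtop_prob(h):,.0f} years")
print(f"risk-optimal height in this toy model: {best} m")
```

Even with made-up figures the curve has the same character: near the optimum the total cost is flat, so a metre either way changes very little, which is Lund's point below.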

At 8m - 10m we start to see a separation between the total and damage cost curves, so that the likelihood of an overtopping event is decreasing and the levee cost increasing. At 14m, levee costs are dominant and the expected annual damage from overtopping seems marginal. In fact, the optimal solution is a levee of height 14.5m, yielding a recurrence period for overtopping of 102 years. Varying the levee height by 1m around the optimal value (either up or down) gives a range of $65.6 - $66.8 million for total annual costs. Lund draws some excellent conclusions from this example:

a) Identifying the range of promising solutions which are probably robust to most estimation errors,

b) Indicating that within this range a variety of additional non-economic objectives might be economically accommodated, and

c) Providing a basis for policy-making which avoids under-protecting or over-protecting an area, but which can be somewhat flexible.

I think that this is exactly the type of risk support for decision-making that we should be aiming for in IT Risk management.

Last Remarks

The paper by Professor Lund is required reading at only 8 pages. PRA he surmises can be sub-optimal when it has high costs, a potentially low probability of success, or inconclusive results. His final recommendation is to reserve its use to situations involving very large expenditures or having very large consequences, large enough to justify the kinds of expenses needed for a PRA to be reliable. Note that he does not doubt PRA can be reliable but you really have to pay for it. In IT risk management I think we have more to learn from Cape Canaveral and Chernobyl than from Wall Street.

Wednesday, June 3, 2009

Another vote of confidence for Whitelisting

McAfee recently announced its successful acquisition of Solidcore Systems, a provider of dynamic whitelisting technology. McAfee now claims to have “the first end-to-end compliance solution that includes dynamic whitelisting and application trust technology, antivirus, antispyware, host intrusion prevention, policy auditing and firewall technologies”.

I recently posted on the undeniable logic of whitelisting. Malware writers are shifting away from the mass distribution of a small number of threats to the micro distribution of millions of targeted threats. This strategy is eroding the effectiveness of classic malware detection through signature analysis. As Edward Brice of Lumension remarks, the “block and tackle” approach of blacklisting is not sustainable.

Partially to defuse whitelisting hype, and also to preserve current revenue models, we can expect to see a transition towards hybrid offerings that combine black- and whitelisting, such as the teaming up of Bit9 and Kaspersky. McAfee states that

As businesses seek to more easily mitigate the risks associated with vulnerable or malicious applications downloaded by employees, they will extend beyond signature-based anti-malware with the addition of dynamic whitelisting and application trust technology.

Whitelisting is poised to become the protagonist in the cast of end-point security technologies.

Tuesday, June 2, 2009

The Boys are Back in Town – the return of L0PHTCRACK

The famous password cracking tool L0PHTCRACK (or LC for short) is now back on the market, upgraded to LC6. The tool was acquired by Symantec as part of its 2004 purchase of @Stake, which had developed the tool for audit and consulting engagements.

Symantec ended formal support for LC in 2005. However, a clause in the acquisition contract gave the developers an option to reacquire the tool in such circumstances, and several former members of the hacker high priest group L0pht Heavy Industries have exercised that option. The new site for LC6 is here, where it is marketed as a password audit and recovery tool.

The basic version of LC6 at $295 (called Professional) includes password assessment & recovery, dictionary and brute force support, international character sets, password quality scoring, Windows & Unix support, remote system scans and executive reporting. The Professional version supports up to 500 accounts. Other versions at double or four times the cost support an unlimited number of accounts and can be installed on multiple machines.

Note that LC6 is subject to United States export controls.

Monday, June 1, 2009

Going, Going … Gone! Auctioning IT Security Bugs


One day in 1947 the Mark computer at the Harvard Computer Laboratory failed, and the cause was traced back to a bug (specifically a moth) found in one of its electromechanical relays. It was already common for engineers to attribute inexplicable defects to “bugs in the system”, but the Mark team had now found real evidence of one of these mythical bugs. The bug was taped into the Mark log book entry for posterity, and soon all computer defects were referred to as bugs. While the Mark story has passed into computer folklore, the bugs themselves have not.

Most bugs today occur in software, and part of the reason for this is sheer size. Windows XP has 40 million lines of code, which if printed out on A4 paper would stretch for over 148 kilometres when laying the pages end-to-end. If you wanted to check Windows Vista, you would have a further 37 kilometres of reading, and a further 180 kilometres for the Mac operating system. When software becomes this large and complex it is not surprising that some bugs are introduced.
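The arithmetic is easy to check; a quick sketch assuming roughly 80 printed lines per A4 page (the print density is my assumption, not a figure from the original claim):

```python
# Kilometres of A4 paper, laid end to end, needed to print a code base.
LINES_PER_PAGE = 80      # assumed print density
A4_HEIGHT_M = 0.297      # height of an A4 page in metres

def paper_km(lines_of_code):
    pages = lines_of_code / LINES_PER_PAGE
    return pages * A4_HEIGHT_M / 1000

print(f"Windows XP (40M lines): {paper_km(40_000_000):.0f} km")   # roughly 148 km
```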

Security bugs, also referred to as vulnerabilities, potentially permit an attacker to bypass security mechanisms and gain unauthorized access to system resources. Security bugs normally do not interfere with the operation of software and typically only become an issue when a bug is found and further exploited in an attack. The Zotob worm that infected many high profile companies in 2005 exploited a vulnerability that had gone undetected since the release of Windows 2000. In May 2008 a 25-year old bug in the BSD file system was discovered and fixed.

How are vulnerabilities found? Well software vendors uncover many bugs themselves, but a significant number of vulnerabilities are also found by third parties such as security software specialists, researchers, independent individuals and groups, and of course, hackers. Software vendors encourage these bug hunters to report their findings back to them first before going public with the vulnerability, so that an appropriate patch can be prepared to limit any potential negative consequences. In the computer security industry this is called “responsible disclosure”. For bug hunters that abide by this principle, the software vendors are normally willing to give kudos for finding vulnerabilities, and in some cases, also provide a nominal financial reward.

Over time bug hunters have started to become more interested in the financial reward part rather than the kudos. For example, in August last year a lone security researcher was asking €20,000 for a vulnerability he had found in a Nokia platform. Was this a good deal? What monetary value do you place on finding such a vulnerability? Software vendors want to produce patches for vulnerabilities in their products as soon as possible, while hackers want to produce code exploiting a known vulnerability as soon as possible. Considering these alternatives, how much should the bug hunter be paid for his or her services? Are low financial rewards from software companies driving bug hunters into the arms of hackers? Perhaps a market could efficiently create a fair price.

Charlie Miller certainly thinks so. The ex-NSA employee described his experiences in selling vulnerabilities in an interesting paper, and you can read a quick interview at Security Focus. Miller found it hard to set a fair price and had to rely significantly on his personal contacts to negotiate a sale – contacts that are unlikely to be available to most security researchers.

In 2007 a Swiss company called WSLabi introduced an online marketplace where security bugs could be advertised and auctioned off to the highest bidder. This latest experiment in security economics appears to have lasted about 18 months and the WSLabi site is now unavailable. The marketplace started running in July 2007, and in the first two months 150 vulnerabilities were submitted and 160,000 people visited the site. Of those sales that were publicly disclosed, the final bids ranged between €100 and €5,100. An example list of advertised vulnerabilities on the marketplace is shown below.

[Image: example list of vulnerabilities advertised on the WSLabi marketplace]

The WSLabi marketplace was controversial since many security people believe that hackers, or more recently, organized digital crime, should not have early access to vulnerabilities. In 2005 eBay accepted a critical vulnerability in Excel as an item to be auctioned, but management quickly shut down the auction saying that the sale would violate the site's policy against encouraging illegal activity. "Our intention is that the marketplace facility on WSLabi will enable security researchers to get a fair price for their findings and ensure that they will no longer be forced to give them away for free or sell them to cyber-criminals," said Herman Zampariolo, head of the auction site. In another interview Zampariolo noted that the number of new vulnerabilities found in developed code could, according to IBM, be as high as 140,000 per year.

Unfortunately the marketplace experiment was unexpectedly cut short. In April 2008 the WSLabi CEO Roberto Preatoni blogged about his regrettable arrest and the damage this did to trust in WSLabi as a broker, notwithstanding being Swiss! Six months later in a public interview Preatoni remarked that "It [the marketplace] didn't work very well … it was too far ahead of its time". Apparently security researchers recognised the value of having an auction site like WSLabi, but very few buyers proved willing to use the site. Buyers seemed to value their privacy more than a fair price. Preatoni is now directing his company’s efforts towards using zero-day threat information for the development of unified threat hardware modules – which you can read about in the last WSLabi blog post from October 2008.

Sunday, May 31, 2009

Two Monthly Blogging Milestones

Just a short note to say that May 2009 was my first month with more than 1,000 visitors and over 2,000 page views. A big step forward for my blog – thank you all!  It seems that I owe a lot of the additional attention this month to my post announcing a new attack on SHA-1.

[Image: monthly visitor and page view statistics for the blog]

Sunday, May 17, 2009

The Sub-Time Crisis in Web 2.0

A few months ago I came across Steve Rubel's post on Attention Crash, where he predicts an imminent bursting of the Web 2.0 information bubble since there is no Moore's Law of exponential growth for our attention. This was a provocative remark, which I worked through more thoroughly in Moore's Lore and Attention Crash. Actually, I spent a few days off my laptop scribbling down various ideas and observations on paper, whatever came to mind. I surprised myself by filling 12 pages in a few hours on a Sunday afternoon and a few train rides. Some of the smaller asides will find their way into other posts, like The Restaurant at the End of the Web. Here I will continue with the Attention Crash theme.

The worst case scenario for Web 2.0 is that we are heading for a singularity, precipitated by dividing our attention into informational units effectively rated at zero content. Using available streaming and aggregation tools we can quickly and effortlessly create an overwhelming flood of information. It is clear that we are building social computing structures that we do not comprehend. Social computing joins two areas that have the potential for massive scaling - computing technology and social interaction. The Sub-Time crisis lies at this intersection. In fact it is the intersection.

After reading Rubel's original post I subscribed to his blog and even his FriendFeed, and soon found that while he is concerned about Attention Crash, he is also a major contributor to the pot of Web 2.0 tidbits (links, comments, posts, pictures and so on). He is typical of other Web 2.0 luminaries who produce content but then also produce “content” on there being too much content.

But is there really a coming Attention Crash? Well certainly not an informational equivalent of the Great Crash of 1929 or the Subprime crisis we are currently experiencing. If there is such a thing it will be very unevenly distributed and localised. It is clear that certain people are going to get rain on their personal web parades, but this is not a global deluge.

When I was a boy my father regularly asked me “stimulating” questions. He belonged to the generation of Australian men for whom intense argument over inconsequential matters was considered both relaxing and invigorating. Apart from the perennial favourite of "How long is a piece of string?", usually recounted when someone asked how much a reasonable quality car costs (I come from a long line of car people), and "Did you know you can sink a tractor in a lake of average depth of 1 inch?", he did ask me one time "What is the difference between a crash and a collision?" And the answer is that a collision involves two moving objects (say two cars) while a crash involves one moving object and one stationary object (say a car hitting a tree).

So when we speak of Attention Crash we may think of our information intake hitting up against our stationary capacity to absorb it. This capacity varies from person to person, but there are always hard bounds.

But as with the Great Crash, Rubel is probably using Attention Crash to mean a general or systemic collapse - an event that is widespread, foundational and severe. But the coming Attention Crash will not be of this form - there will be enough localised warnings. Unlike our current financial structures, the web is not under threat of collapse, though many foot soldiers will fall by the wayside (they see themselves as pioneers, but in fact they are easily replaced). Our informational structures are not hierarchical but relational, and as such, are much more resilient to the removal of individuals. It is not the case that there are eager underlings waiting to replace leaders – the underlings are here and functioning already.

Web 2.0 losses will largely go unnoticed. New users/readers, whose information experience begins today, are being added constantly. They are essentially unaware of what happened last month and will remain that way. Joining is not a generational wait, no corporate ladder to be climbed. Everyone joins and progresses simultaneously. This turnover goes largely unnoticed since leavers are dwarfed by joiners.

Returning to Rubel, his tips on handling Web 2.0 overload are not very helpful - know what you want, be selective, apply the same ruthlessness that you do to your inbox. In short, organise your way out of the Web 2.0 glut. But this advice is nothing much more than platitudes. If at some point you used to receive 20 emails a day, whatever method you used to process those messages is probably not going to help you when you start receiving 50, 100 or 200 emails a day. But perhaps we are not really talking about scale here. If you buy a car for 4 people and then try to transport 50, this is not a matter of scale but just simply miscalculation.

The tools we have are simply the wrong ones. But ironically we seem to rely on Web 2.0 to solve the problem it has created. This is surely like hoping that the fast food industry will take steps to help their customers lose weight. The point is that we face a disinterested informational adversary that, for the foreseeable future, operates in a scale-free environment. In the information battle we have some sharp spears with occasional impressive horseback riding. However our adversary has the power to create a data mushroom cloud. Your best fallout shelter is abstinence.

Saturday, May 16, 2009

Rethinking Thresholds for Account Lockouts

One of my colleagues informed me that Conficker caused quite a few password lockouts of administrator accounts at his company. The worm used a list of about 200 strings to perform a quick password-guessing attack on privileged accounts. Unless an account had no lockout policy set, it would end up either compromised or locked out by Conficker. We can add DoS to the list of Conficker achievements.

But it’s not just Conficker that is locking users out of their accounts – in fact, users can do that all by themselves. We all know that quite a few help desk calls are simply requests for password resets of infrequently used applications, or even for frequently used applications where our recall has inexplicably failed us. NetWrix estimates that 30% of all help desk calls are for password resets. One is tempted to think that security policy is more concerned with keeping help desk people occupied than realistically addressing password guessing attacks.

Automatic account lockout is a counter-measure (or control in modern security parlance) for detecting and preventing password guessing attacks. Setting the account lockout threshold to a small number such as 3 or 5 attempts is part of the conventional wisdom of security. The fewer permitted password guesses, the better. But we need to strike some balance here between our own unreliable memories and the persistence of hackers.

As is often the case, a little notation will help the discussion. Let’s assume that a policy defines N possible passwords, which we may represent as $P_1, P_2, \ldots, P_N$.

A given user U will select these passwords according to their personal preferences, and let the probability distribution $p_1, p_2, \ldots, p_N$ (where $p_i$ is the probability that U chooses $P_i$)

denote these preferences. Lastly let’s assume for simplicity that the passwords are ordered such that $p_1 \ge p_2 \ge \cdots \ge p_N$,

which just means that password $P_1$ is the most likely choice of the user with probability $p_1$, password $P_2$ is the next most likely choice with probability $p_2$, and so on.

Now if we have a 3-strikes-you’re-out lockout policy then what does this mean in terms of our probabilities? Well, assuming the attacker follows the preferences of the user, then the policy states that we are prepared to live with three password guesses with a success probability of

$$p_1 + p_2 + p_3,$$

but not with four password guesses with a success probability of

$$p_1 + p_2 + p_3 + p_4.$$

So the critical value here is $p_4$ since it tips the scale from what is acceptable to what is not acceptable. We can represent this state of affairs in the diagram below.

[Image: diagram of the acceptable first three guess probabilities against the unacceptable fourth]

But are our password policies really so brittle that we cross a security threshold from allowing 3 to 4 password guesses? I don’t think so. There is a threshold but it is certainly higher than 3 guesses.

I recently blogged about the approach taken by NIST to this issue. Their formulation of the problem was to find the smallest value of M for which

$$\sum_{i=1}^{M} p_i \ge 2^{-k},$$

which leads to the following graph

[Image: graph from the NIST threshold calculation]

That is, the NIST approach is to tolerate M password guesses as long as the likelihood of success is less than the threshold 2^{-k}. The particular values chosen by NIST were k = 10 (roughly 1 in 1,000) and k = 14 (roughly 1 in 16,000), depending on the desired level of security. It is challenging to compute the exact value of M but NIST has some estimates based on deploying policies which guarantee a minimum amount of entropy in user passwords (see Appendix 1 of this document).
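Here is a small Python sketch of the NIST-style calculation; the password preference distributions below are invented purely for illustration and are not NIST's figures:

```python
def max_tolerable_guesses(probs, k):
    """Largest M such that the M most likely passwords together have probability below 2**-k."""
    threshold, total, m = 2.0 ** -k, 0.0, 0
    for p in sorted(probs, reverse=True):   # attacker tries passwords in order of popularity
        if total + p >= threshold:
            break
        total += p
        m += 1
    return m

# Invented example: a weak policy where the single most popular password is chosen
# by 1% of users, with the remainder spread evenly over about a million alternatives.
weak_policy = [0.01] + [0.99 / 999_999] * 999_999

for k in (10, 14):
    m_weak = max_tolerable_guesses(weak_policy, k)
    m_uniform = 2 ** (30 - k)   # near-uniform choice over 2**30 passwords, computed directly
    print(f"k = {k}: weak policy tolerates {m_weak} guesses, "
          f"a near-uniform 30-bit policy tolerates about {m_uniform:,}")
```

The point of the sketch is that with realistically skewed user preferences even a single online guess can exceed the risk threshold, which is exactly why NIST couples the calculation to policies that guarantee a minimum amount of password entropy.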

AES-256 and Reputational Risk

Reputational risk is something that everyone understands, particularly businesses who regard their brand as one of their most critical assets. As there is considerable trust in the security of AES-256 both in the public and commercial sectors, reputational risk to AES-256 has a very high impact, and we therefore hope, a very low likelihood of occurrence.

But I will argue in this post that the likelihood of reputational damage to AES-256 is far from low, and perhaps even quite high. AES-256 keys are so large that it is next to impossible to argue that a cryptanalyst will not be able to find some shortcut to recovering keys that saves time over exhaustive search – simply because the cryptanalyst has so much time to play with.

For example, an attack that requires 2^{200} operations is a “huge” saving over the roughly 2^{255} operations needed on average to exhaustively search a 256-bit key space. But is a cryptanalyst with a computational arsenal of 2^{200} operations any more of a credible threat than one armed with 2^{255} operations? I think not. But the 2^{200} attack will seem quite menacing since it “saved” a massive factor of 2^{55} operations, and there will be reputational damage to AES-256.

In short 256-bit keys are simply not defensible within the framework of traditional notions of security. I have been intending to post on this topic for some time, and thanks to Eric Rescorla’s post for prompting me to action.

Let's start by talking about absolute and relative breaks on ciphers. Absolute breaks (like the name suggests) signal the end-of-life for a cipher, while relative breaks signal improvements over exhaustive search. Reputational risk for a cipher starts gaining momentum in the latter case.

Absolute Breaks

By an absolute break we mean, for a family of ciphers or for a cipher at a given key length, a known method to efficiently recover keys. We can measure efficiency in several ways. RSA, like other public key systems, represents a family of ciphers since key generation can be scaled to an arbitrary size. RSA keys can have lengths of 512, 1024 or 2048 bits, and even larger lengths in principle. In practice computational cost is the limiting factor that prevents public keys from becoming arbitrarily large. So to break a public key cipher in some absolute sense we require an attack that works efficiently for all potential key sizes. Or put another way, an absolute break prevents the cipher designer from simply escaping the attack by increasing the key length.

Many of the early public key systems based on the knapsack problem (including the famous Merkle-Hellman system) were broken efficiently for all key sizes by applying the LLL basis reduction algorithm (you can read a detailed history here). Knapsack ciphers have essentially disappeared from the cryptographic landscape, as their reputation has been damaged beyond repair. Another example of an absolute break is Shor's algorithm, which, assuming a large scale quantum computer, can factor an RSA N-bit modulus in time proportional to N^3 operations. Since doing an RSA decryption today requires N^3 bit operations we can say that Shor's algorithm breaks RSA in time proportional to what it costs to set up an SSL session today. Now we may be very far away from having quantum computers, but even the remote threat of such a device has caused enduring reputational damage for popular public key ciphers in use today.

Absolute Breaks for Symmetric Ciphers

The other way to demonstrate an absolute break of a cipher is to show that a fixed key of a given length can be recovered in a short amount of elapsed time. So the 56-bit key length of DES was conclusively shown to be too short in 1998 by the EFF DES Cracker device which demonstrated key recovery in 56 hours (a bit an hour!). When we have a symmetric key cipher with a fixed key length, then this is what an absolute attack usually means – key recovery in a few weeks or months (or even days) when the expected time was thousands of years. When such an attack is successful the cipher is deemed broken or we must move to a variant, or an entirely new cipher with a longer key length.

Relative Breaks

Well-designed ciphers will typically not succumb to absolute breaks (barring a breakthrough attack), but even the best ciphers have some blemishes which reduce the effort of brute forcing key recovery. By a relative break we mean an attack which is faster than exhaustive search of the key space (the set of all possible keys), but does not break the cipher in any absolute sense. Relative breaks show that the implied margin of security for the key length is not as large as we thought. But what is the implied margin of security?

When a symmetric key cipher has a key length of n bits this is interpreted to mean that, in the absence of any weaknesses, the security of the cipher is proportional to the cost of a brute force attack on a key space of size 2^{n}. That is, a needle-in-a-haystack search problem where there are 2^{n} pieces of hay. In the worst case an attacker will have to try all possible keys to find the right one (he guesses the right key last) but on average he will need only examine half the possible keys since

$$\frac{2^{n}}{2} = 2^{n-1}.$$

So we lose one bit of security from the key length in the average case of exhaustive search. Many attacks on symmetric ciphers are of the relative type, identifying a weakness which can be translated into some computational saving over exhaustive search. Let's translate the saving into R bits over the average case of exhaustive search, so that the attack requires $2^{n-1-R}$ operations.

The question here is how large should R be before we think a relative attack is worth worrying about?

In the case of DES the complementary property gave us R = 1 for free. In the 70's, when DES was essentially the only commercial cipher in town, shaving off just a few bits was significant. This was true because many people thought that a 56-bit key was already teetering on the brink of feasible recovery, and saving a few more bits would definitely tip the scale towards an absolute break. But while saving say 5 bits for DES was significant, the same 5 bits of cryptanalytic advantage means less for a 128-bit cipher and practically nothing for AES-256. Nothing more than a mosquito hitting a bullet train.

Relative breaks can point to small cracks in the design of a cipher which can potentially be developed into absolute breaks. This is what we are now seeing with SHA-1, where an initial saving of just a few bits in the cost of finding collisions has been developed into serious attacks. The relative is on the verge of being converted to the absolute.

This is the period of reputational damage for a cipher, since for a given relative attack we cannot be certain if it will be converted into an absolute break over time. In February last year a new attack on the 64-bit key space of the A5/1 cipher used in GSM devices was announced, with some fanfare, at a Black Hat conference. The attack claimed to provide a relative break by pre-computing a special “rainbow table” of size 2^{58}, but the promised details have yet to materialize.

Relative breaks for AES-256

Let us now turn specifically to AES-256. AES-256 should be secure against an attacker who has materially less than 2^{255} ≈ 10^{76} resources at their disposal. Since 10^{76} is within a few powers of 10 of the estimated number of atoms in the known universe, it is essentially impossible for AES-256 to succumb to pure computational attacks on its key space.

So ruling out absolute breaks (in the absence of discovering a major weakness in AES), this only leaves relative breaks for consideration. Again, how big should the relative computational saving R be before we have a serious or material relative break?

Well let’s be generous and assume that when R = 55 we have a material relative break. This means that the cryptanalyst has saved a computational effort on the order of recovering a DES key. Even with this saving the cryptanalyst still needs 2^{255-55} = 2^{200} operations to break an AES-256 key.

You could read the headlines now: “AES-256 Broken - researcher knocks off 17 zeroes from exhaustive search”, where 2^{55} is approximately 10^{17}. But would this result really imply a meaningful threat? All that has been done is to change an absurdly large computational task into a slightly less absurdly large computational task.

We are ceding too much to the cryptanalyst here by giving them a massive 2^{200} sandbox for toying around with how to break AES-256. Every scrap of current computing power in the world running for a year would amount to less than 2^{100} operations, and we are postulating that our cryptanalyst has access to 2^{100} times that power, or over 10^{28} centuries of continuous computation at current rates.
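For a sense of scale, here is a rough sketch; the global capacity figures are loose assumptions of mine, not measured values:

```python
import math

devices = 4e9                # assumed number of machines worldwide
ops_per_second = 1e10        # assumed sustained operations per second, per machine
ops_per_year = devices * ops_per_second * 3.15e7   # seconds in a year

print(f"assumed global capacity: about 2**{math.log2(ops_per_year):.0f} operations per year")
print(f"even granting a full 2**100 operations per year, 2**200 operations takes "
      f"{2 ** 100:.1e} years")
```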

Something is wrong here in the analysis – no one will have 2^{200} resources up their sleeve. The problem with a 256-bit key is that it can shed 55 bits and still lose no real effective security. So the reputational risk is that someone emerges from this sandbox with a 2^{200}-idea that has credence.

Managing the reputation of AES-256

Google Trends is a service which estimates search and news activity for a given keyword. For “AES 256” the graph below was produced

[Image: Google Trends activity for "AES 256"]

It is evident that there has been a considerable increase in search and news activity since 2007. The letters in the upper graph are links to news stories, which are mainly product announcements for AES-256 support. Vendors are certainly adopting AES-256, and in a previous post I wrote that

… AES-256 is being widely deployed since it conveniently lies at the intersection of good marketing and pragmatic security. In upgrading from AES-128 to AES-256 vendors can legitimately claim that their products use maximum strength cryptography, and key lengths can be doubled (thus squaring the effort for brute force attacks) for a modest 40% performance hit.

However the only possible security threat which would warrant the deployment of 256-bit keys is the existence of large scale quantum computing devices. If this were the case then there is a quantum algorithm (Grover's) that effectively reduces the key space to 128 bits. But if this is the case, then we will be living in a very different world from the one we inhabit now.

Imagine you posed the following question to a group of top physicists. You asked them to present you with ideas for new research projects where they could assume that the budget included all the money that we have, all the money that has ever been, and the total financial assets of the world for the next 10 million years. Would the resulting proposals be credible?

AES-256 puts cryptanalysts on the same research agenda.



Saturday, May 9, 2009

Password Roundup #2

Password issues have remained a regular item in security news over the last few weeks. In this roundup I report on the fruits of password harvesting, a new policy from NIST, a kerfuffle between Elcomsoft and PGP, and lastly, how to pass the hash.

Unveiling Salma Hayek

Graham Cluley reported that the public email account of Salma Hayek was hacked, leading to screen shots of her messages being published along with other private details. Allegedly the hack was achieved by resetting her password after “guessing” her date of birth and the name of her most famous film role. Not exactly examples of “high entropy” information.

Fruits of the Password Harvest

Techradar reports on the exposure of The Great Password Scandal. The incident involved a site called My Name is E that lured new users with the promise to integrate multiple social networks if they handed over enough passwords. And people did, spurred on by recommendations on Twitter. But those tweets were actually authored by My Name is E itself, using harvested Twitter passwords and the autotweet function as a viral marketing vector. The lead developer from My Name is E claims that this was a development feature which was mistakenly left activated in the production version of the site.

A precedent is being established by well-known social networking sites to ask people to supply their email username and password so that their contacts may be automatically added as friends. Sites following this approach include Get Satisfaction, LinkedIn, Yelp, Plaxo, Ning, FriendFeed, Orkut, iLike, MySpace and Facebook. Users are being sent the message that it’s OK to hand over these credentials to simplify your social networking experience. Both Twitter and My Name is E know that OAuth is a better solution but they are not quite there yet. Joining up the social fabric of Web 2.0 still trumps security.

Some large scale harvesting was also reported by researchers from the University of California Santa Barbara, who infiltrated a botnet for 10 days earlier this year. The researchers have just published a paper on their findings where they report that they were able to access 70 GB of harvested data which included a massive 56,000 passwords.

NIST Password Policies

NIST has released a new draft document called Guide to Enterprise Password Management, SP 800-118. The 38-page document “provides recommendations for password management, which is the process of defining, implementing, and maintaining password policies throughout an enterprise. Effective password management reduces the risk of compromise of password-based authentication systems”. You can read a brief review of the document here.

Still with NIST, I recently blogged about their approach to making passwords harder to guess by using entropy arguments. At the E-Authentication site you can download a spreadsheet for calculating the entropy of explicit policies (as well as a few other interesting tools and documents).
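For the curious, here is a rough Python paraphrase of the Appendix A heuristic behind that spreadsheet; I have flattened a couple of the Appendix A details, so treat the numbers as indicative rather than as NIST's exact output:

```python
def nist_entropy_estimate(length, composition_rules=False, dictionary_check=False):
    """Rough entropy of a user-chosen password, paraphrasing SP 800-63 Appendix A.

    This simplifies some Appendix A details and may not match the NIST spreadsheet exactly.
    """
    bits = 0.0
    for position in range(1, length + 1):
        if position == 1:
            bits += 4.0        # first character
        elif position <= 8:
            bits += 2.0        # characters 2 to 8
        elif position <= 20:
            bits += 1.5        # characters 9 to 20
        else:
            bits += 1.0        # characters beyond 20
    if composition_rules:
        bits += 6.0            # bonus for requiring upper case and non-alphabetic characters
    if dictionary_check:
        bits += 6.0            # bonus for screening against an extensive dictionary
    return bits

print(nist_entropy_estimate(8))                                                  # 18.0
print(nist_entropy_estimate(8, composition_rules=True, dictionary_check=True))   # 30.0
```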

The Elcomsoft and PGP Kerfuffle

The Register reported an incident at the recent InfoSec 2009 conference where an Elcomsoft poster with the by-line "the only way to break into PGP" was removed as a result of an official complaint lodged by PGP. Unfortunately the PGP and Elcomsoft vendor stands were facing each other in the exhibit hall. "The sign was factually inaccurate and lies about PGP," said Jon Callas, CTO of PGP. "They're not breaking into PGP, they're doing password cracking. There's a difference”. But naturally, if a password is protecting your PGP private key, then that key is no better protected than the password itself. You can read more about the incident (and see the offending poster) on the Elcomsoft blog.

Pass the Hash

At the RSA 2009 conference the Register asked two security experts to rate the world’s most dangerous exploits. Near the top of the list for Ed Skoudis, a well-known security practitioner and author, was a powerful exploit that has evolved from an old attack known as pass the hash. Attackers exploit an unpatched browser or application vulnerability to capture a Windows password hash and then use it to create a valid login session on another machine as that user (for Windows authentication only the hash of the password is required, not the password itself). For the attack to be successful the hash must be injected into memory, and you can read about the details in this post from Marcus Murray.
