Thursday, January 29, 2009

Little trust in Corporate Blogs

The chart below shows the results of a recent online survey from Forrester, which found that only 16% of respondents trust information on corporate blogs. Personal blogs fare only slightly better at 18% (do you believe me?), and newspapers remain quite reliable at 48%. The clear winner, however, at 77%, is personal email from someone you know.

However, as expected, not everyone agrees with the results or what they seem to imply. Trust can still be earned, or lost, no matter who you are.

[Chart: Forrester survey results on trusted information sources]

Data Centric Security Model hot on Scribd

My paper on Data Centric Security has made it to the hotlist on Scribd, which means that the paper is getting quite a few hits of late. I blogged about this work in one of my first posts to No Tricks, back in September 2007. If you Google the topic you will find that IBM has taken the idea a lot further since this initial work, and I believe that a whole consulting practice has been established around the concept.

A Data Centric Security Model

Monday, January 26, 2009

Entropy and Anonymity article featured on Scribd

My paper on the entropy of traffic confirmation attacks, which I blogged about here, has been selected as a featured article on Scribd. There have been 120 new hits as a result, which is quite a few more than the original post received.

Entropy Bounds for Traffic Confirmation

Friday, January 23, 2009

Moore's Lore and Attention Crash

In his 2007 article on Attention Crash, Steve Rubel predicts an imminent bursting of the web 2.0 information bubble, since our attention does not scale in proportion to our inputs:

We are reaching a point where the number of inputs we have as individuals is beginning to exceed what we are capable as humans of managing. The demands for our attention are becoming so great, and the problem so widespread, that it will cause people to crash and curtail these drains. Human attention does not obey Moore's Law.

I assume Rubel is referring to the generic notion of scheduled exponential growth which Moore's Law has come to symbolize in the microprocessor industry, and more generally, the computing industry as a whole. The metaphor is more apt than perhaps the author originally intended.

Moore's Lore

Moore's Law, formulated in 1965, states that transistor density will double every 1 - 2 years. By 1970 Moore's Law was being interpreted as computer processing capability doubling every few years - a subtle shift from predicting an exponential decrease in component size to predicting an exponential increase in processing power. The latter knows no bounds, while the former operates in the physical world, dominated by fundamental constraints and hard limits.
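As a purely illustrative aside (not from Moore's paper; the starting component count and the two-year doubling period below are assumed figures), the exponential is easy to tabulate:

```python
# Illustrative only: project component counts under an assumed
# Moore's Law doubling period of two years.
BASE_YEAR = 1965
BASE_COMPONENTS = 64           # assumed starting component count
DOUBLING_PERIOD_YEARS = 2

def projected_components(year):
    """Projected components per chip in a given year under the assumed doubling."""
    doublings = (year - BASE_YEAR) / DOUBLING_PERIOD_YEARS
    return BASE_COMPONENTS * 2 ** doublings

for year in (1965, 1975, 1985, 1995, 2005):
    print(year, f"{projected_components(year):,.0f}")
```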

Moore's Law celebrated its 40th anniversary in 2005 with much fanfare. We can divide this history into three epochs, which we will call the Narrative, Prophecy and Legacy epochs. During the Narrative epoch, Moore's Law provided an accurate description of advances in the microchip industry - graphing the data followed his predicted curve. During the Prophecy epoch, Moore's Law became more of an industry mission statement than a narrative. The industry was actively planning and investing to fulfil Moore's Law, and it came to symbolize the advancement and promise of the industry as a whole. Ensuring the next doubling of processor speed became mission critical.

In the final epoch of Legacy, our epoch, the assumptions of Moore's Law are in decline. The reason is that chip makers are approaching the fundamental limits of increasing processing speed by crafting ever smaller components. The hardest limit is simply that component size bottoms out at the molecular or atomic scale - nothing smaller can be made that obeys the same physics. Further, issues with power consumption, chip cooling and production costs will invalidate the assumption that smaller components are the most cost-effective strategy for increasing processing capability. In fact, Mr. Moore's Intel is already delivering dual-core and quad-core processors, which increase computing power by adding more processing cores rather than by making a single core faster. The future then lies with more cores of a given complexity, not with a single core of ever-increasing complexity. So Moore's Law may effectively be maintained, but not for the reasons that Moore predicted.

Fundamental Attention Limits

When we start out connecting, subscribing or otherwise attaching ourselves to new web information sources, we are able to absorb a great deal of new information. We may even begin participating in the production of additional web content for distribution and consumption. We feel more informed, more empowered, and more enamoured with the promise of the omnipotent web. The web 2.0 narrative has worked its magic and we tacitly commit to a seemingly virtuous circle of information inflation.

After a honeymoon period we start to feel the burden of keeping up appearances on the web. We still subscribe to new feeds, extend our blogrolls, sign up for new services like Twitter, embed ourselves in social networks, fret about our followers and labour over posts. In our peripheral vision the virtuous circle is acquiring a vicious hue. The web 2.0 narrative becomes more belief and prophecy than tangible benefit. But belief and prophecy can be strong motivators, and we continue connecting, subscribing and participating in new web information sources.

After passing through the Narrative and Prophecy epochs, we finally reach the Legacy epoch, where our initial assumptions are invalidated. Just as Moore's assumption that computing components can keep shrinking exponentially eventually fails, so the assumption that our time and attention can be exponentially fragmented into ever smaller meaningful units collapses. While this took 40 years in the case of Moore's Law, we live on compressed and accelerating Internet time, and this lesson can be learnt from one birthday to the next.

Unfortunately as humans we don't have the luxury of switching to dual- or quad-core brains. We are stuck with one and its inherent capacity to process at a given rate of input and granularity. A Sub-Time Crisis is upon us and we need a bailout, or at least the presence of mind to opt out.


Thursday, January 22, 2009

The Restaurant at the end of the Web

Web 2.0 actively encourages information browsing habits akin to those of an "all you can eat" restaurant.

The proprietor guarantees that you won't gain weight and provides you with all the food you can eat, free of charge. You just pay your costs of getting to the restaurant (if any), and forgo the time you spend eating.

You have a stylish waitress by your side at all times, ready to serve you the next dish or drink. There is no tip, no closing hours, no booking required, and anything you want is on the menu.

If some delicacy appears to be missing, your waitress will take your wishes to the chef out in the kitchen and he will make it to order. If he is baffled, he will ask his other chef friends to help out. They will serve you up their best guess at what you want, and keep trying until you are satisfied. They have all the time in the world.

If you are overwhelmed by choice there is no need for alarm. Your waitress will gladly bring you an endless series of appetizers. In fact, you never need to proceed to a main meal.

And the tables are large enough to invite anyone you know, and if you so desire, anyone you don't know as well.

The only catch is that you need to get through a little advertising material next to your menu, and perhaps chat to a few other customers. You hardly notice the inconvenience.

Twitter as your Personal Content Proxy

Newspapers are currently going through a fundamental restructuring where content is being decoupled from presentation and navigation. Journalists will either need to relabel themselves as essentially independent content producers, or move into the media side of the news service.

News sites will package and deliver content from arbitrary sources that will link away from their sites. As such they will be proxies to content repositories of the web. News sites will have to adjust to the Google model of linking people away from their site in order to get them to visit again.

I wonder if Twitter will end up performing a similar proxy function. The Twitter layer will consist of many personal recommendations (links) to data and content, sent to you personally by people that you trust, respect or share personal interests with. Can any site or search service provide you with comparable navigational assurance?

Twitter has the potential to decouple the web into a personal 2-tier system: a recommendation/link layer pointing to a content layer.

Twitter will be providing the long-lat coordinates into the datascape. Beam down, beam up, next tweet.

Navigation and search are just for people who don't have any friends or colleagues.


Scoble's Law of Twitter

Disconnect from Twitter when you are receiving more than one tweet per second.

Over the weekend I have been thinking about how much web (2.0) information we can reasonably absorb, and what hard limits can be used to frame the question.

With respect to Twitter, if we assume that a tweet takes 20 seconds on average to read and follow up, how many can be processed by an individual in a day? That's 3 per minute and 180 per hour, which we can round up to an even 200. A hardish limit for daily Twitter processing would then be 4,800 tweets, assuming no sleep.

We can try for a harder (higher) limit by making more optimistic (not realistic) assumptions. If we assume that a tweet takes only 1 second to handle, this yields 86,400 tweets per day, again assuming no sleep.
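These back-of-the-envelope limits are simple to reproduce; here is a minimal sketch using the handling times assumed above (20 seconds and 1 second per tweet):

```python
# Back-of-the-envelope daily tweet-processing limits, assuming no sleep.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def daily_tweet_limit(seconds_per_tweet):
    """Tweets a single person could process per day at the given handling time."""
    return SECONDS_PER_DAY // seconds_per_tweet

print(daily_tweet_limit(20))  # 4,320 (rounded up to 4,800 in the text)
print(daily_tweet_limit(1))   # 86,400
```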

I didn't imagine that anyone was actually receiving (let alone processing) tweets at this rate. But in fact, Robert Scoble recently announced that he would have to cut back on Twitter, given that he was receiving one tweet per second from a follower base of over 40,000. Apparently Scoble was spending at least 7 hours every day monitoring Twitter and FriendFeed - the equivalent of a full-time (unpaid) job.

This leads to Scoble's Law for Twitter as stated in quotes above.

Robert Scoble, Twitter martyr, died 2 AT (Anno Twitterum).

Jeremiah Owyang may be next.



Saturday, January 17, 2009

Google's Storm in a Teacup


The UK Times recently ran a story on the environmental impact of Google searches. The article asserted that two Google desktop searches (at 7g of CO2 each) have about the same carbon footprint as boiling a kettle for a cup of tea (at 15g of CO2). Following the link to the original story now shows that the article has been clarified - I would like to say corrected, after a flurry of criticism, but it's not that simple.

Google quickly replied that their estimated cost of a search query is 0.2g of CO2, considerably less than the 7g given by the Times. Google also states that driving an average car for one kilometre produces as many greenhouse gases as a thousand Google searches, under current EU standards for tailpipe emissions. The Times later clarified that by "search" it meant an activity lasting several minutes, and not simply a single Google query. Other people have also produced CO2 estimates for search in the 1g - 10g range, but upon inspection these estimates likewise treat search as an activity that the user performs for 10 to 15 minutes, rather than as a service provided by Google. Your computer has a footprint of 40g - 80g per hour simply from being turned on and surfing around a bit.

Gartner has remarked that in 2007, for the first time, the greenhouse gases produced to power the Internet surpassed the total emissions of the global airline industry. This is quite ominous since we normally don't think of airlines as being particularly clean or the Internet as being particularly dirty. But it's all about power consumption from burning fossil fuels. Power is mainly used by personal computers, the network and data centres. The largest power consumer is personal devices, and there is some debate over the order of second and third places. My guess is the network.

Part of the controversy over the original story was that the 7g search cost was attributed to the young Harvard physicist Alex Wissner-Gross. The clarification to the Times article now includes a link to a new article by Wissner-Gross where he gives some opinions on CO2 emissions, but few actual details. Apparently his main work on Internet CO2 consumption is being reviewed by academic referees before formal publication, so we must wait for his detailed analysis. Oddly enough, Wissner-Gross, who took umbrage at being cited as the source of the 7g Google estimate, now openly states that the correct figure is 5g - 10g.

The original statements of the Times, and perhaps even those of Wissner-Gross, can be traced back to the work of Rolf Kersten, who presented a talk called Your CO2 Footprint when using the Internet at a German Ebay conference in 2007. His findings are summarized in the table below.

  • One Google search: 6.8 g
  • One eBay auction: 55 g
  • One blog post on blogs.sun.com: 850 g
  • A SecondLife avatar, "alive" 24 hours a day for one year: 332 kg

Yes, that is kilograms for the SecondLife avatar! Mr. Kersten has since revised his estimates in light of recent events and technology improvements, and states that he originally overestimated by a factor of 35:

I was wrong. Very wrong. Wrong by a factor of 35. Wrong even when you take into account that Moore's Law and Google engineers had 20 months to increase efficiency since my first guesstimate.

So now we have it: One Google Search produces as much CO2 as 10 seconds of breathing!

You can review the details of the calculation in a later article. Mr. Kersten is now in agreement with the 0.2g figure given by Google.

So in the end it takes just over 70 Google queries to consume the same energy as boiling a kettle.
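As a sanity check on these ratios, here is a minimal sketch using the figures quoted above (0.2 g per search from Google, 15 g per boiled kettle from the Times, and the roughly 200 g per car-kilometre implied by Google's thousand-searches-per-kilometre claim):

```python
# Rough CO2 comparisons using the figures quoted in this post (grams of CO2).
GOOGLE_SEARCH_G = 0.2                # Google's estimate per search query
KETTLE_BOIL_G = 15.0                 # Times figure for boiling a kettle
CAR_KM_G = 1000 * GOOGLE_SEARCH_G    # implied by "1,000 searches per car-kilometre"

print(KETTLE_BOIL_G / GOOGLE_SEARCH_G)  # 75.0 searches per kettle boil
print(CAR_KM_G / GOOGLE_SEARCH_G)       # 1000.0 searches per car-kilometre
```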

Friday, January 16, 2009

Downadup's Password Cracking List

This week it was reported that the Downadup worm (also known as Conficker) has infected 3.5 million Windows machines, according to data gathered by the security company F-Secure. One of the ways the worm tries to propagate is by guessing account passwords on the victim machine.

F-Secure has a write-up on the worm which includes the list of passwords that it checks (reproduced below). The list of just over 180 password candidates contains the usual suspects - the username for the account, repeated digits, qwerty, admin, password, and pass1, pass12, pass123. Given that the worm has successfully infected such a large number of machines, this password guessing strategy must be quite effective. So weak passwords are still letting us down. A simple defensive check against the list is sketched after it.

(Added April 2nd, 2009: you can see a nice graphic of this password list at Graham Cluley's blog).
  • [username]
  • [username][username]
  • [reverse_of_username]
  • 00000
  • 0000000
  • 00000000
  • 0987654321
  • 11111
  • 111111
  • 1111111
  • 11111111
  • 123123
  • 12321
  • 123321
  • 12345
  • 123456
  • 1234567
  • 12345678
  • 123456789
  • 1234567890
  • 1234abcd
  • 1234qwer
  • 123abc
  • 123asd
  • 123qwe
  • 1q2w3e
  • 22222
  • 222222
  • 2222222
  • 22222222
  • 33333
  • 333333
  • 3333333
  • 33333333
  • 44444
  • 444444
  • 4444444
  • 44444444
  • 54321
  • 55555
  • 555555
  • 5555555
  • 55555555
  • 654321
  • 66666
  • 666666
  • 6666666
  • 66666666
  • 7654321
  • 77777
  • 777777
  • 7777777
  • 77777777
  • 87654321
  • 88888
  • 888888
  • 8888888
  • 88888888
  • 987654321
  • 99999
  • 999999
  • 9999999
  • 99999999
  • a1b2c3
  • aaaaa
  • abc123
  • academia
  • access
  • account
  • Admin
  • admin
  • admin1
  • admin12
  • admin123
  • adminadmin
  • administrator
  • anything
  • asddsa
  • asdfgh
  • asdsa
  • asdzxc
  • backup
  • boss123
  • business
  • campus
  • changeme
  • cluster
  • codename
  • codeword
  • coffee
  • computer
  • controller
  • cookie
  • customer
  • database
  • default
  • desktop
  • domain
  • example
  • exchange
  • explorer
  • files
  • foobar
  • foofoo
  • forever
  • freedom
  • games
  • home123
  • ihavenopass
  • Internet
  • internet
  • intranet
  • killer
  • letitbe
  • letmein
  • Login
  • login
  • lotus
  • love123
  • manager
  • market
  • money
  • monitor
  • mypass
  • mypassword
  • mypc123
  • nimda
  • nobody
  • nopass
  • nopassword
  • nothing
  • office
  • oracle
  • owner
  • pass1
  • pass12
  • pass123
  • passwd
  • Password
  • password
  • password1
  • password12
  • password123
  • private
  • public
  • pw123
  • q1w2e3
  • qazwsx
  • qazwsxedc
  • qqqqq
  • qwe123
  • qweasd
  • qweasdzxc
  • qweewq
  • qwerty
  • qwewq
  • root123
  • rootroot
  • sample
  • secret
  • secure
  • security
  • server
  • shadow
  • share
  • student
  • super
  • superuser
  • supervisor
  • system
  • temp123
  • temporary
  • temptemp
  • test123
  • testtest
  • unknown
  • windows
  • work123
  • xxxxx
  • zxccxz
  • zxcvb
  • zxcvbn
  • zxcxz
  • zzzzz
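As a defensive illustration (my own sketch, not code from the worm or from F-Secure's write-up), a password policy could simply reject anything on this list, along with the username-derived variants the worm also tries:

```python
# Sketch: reject passwords that would be guessed by Downadup's list.
# DOWNADUP_LIST should contain the fixed entries reproduced above;
# only a handful are shown here for brevity.
DOWNADUP_LIST = {
    "123456", "password", "qwerty", "admin", "letmein", "pass123",
    # ... remaining entries from the list above
}

def is_weak(password, username):
    """Return True if the password is on the list or is derived from the username."""
    derived = {username, username * 2, username[::-1]}
    return password in DOWNADUP_LIST or password in derived

print(is_weak("pass123", "alice"))                       # True: on the list
print(is_weak("ecila", "alice"))                         # True: reverse of the username
print(is_weak("correct horse battery staple", "alice"))  # False
```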

Wednesday, January 14, 2009

How to become a Famous Blogger

Made me laugh.

Tuesday, January 13, 2009

Algebraic Group Visualization

There are some quite interesting visualizations of groups at a blog called Alice and Bob in Cryptoland. The author has coloured the entries of the group operation table for several mappings, which gives some idea of the inherent structure of each mapping.

The diagram below represents multiplication modulo 509. Each number 1, 2, ..., 507, 508 is assigned a colour. The multiplication table is then formed as a 509 x 509 table whose entry in the i-th row and j-th column is (i*j) mod 509, with each value replaced by its assigned colour. A definite quadrant structure is present.
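For readers who want to experiment, here is a minimal reconstruction of this first plot (my own sketch, not the original author's code; it assumes numpy and matplotlib and lets a colormap assign the colours):

```python
# Sketch: colour-coded multiplication table modulo 509 (a reconstruction,
# not the original author's code). Requires numpy and matplotlib.
import numpy as np
import matplotlib.pyplot as plt

P = 509
# Entry (i, j) of the table is (i * j) mod P; the zero row and column are
# included here so the grid is a full P x P array.
table = np.fromfunction(lambda i, j: (i * j) % P, (P, P), dtype=np.int64)

plt.imshow(table, cmap="viridis")  # each residue is mapped to a colour
plt.title("Multiplication modulo 509")
plt.axis("off")
plt.show()
```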


A second visualisation is given for an elliptic curve group defined over GF(503). This group contains 503 elements, which are assigned colours and then represented in a 503 x 503 addition map (the elliptic curve group is written additively rather than multiplicatively, unlike the group above).


The result is a far more random-looking mapping, which you would expect since elliptic curves exhibit less inherent structure than numeric groups. You can obtain the code for these graphs from the original post.

Monday, January 12, 2009

Social Media Data in Flight and at Rest

Adam Singer over at TheFutureBuzz blog has published 49 statistics on social media usage - you would think he might have striven to find one more to round it out to 50! He remarks that

As our digital and physical lives blur further, the Internet has become the information hub where people spend a majority of their time learning, playing and communicating with others globally.

Sometimes it is easy to lose sight of just how staggering the numbers are of people collaborating, researching, and interacting on the web.

Some of the notable statistics include

  • 1,000,000,000,000 (one trillion) - approximate number of unique URLs in Google’s index
  • 2,000,000,000 (two billion) - very rough number of Google searches daily
  • 684,000,000 - the number of visitors to Wikipedia in the last year
  • 112,486,327 - number of views the most viewed video on YouTube has (January, 2009)
  • 133,000,000 - number of blogs indexed by Technorati since 2002
  • 1,111,991,000 - number of Tweets to date (see an up to the minute count here)
  • 150,000,000 - number of active Facebook users
  • 700,000,000 - number of photos added to Facebook monthly

My personal favourite is that it would take someone just over 412 years to view all the content that was available on YouTube in March 2008. By now, you can probably add a few more centuries.

Some books on Scribd

As I mentioned in my last post, there is a lot of very interesting and detailed content of all types being uploaded to Scribd. According to Wikipedia,

Scribd is a document sharing website. It houses 'more than 2 million documents' and 'drew more than 21 million unique visitors in May 2008, little more than a year after launching, and claims 1.5 million registered users.' The site was initially funded with $12,000 funding from Y Combinator, but has since received over $3.7 million from Redpoint Ventures and The Kinsey Hills Group.

You can even find whole books on the site. Here are some interesting documents that I found from an hour or so of searching

I think I will drop my Safari account, as I now have enough reading for far longer than the foreseeable future. I also uploaded a paper that I co-wrote on Data Centric Security:
A Data Centric Security Model


Friday, January 9, 2009

Publishing on Scribd

Scribd is a great publishing site for PDFs and PPTs on a wide variety of topics. I had not visited the site for a while, but it seems that the Scribd document repository has reached critical mass, and practically anything can be found there now - in fact, perhaps too much. I have started to put a few documents there myself, and will later post a few of the interesting links I have found.

Shamir's Third Law and other Tales from the Crypt