Google’s Drive To Digital Omnipotence

January 11, 2006: Google made waves last year with its controversial print library project to scan the collections of five major US libraries and make them freely available online. Now, just as the din from that controversy is dying down, Microsoft has jumped into the fray by joining the Open Content Alliance (OCA).

Microsoft’s jump has made much less of a splash, however, as the OCA is taking a slightly different approach to library digitisation. The OCA, like Google, plans to build a digital archive of global content with free, universal access. Unlike Google, though, the alliance, whose members include Yahoo!, Adobe and the Internet Archive, focuses on books published before 1923 to avoid copyright complications.

Whilst the OCA is having less visible trouble with copyright issues, both Google’s and the OCA’s plans raise interesting questions regarding data storage and document management. The storage and retrieval implications of both plans are huge. The means to store the data exist, but how exactly do you scan an entire library’s worth of books, index them, and make them rapidly and efficiently searchable?

For example, the Internet Archive’s servers are reportedly capable of storing petabytes of information, with each petabyte holding approximately 100 million pages.

“For quite a few years we’ve had the technology to store all the information we need to store. What is really hampering companies is the ability to make sense of and manage all the information they have,” says Clive Gold, Product Marketing at EMC Australia. “If you have a terabyte of data, for example, how do you sensibly use it to find what you want?”


In an interview preceding his keynote address at CES in Las Vegas, Bill Gates claimed that IBM is Microsoft’s biggest rival. This is rather perplexing given that IBM concentrates on hardware and Microsoft on software. As Gates points out, IBM employs a lot more people than Microsoft, but there is little danger that IBM will move into the operating system business again (remember OS/2?) and knock Microsoft out of the game.

Surely Google is more of a danger? With its momentous energy and empire-building bank balance, Google has been touted by many in the media as the David to Microsoft’s Goliath. Since its stock went public in August 2004, for example, its market capitalisation has reached roughly $123 billion, almost the same as IBM’s market value. And if Steve Ballmer’s reported vitriolic vow that he is going to “F***ing kill Google” is anything to go by, then perhaps Microsoft is trying a new tactic of playing down the upstart.

Microsoft’s move into library digitisation looks to be a signal that it is positioning itself to compete directly with Google in the future. Considering the storage requirements for the project, perhaps it wouldn’t be such a bad idea to pair up with its so-called rival and storage server behemoth IBM.
