There are 129,864,880 books in the world, Google says

6 Aug 2010

Search giant Google has set out to count how many books there are in the world and, as of today, estimates the total at 129,864,880.

“Standing on the shoulders of giants (libraries and cataloguing organisations) and based on our computational resources and experience of organising millions of books through our Books Library Project and Books Partner Program since 2004, we’ve determined that number,” software engineer Leonid Taycher said on the Google Books blog.

“As of today, we estimate that there are 129,864,880 different books in the world. That’s a lot of knowledge captured in the written word! This calculation used an algorithm that combines books information from multiple sources including libraries, WorldCat, national union catalogues and commercial providers. And the actual number of books is always increasing.”

Taycher said that because ISBNs were only introduced in the 1960s, Google had to look beyond them to work out how many books have been written.

“We collect metadata from many providers (more than 150 and counting) that include libraries, WorldCat, national union catalogues and commercial providers. At the moment we have close to a billion unique raw records. We then further analyse these records to reduce the level of duplication within each provider, bringing us down to close to 600 million records.

“Does this mean that there are 600 million unique books in the world? Hardly. There is still a lot of duplication within a single provider (eg, libraries holding multiple distinct copies of a book) and among providers – for example, we have 96 records from 46 providers for Programming Perl, 3rd Edition. Twice every week we group all those records into ‘tome’ clusters, taking into account nearly all attributes of each record.

“When evaluating record similarity, not all attributes are created equal. For example, when two records contain the same ISBN this is a very strong (but not absolute) signal that they describe the same book, but if they contain different ISBNs, then they definitely describe different books,” Taycher said.
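To give a rough sense of what that kind of grouping could look like, here is a minimal sketch in Python. It is not Google's algorithm, which weighs nearly all attributes of each record. This toy version assumes just two signals, an ISBN and a normalised title, and treats a shared ISBN as a match even though Taycher notes the signal is not absolute. The `Record` class, the helper names, the provider names and the placeholder ISBNs are all made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Record:
    """A hypothetical catalogue record; real records carry many more attributes."""
    provider: str
    title: str
    isbn: Optional[str] = None


def normalise_title(title: str) -> str:
    # Lower-case and collapse whitespace so trivial differences do not matter.
    return " ".join(title.lower().split())


def same_book(a: Record, b: Record) -> bool:
    # When both records carry an ISBN, the ISBN decides:
    # different ISBNs mean different books, and a shared ISBN is
    # treated as a match here (a simplification of "strong but not absolute").
    if a.isbn and b.isbn:
        return a.isbn == b.isbn
    # Otherwise fall back to the weaker title signal (an assumption of this sketch).
    return normalise_title(a.title) == normalise_title(b.title)


def cluster(records: List[Record]) -> List[List[Record]]:
    clusters: List[List[Record]] = []
    for rec in records:
        for group in clusters:
            # Join a cluster only if the record is compatible with every member,
            # so a record without an ISBN cannot bridge two different ISBNs.
            if all(same_book(rec, other) for other in group):
                group.append(rec)
                break
        else:
            clusters.append([rec])
    return clusters


# Placeholder data: the ISBNs below are not real identifiers.
records = [
    Record("library-a", "Programming Perl, 3rd Edition", "0000000001"),
    Record("library-b", "programming  perl, 3rd edition", "0000000001"),
    Record("vendor-c", "Programming Perl, 3rd Edition"),
    Record("library-a", "Programming Perl, 3rd Edition", "0000000002"),
    Record("vendor-c", "Some Other Book"),
]

clusters = cluster(records)
for group in clusters:
    print(len(group), "record(s):", group[0].title)
```

In this sketch the first three records collapse into one cluster, while the record with a different placeholder ISBN stays separate despite its identical title.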

John Kennedy is a journalist who served as editor of Silicon Republic for 17 years.