Via Kurzweil
AI, check out this modest proposal made at the Web
2.0 conference in San Francisco:
Universal access to all human knowledge could be had for around $260m, a
conference about the web’s future has been told.

The idea of access for all was put forward by visionary Brewster Kahle, who
suggested starting by digitally scanning all 26 million books in the US Library
of Congress.

In his speech, Mr Kahle pointed out that most books are out of print most
of the time and only a tiny proportion are available on bookshop shelves.

He estimated that the scanned images would take up about a terabyte of space
and cost about $60,000 (£33,000) to store. Instead of needing a huge
building to hold them, the entire library could fit on a single shelf.
This is a tremendous idea, and the cost of doing it is only going to go down.
The initial scanning work is the only part of the plan that’s likely to present
much of an expense. According to Moore’s
Law, that $60,000 price tag for storage should be somewhere around $2,000
eight years from now. If the estimate for the robot scanner is accurate, and
it follows a less robust drop in price — say halving once every four years
— we would be looking at a price tag of around $65 million in the same
period of time. Pretty doable, I’d say.
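The back-of-envelope projection above can be sketched in a few lines of Python. Note that the 18-month halving period for storage is my assumption for “Moore’s Law”; the post itself only gives the endpoints, and the four-year halving for the scanner is the post’s own stipulation:

```python
def future_cost(cost_today, years, halving_period_years):
    """Project a cost forward, assuming it halves every halving_period_years."""
    return cost_today / 2 ** (years / halving_period_years)

# Storage: $60,000 today, halving every ~18 months (assumed Moore's Law pace).
# Eight years out this lands a bit under $2,000.
storage_cost = future_cost(60_000, years=8, halving_period_years=1.5)

# Scanning: ~$260 million today, halving every 4 years per the post's
# "less robust" assumption. Two halvings in eight years gives $65 million.
scanning_cost = future_cost(260_000_000, years=8, halving_period_years=4)

print(f"Storage in 8 years:  ${storage_cost:,.0f}")
print(f"Scanning in 8 years: ${scanning_cost:,.0f}")
```

The scanner figure is exact ($260m / 4 = $65m); the storage figure depends entirely on the halving period chosen, which is why the post hedges it as “somewhere around $2,000.”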
Unfortunately, the legal concept of public domain is rapidly
diminishing, while copyright terms are lengthened and controls are made
more expansive. As John Bloom observed
a while back in The New Republic:
In the name of Mickey Mouse and other American icons, we have gradually lengthened
that 14-year limit on copyrights. At one time it was as much as 99 years,
then scaled back to 75 years, then — in one of the most anti-American
acts of the last century — suspended entirely in 1998. The Sonny Bono
Copyright Term Extension Act of that year says simply that there will be no
copyright expirations for 20 years, meaning that everything published between
1923 and 1943 will not be released into the public domain. Presumably they’ll
take up the matter again in 2018 and decide whether any of these books, movies,
or songs are ever set free. There are 400,000 of them.
So Kahle’s observation that few of these books are still on the shelf will
be beside the point. A scanned-in Library of Congress could conceivably serve
as a back-up to the print archive, providing an excellent disaster recovery resource,
but it would probably not be possible to distribute the whole archive, only
those parts created before 1923.
Of course, there’s hope that, when the copyright issue is reviewed again by
Congress (presumably in 2018), the public will be more aware of what’s going
on and will not stand for any more expansions of copyright controls. Failing
that, maybe we could get an exception written into copyright law. Perhaps we
could make this backup of the Library of Congress exempt from all copyright
restrictions as long as it’s used by schools and public libraries.
*By 2018, the storage for a copy of the entire Library of Congress…*