Down the memory hole (or how I went from man to mouse)

On Sunday I wrote: "It’s never been easier to check quotations". It’s time for an update.

While checking some of my own words on Monday, I discovered that many of my old blog posts had been attributed to Danger Mouse and Admin. A part of my online identity had been sucked down the memory hole.

While it’s easier than ever to check quotes from well-known figures like John F Kennedy, Groucho Marx or Winston Churchill, it can be surprisingly difficult to check quotes from bloggers. As John Quiggin notes, some bloggers try to fend off criticism by stealthily correcting their mistakes. And some blogs just disappear.

Online content is more ephemeral than paper and ink. So an interesting question is whether a shift away from physical texts to online texts will make checking some sources more difficult.

The trend is away from print. Newspapers like the Christian Science Monitor and the Seattle Post-Intelligencer have abandoned print and moved to online only. And while popular novels and coffee table books will continue to sell well in print, low-circulation non-fiction may gradually migrate online. The academic libraries of the future may end up with more silicon and less paper.

As more and more old books are scanned into online databases like Google Books and offered for download by retailers like Amazon, their paper and ink counterparts may begin to disappear. And this may mean we’ll be increasingly reliant on a small number of centrally managed electronic sources rather than on a large number of physical texts.

If people are able to get the books they want from their wirelessly connected laptops, book shops and libraries may find older, less popular books are more trouble than they’re worth.

Digital collections will lead to more aggressive ‘weeding’

In 2001 the University of Western Sydney admitted that it had buried 10,000 books in order to avoid the cost of storing them. But this pales in comparison with claims that the San Francisco Public Library dumped over 200,000 books in the late 1990s.

Almost all libraries ‘weed’ their collections by disposing of books that are obsolete, damaged or rarely used. And the easier and cheaper it is to download old and less often used books from the internet, the more aggressive weeding policies will become.

Eventually, using electronic texts for research may become the norm. After all, it’s far more convenient to check a few key facts from your laptop than it is to trudge across town only to find that the book you want has been stolen or misplaced.

And with demand for physical books declining, libraries will further restrict their collections. Many little-used books will be held only in a handful of university and major public libraries like the National Library of Australia. As a result, it will become increasingly difficult to check electronic versions against the original paper ones.

Missing books and altered text

So what if Google ends up with most of the digitised texts? The Electronic Frontier Foundation’s Fred von Lohmann worries about Google’s ability to delete texts from its collection. "Once a book is removed," he says, "not only won’t you be able to read it online, you won’t even be able to find it using full-text search".

According to von Lohmann, the biggest risk comes from copyright holders. Under the Google book settlement, they are able to ask Google to remove their books from Google’s electronic database. "Even more troubling", he writes, "is the possibility of selective alterations of the texts of the books themselves".

The last library?

At Language Log Geoff Nunberg worries that Google Books "is almost certainly the Last Library":

There’s no Moore’s Law for capture, and nobody is ever going to scan most of these books again. So whoever is in charge of the collection a hundred years from now — Google? UNESCO? Wal-Mart? — these are the files that scholars are going to be using then. All of which lends a particular urgency to the concerns about whether Google is doing this right.

One of Nunberg’s major complaints is that many of the texts held in Google Books are misdated. Almost everyone who uses Google Books’ advanced search has come across this problem. If the date is important, readers should always check the text itself, rather than relying on Google’s metadata.

First they filtered YouTube …

As von Lohmann points out, electronic texts are far easier to alter than those on paper. As a result, the integrity of electronic collections depends on the policies and priorities of those who manage them. Not every library is an archive designed to collect and preserve texts for the future.

For example, on the practice of weeding, Renate Beilharz of the Schools Catalogue Information Service writes:

Students deserve information that is current and up to date. A key purpose of weeding is to rid the collection of inaccurate, outdated and misleading resources. Students are encouraged to use and rely on information provided in the school resource centre. It is essential to provide information that is correct, non-racist or sexist, and that reflects modern knowledge and values.

If schools move to electronic collections it will become much easier for parents, teachers or concerned citizens to identify material they don’t approve of and insist that librarians block or filter access. A quick electronic search may uncover vast amounts of objectionable material that had been allowed to remain undisturbed on library shelves for decades.

Who can you trust?

In the case of my misattributed posts, it’s the National Library of Australia that offers a safety net. At least some of Club Troppo’s and Catallaxy’s older posts are archived in Pandora.

The move from physical to online text is a bit like the move from gold to paper money. The fundamental issue is trust. In the future, paranoid survivalists will not just have cellars full of axe handles and canned food — they will have books. Lots of books.

Note: Jacques has since restored my name to the Troppo posts.

6 Responses to Down the memory hole (or how I went from man to mouse)

  1. To the extent they have not been destroyed in Catallaxy’s various server crashes, my old posts there are attributed to Sinclair Davidson. So far this has not caused any trouble that I am aware of, though it could trip up people seeking to take a swipe at either of us on errors or alleged inconsistencies with current views.

    I have mixed feelings about alterations – if someone has made an error, better in most cases that it be corrected than worry about a pure historical record. On my blog, I make it clear where there has been a substantial correction.

  2. Like Andrew, my (preserved) posts at the Cat have been attributed to other people, but not to Sinclair (I don’t think). It seems to be mainly Heath G.

    That said, the various server crashes have meant large numbers of them have vamoosed entirely.

    Ah, ephemera, I know thee well…

  3. A lot of catallaxy posts are in PANDORA as well, and a while back I recovered another batch of them from various web caches using a tool called Warwick. From time to time I tinker with some truly hairy scripts to transform those archival versions into something that could be restored to the database, but it’s a less than pleasant problem.

  4. Don Arthur says:

    The Wayback Machine has Catallaxy from March 2004 through to October 2006.

    http://web.archive.org/web/*/http://badanalysis.com/catallaxy/

  5. Tel says:

    Ultimately those works with more generous Copyright licenses will survive, and those with more restrictive Copyright will vanish. Evolution in action.

    Thanks for posting the UWS story about the buried books, I’m quite surprised to see that. Most libraries offer their old books for sale to students and the general public at a low price.
