The end of archives

In the dead days between Christmas and New Year, when news can only report on its absence, an announcement was made by the Library of Congress which has a major implication for archives and public memory. The Library will no longer attempt to archive Twitter comprehensively.

In April 2010 the Library of Congress and Twitter signed an agreement by which Twitter agreed to make available the text only of all public tweets from 2006 to that date, and to supply such texts to the Library thereafter on an ongoing basis.

On the face of it, this seemed an opportune and necessary initiative. Twitter had become a mechanism by which the world conversed with itself, transcending borders, distributing news and opinion at extraordinary speed and scale, and providing a voice for both the prominent and the obscure amongst us on any issue of the day. This was the archive of our times, and the Library of Congress, as an American institution, was the natural home for the records created by Twitter, an American company.

In practice, things were rather more complicated. Firstly, there were considerable access issues involved. Contrary to what many of its contributors may believe, Twitter is the owner of the data that we add to it. Nevertheless, this does not make the issue of what can be done with the data any easier, since to republish it – as text, or analysed through data mining – could be contrary to the expectations, or in some cases the legal rights of those contributors. The Library needed to respect those rights, and the intractability of the issue was one reason why the Twitter archive was not made available.

The other reason was the sheer size of the archive and the complexities involved in making it accessible retrospectively, as opposed to a live stream. It is this latter issue that has led to the recent announcement. Because Twitter is no longer just about text (many tweets now have a visual, or even video, element), because the number of characters allowed per tweet has recently risen from 140 to 280, and because the plain number of tweets has risen so hugely, the Library calculated that it could no longer cope. In its statement it says:

As the twelfth year of Twitter draws to a close, the Library has decided to change its collection strategy for receipt of tweets on December 31, 2017. After this time, the Library will continue to acquire tweets but will do so on a very selective basis under the overall guidance provided in the Library’s Collections Policy Statements and associated documents. Generally, the tweets collected and archived will be thematic and event-based, including events such as elections, or themes of ongoing national interest, e.g. public policy.

In its arguments in support of this decision, the Library stresses that “generally [it] does not collect comprehensively.” It had made a special exception for Twitter, but with the service, and social media generally, “now established”, it was bringing its collecting practice “more in line with its collection policies”. From an archiving point of view, normal service was resuming.

The archiving of Twitter is a logical impossibility. There is no single Twitter out there that might be consulted equally by any of us. There are over 300 million Twitters in existence. Each person signed up to the service selects who they will follow and what topics interest them. No one person sees the same Twitter as the next. It is universal and absolutely personal at the same time, which is the key to its particular power. No archive can replicate this, because it must convert the subjective into the objective.

Of course, the Twitter archive still represents a vast textual archive, which can be analysed by theme and date. It records that which the individual participants will have said. But it will have lost the world of thought that underpins Twitter – its private status as opposed to its public appearance. Twitter, our understanding of it, exists only in its liveness, and when we look at it.

Back in 1995, at the dawn of our networked age, Jacques Derrida wrote an analysis on the nature of archiving, Archive Fever. It is a dense study on the nature of what it is that we do and do not archive of ourselves, and why, and one of its most interesting elements is its discussion of e-mail. In 1995 few knew of the Internet, and social media did not exist, but from e-mail Derrida deduces much of how the new media would change what we are, or what we may leave behind of what we are:

[T]he example of E-mail is privileged in my opinion for a more important and obvious reason: because electronic mail today, even more than the fax, is on the way to transforming the entire public and private space of humanity, and first of all the limit between the private, the secret (private or public), and the public or the phenomenal. It is not only a technique, in the ordinary and limited sense of the term: at an unprecedented rhythm, in quasi-instantaneous fashion, this instrumental possibility of production, of printing, of conservation, and of destruction of the archive must inevitably be accompanied by juridical and thus political transformations. These affect nothing less than property rights, publishing and reproduction rights.

These latter points are now taxing the Library of Congress, but Derrida goes on to argue how the nature of archiving determines that which is archived:

[T]he technical structure of the archiving archive also determines the structure of the archivable content even in its very coming into existence and in its relationship to the future. The archivization produces as much as it records the event. This is also our political experience of the so-called news media.

Some of this is obvious enough – the process changes that which is being processed. But this is not just about the effects of institutionalisation. The suggestion that ‘archivization’ produces as much as it records reads like a foretelling of big data analyses, in which more may be derived from the archive than that which it ostensibly recorded. This is the archive as mind, whose logic – as with Twitter – lies in its revealing liveness.

The archive lets us see what the world has said about itself in a particular way only because it has been archived. This is true for all archives at all times in their history. What was different for Derrida about e-mail was its speed, the transformative promise (and threat) of its instantaneous nature.

Since Derrida wrote his book, archiving has indeed transformed, not simply in its method but in how it is controlled. In her fine book on the changing nature of archiving, When We Are No More: How Digital Memory is Shaping Our Future, Abby Smith Rumsey (formerly of the Library of Congress) writes:

In market economies, commercialization of “free” communication channels such as Facebook and Twitter sparks debate about a host of economic, political, and social challenges. Overlooked, however, are the potentially serious long-term implications for memory, both individual and collective. The long-term future of collective memory is not the business of commercial companies. By necessity they have a short time horizon and we cannot expect them to invest adequately in preserving their information assets for the benefit of future generations when these assets no longer produce enough income to pay for their own care and feeding in data archives. The problem for collective memory is not commerce’s narrow focus on quarterly returns, deleterious as that is for any long-term planning. It is that commercial companies come and go. When they are gone, so, too are all their information assets. Unlike institutions established to serve the public trust, commercial companies have no responsibilities to future generations. The simple solution for preserving commercially owned digital content is for companies to arrange for handoffs of their significant knowledge assets to public institutions.

She then gives the example of the donation by Twitter of its archive to the Library of Congress.

What is questionable, though, is the assumption that our present archival institutions are the right bodies for preserving our digital world. Rumsey’s arguments about the short life of commercial institutions, and their indifference to knowledge assets once they cease to be assets, are borne out by archival history (consider the history of film archives, for example, founded out of necessity when film studios abandoned their commitment to older assets in the switch from silent to sound film). But the rules may be changing. Can we seriously see Google or Facebook disappearing? The life of some nation states – usually the defining hosts of the sort of public institutions Rumsey champions i.e. national libraries and archives – would seem to be less long-term. How long will Spain stay together, or Belgium, or the United Kingdom?

More than this, the media giants, such as Google, Facebook or Twitter, transcend the national. They operate globally, think beyond boundaries, and continually challenge systems (such as copyright) which were conceived out of national thinking. They are remaking the world in their image, and traditional institutions are not simply struggling to keep up technically but in meaning as well.

The Library of Congress, via Wikipedia

There is an argument for separating the archiving of the physical and the digital. Our existing archives, libraries and museums look after the physical very well. They manage the space and can measure likely growth, they understand the optimum conditions needed for long-term preservation, they have the objects in their sights. But the digital is growing beyond them. No one is archiving Facebook – how could they? I’ve written before about the vastness of the YouTube archive, how it dwarfs the pretensions of our conventional moving image collections. Who else could archive YouTube except its owner, Google? The unique nature of such supranational, networked, ever-growing digital resources demands that those who maintain them have the responsibility for sustaining them.

We could imagine an arrangement whereby national archives give up the impossible task and governments enjoin the giant online corporations to ensure the preservation of their ‘archive’, and access to this, in return, say, for legislative concessions of some kind – copyright-related, probably. Facebook, Google, Twitter et al would become what they already are, in effect, the archive of themselves at the same time as they remain the living entity. It is the separation of producer from custodian that no longer works.

Of course there is so much that is wrong with this as a line of argument. Commercial companies come and go, even if some may last for a far longer time than we may imagine, because they govern our lives – directly, or through their data. Twitter itself may not last long – despite its high profile it is in a poor state financially, not yet turning a profit. And there is the absence of the public trust which is enshrined in traditional archiving. Abby Smith Rumsey writes:

Today, most of our personal digital memory is not under our control. Whether it is personal data on a commercially owned social media site, e-mails that we send through a commercial service provider, our shopping behaviors, our music libraries, our photo streams, even the documents on our hard drives written in Word or Pages – they will be inaccessible to us, unreadable in only a few years … We view our Facebook pages and LinkedIn profiles as intimate parts of ourselves and our identities, but they are also corporate assets.

None of these companies is to be trusted, for a second. They exist for themselves, not for us. In their thinking, we exist for them. Public memory should be more than this.

But this is a reality already. The Library of Congress cannot cope with the archiving of Twitter any more, as the latter grows exponentially. How very much larger will Twitter be in five, ten years’ time? Or a hundred years? YouTube is growing at the rate of 400 hours per minute. It is beyond the capacity of any traditional archive as things stand today. Just imagine where it will be decades from now.

These new archives are not going to disappear. Too much has been invested in them, in infrastructure, and of ourselves. As published objects are increasingly rented to us rather than sold (the Netflix model), so the traditional understanding of ownership withers, and with it the public archive as representing the ownership of all of us. Of course, the older material will increasingly decline in interest, and consequently in commercial value (you can’t sell any products off the Facebook page of a deceased person, and how many videos from YouTube’s inaugural year of 2005 are still being viewed avidly?). But by the time that problem becomes acute, the only bodies capable of tackling it will be the owners. They will come to governments and strike a deal – you help us, and we will be your archival servants.

Perhaps Derrida foresaw this when he wrote about ‘archive fever’ (the urge to control through the act of archiving that he compares to Freud’s notion of the death drive):

Above all, and this is the most serious, beyond or within this simple limit called finiteness or finitude, there is no archive fever without the threat of this death drive, this aggression and destruction drive. This threat is in-finite, it sweeps away the logic of finitude and the simple factual limits, the transcendental aesthetics, one might say, the spatio-temporal conditions of conservation … There is not one archive fever, one limit or one suffering of memory among others: enlisting the in-finite, archive fever verges on radical evil.

I don’t entirely understand what he is going on about, but I think I can see the results. Our archives have outgrown us, and are no longer our own.

Or will the libraries of the future cope because they will be built out of DNA? Jerome de Groot has produced a fascinating post for The Conversation on the potential of biology and nanotechnology to deal with the rapid escalation in data. But, as he concludes, “Wider questions arise about the ethics of collection and to what extent these processes will become mainstream. Print, and to a certain extent digital, have become common and reasonably democratic ways of transmitting and storing information. It remains to be seen whether future storage and writing will be as easy to access, and who will be in control of humanity’s information and memory in the coming decades and centuries.”

