The disappearing archive


It’s well known how vast YouTube is, and the rate at which it is growing. Recent figures suggest that 400 hours of video are added to the site every minute (back in 2013 it was a mere 100 hours per minute), and that it is serving some six billion video views per day. It is very difficult, however, to find out just how many videos have been uploaded onto YouTube – Google does not publish any such figure, and estimates (calculated by number of hours uploaded and average duration of videos) vary wildly. I reckon it’s 2.7 billion (see calculations at the end of this post).

One reason the total figure is kept quiet is the number of videos that get taken down from the site. Again, no figures are supplied for this, but my own experience with one small site which embeds videos from YouTube (and Vimeo) gives an idea of the possible proportions. BardBox is a site I set up in 2008 to curate original Shakespeare videos that could be found online, chiefly on YouTube. The site was in a blog form, with one video embedded per post, plus some commentary from me. I kept the site going until 2012, when I decided not to add anything more to it. But I kept it active, and became increasingly concerned at the number of videos on the site that were taken down, or moved to private accounts. Some had infringed copyright, some the owners were perhaps embarrassed by, or no longer saw as useful to them, or just wanted to keep to themselves. So recently I did a bit of a spring-clean on the site and found that of the 139 videos on the site, 34 had been removed in one form or another. That’s roughly a quarter.

Now, a set of just 139 videos might not be enough for a sensible calculation to be made, but the Shakespeare videos I collected were, I think, a reasonably representative collection of the different kinds of video production to be found on YouTube. They included professional product, trailers, animations, mashups (with contentious re-use of video and music which could make them prone to takedown), fan videos (ditto), clips (ditto), personal videos, school projects, and so on. It was a varied mix, intended to show how Shakespeare had been appropriated across the many different kinds of video format to be found online. And now about 25 percent of it has gone.
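To give a feel for how much weight a sample of 139 can bear, here is a minimal Python sketch that puts a rough margin of error around the 34-out-of-139 figure using a Wilson score interval. The function name `wilson_interval` is simply a label chosen for this illustration, and the calculation assumes the BardBox videos behave like a random sample of YouTube, which is an assumption rather than something established here.

```python
import math


def wilson_interval(removed, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    p = removed / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    )
    return centre - half_width, centre + half_width


# Figures from the BardBox spring-clean: 34 of 139 videos gone.
removed, total = 34, 139
share = removed / total
low, high = wilson_interval(removed, total)

print(f"Removed: {share:.1%}")
print(f"Plausible range if the sample were random: {low:.1%} to {high:.1%}")
```

On these numbers the share removed is a little under 25 percent, and the interval runs from somewhere under a fifth to somewhere under a third, so "a quarter" is best read as a rough indication and not a precise rate.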


I’ve often referred to YouTube as an archive, not least as a way of challenging conventional film archives to think about their choice of content, their popular reach, and their relevance. But is it any such thing? It is an archive insofar as it holds on to and preserves cultural content – so far as is known, it holds onto everything that is uploaded to it, even if the files are subsequently withdrawn, either by the user or by YouTube/Google itself. It provides access to that content, as an archive should strive to do. But what sort of archive withdraws its holdings from access after they have been made publicly available?

Of course, this isn’t YouTube’s fault. If the owner of a video wants to withdraw their video from circulation, or if a video infringes copyright or breaches some other kind of law, then YouTube obliges. It is not in complete control of its holdings. Few archives are, of course, and blocks to access do get imposed, where a donor may demand a halt to access (e.g. because of a dispute with the archive, or because a loan period has come to an end), or access may need to be withheld because of possible libel issues. But it’s the scale of things that makes YouTube’s case a special one. It has given us the false sense that it has everything, and that everything will always be available. It has become all too common in film studies and the like for knowledge and access to be determined by what exists on YouTube. If a video doesn’t exist there, then it doesn’t really exist at all and becomes unworthy of consideration. YouTube has made us greedy, and myopic.

It might be more accurate to think of YouTube as a gallery. Galleries cannot show everything (though in their case this is for reasons of exhibition space, not donor whim or legal infringement). The full collection lurks in store somewhere, and some of it may not ever see the light again. The gallery is committed to showing a selection. We could not cope with the gallery that always showed us everything.

Whatever we call YouTube, the fact is that it has created a mechanism for content to appear, then disappear. Web archives often struggle to manage rich media content, and no web archive could manage something on the scale of YouTube in any case (the Internet Archive captures regular snapshots of YouTube’s front page, but the videos themselves are not there). YouTube exemplifies an impatient world which wants that which is immediate and forgets that which is not. It’s an inevitable outcome of the behemothic scale of YouTube, and its contributor model, but I know from just the small case of my Shakespeare site that much original and worthwhile content has disappeared. And it is important to think of what has disappeared, just as with silent films where we consider lost films as well as those that survive, in order that we understand fully the history.

YouTube’s not alone in this, of course. Our television and radio broadcasters pump out millions of hours of content most of which is soon removed from publicly-accessible platforms, and only a small proportion of which is retained in public archives. But we’re used to this, we know the deal, frustrating as it may be. YouTube promised something different, the archive to shame all other archives by its determined belief in untrammeled access. Instead it’s given us the disappearing archive, and won’t tell us what it is that has disappeared.

So, how many videos have been uploaded to YouTube? Well, it was founded in 2005 and in 2006 it reported that it was receiving 65,000 videos per day, or 45 per minute. By 2015 the upload rate had reached 400 hours per minute. Working from the incremental reported annual growth figures, that adds up to over 10.8 million hours uploaded to YouTube all told. The average length of a YouTube video is said to be 4 minutes 12 seconds. So that would come to roughly 2.7 billion videos. Do we dare suppose that 25% of that, equalling 675 million videos, is now lost?
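The sums above can be set out as a short Python sketch, so that the units are explicit. The constant and function names are invented for this illustration. One thing the sketch suggests is that the quoted figures do not quite reconcile: 10.8 million hours at 4 minutes 12 seconds per video comes to roughly 154 million videos, whereas 2.7 billion videos would imply something nearer 189 million hours (about 11 billion minutes). The upload total may therefore be a slip for minutes, but that reading is an assumption and not a confirmed figure.

```python
# Figures quoted in the post.
AVG_VIDEO_MINUTES = 4 + 12 / 60       # 4 min 12 s
VIDEOS_PER_DAY_2006 = 65_000
QUOTED_TOTAL_HOURS = 10.8e6           # "over 10.8 million hours"
ESTIMATED_VIDEOS = 2.7e9              # "roughly 2.7 billion videos"
LOSS_SHARE = 0.25                     # the BardBox takedown proportion


def videos_from_hours(hours, avg_minutes=AVG_VIDEO_MINUTES):
    """Number of videos implied by a total running time in hours."""
    return hours * 60 / avg_minutes


# 65,000 videos a day expressed per minute (about 45).
videos_per_minute_2006 = VIDEOS_PER_DAY_2006 / (24 * 60)

# A quarter of the estimated total.
lost_videos = ESTIMATED_VIDEOS * LOSS_SHARE

# Cross-checks between the quoted hours and the quoted video count.
videos_from_quoted_hours = videos_from_hours(QUOTED_TOTAL_HOURS)
hours_implied_by_estimate = ESTIMATED_VIDEOS * AVG_VIDEO_MINUTES / 60

print(f"2006 rate: {videos_per_minute_2006:.0f} videos per minute")
print(f"Possibly lost: {lost_videos / 1e6:.0f} million videos")
print(f"Videos implied by quoted hours: {videos_from_quoted_hours / 1e6:.0f} million")
print(f"Hours implied by 2.7 billion videos: {hours_implied_by_estimate / 1e6:.0f} million")
```

Whichever total is right, the 25% step is the same simple multiplication, and it carries over all the uncertainty of a takedown rate measured on a single small site.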


