Limitations of the UK Government Web Archive

Because the archiving process stores the sites in a different format, not all content on an archived site will work or display properly.

Every web archive is a snapshot, or representation, of what was online and accessible to the crawler at the time of the crawl; it is not a full working copy of a website. This is because the underlying systems (or ‘backend’) of the website cannot be archived using the remote harvesting method. The web archive is not a ‘backup’ of a website from which the original website can be restored at a later date.

These are the main things that can’t always be fully preserved in a working state:

  • Links from archived websites to other non-government websites. For example, if you are viewing the archived version of the NHS website, the links to Facebook, YouTube, or the BBC won’t work.
  • Links inside documents (.pdf, .doc, .docx, .xls, .xlsx, .csv files) do not currently resolve within the web archive. If a user clicks on a link preserved in a document, they will be taken to that location on the live web, not to an archived copy.
  • Interactive content – any content that requires user input to proceed, for example: interactive or embedded charts/diagrams/data visualisations, tick boxes, login fields, site search, contact forms, quizzes, some interactive animations, and anything that requires use of complex JavaScript.
  • Content that can only be reached by a user logging in, for example intranets or secured areas.
  • Certain navigational features, for example drop-down menus and search.
  • Document libraries, image galleries, or similar areas with large collections of content items, which can sometimes be difficult to capture correctly.
  • Flash animations and games, and streaming media.
  • Embedded maps, such as Google Maps or OpenStreetMap.
  • Embedded social media, for example embedded videos from Vimeo or YouTube or embedded Twitter feeds.
  • Any social media platforms other than YouTube, Twitter, Flickr and Instagram.
  • POST and Ajax functionality (most often used for uploading documents or completing web forms).
  • E-commerce sites and functions.

You can read our core archiving requirements for a more comprehensive list of the limitations and our recommended alternatives/solutions if you have any of these on your website.
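
If you maintain a website, a rough first check for some of the features listed above can be scripted. The following is a minimal sketch in Python, not a tool used or endorsed by the web archive. The class name ArchiveRiskScanner, the function scan, and the host lists are hypothetical and chosen only for illustration; the checks are simple heuristics that cover a few of the items above (POST forms, login fields, Flash, embedded maps and embedded social media) and will miss others, such as complex JavaScript.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical host lists for illustration; extend them for your own site.
MAP_HOSTS = ("google.com", "openstreetmap.org")
SOCIAL_HOSTS = ("youtube.com", "vimeo.com", "twitter.com")


def _host_matches(url, hosts):
    """Return True if the URL's host is one of `hosts` or a subdomain of one."""
    host = urlparse(url).hostname or ""
    return any(host == h or host.endswith("." + h) for h in hosts)


class ArchiveRiskScanner(HTMLParser):
    """Collects a label for each element that may not be preserved in a working state."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # attribute values may be None for bare attributes
        if tag == "form":
            method = (attrs.get("method") or "get").lower()
            label = "POST form" if method == "post" else "form requiring user input"
            self.findings.append(label)
        elif tag == "input" and (attrs.get("type") or "").lower() == "password":
            self.findings.append("login field")
        elif tag in ("embed", "object"):
            src = attrs.get("src") or attrs.get("data") or ""
            if urlparse(src).path.lower().endswith(".swf"):
                self.findings.append("Flash content")
        elif tag == "iframe":
            src = attrs.get("src") or ""
            if _host_matches(src, MAP_HOSTS):
                # Heuristic only: not every iframe from these hosts is a map.
                self.findings.append("possible embedded map")
            elif _host_matches(src, SOCIAL_HOSTS):
                self.findings.append("embedded social media")


def scan(html):
    """Return the list of findings for one HTML page, in document order."""
    scanner = ArchiveRiskScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.findings
```

Calling scan(page_html) on the HTML source of a page returns a list of labels, one per matching element. An empty list does not mean the page will archive perfectly; it only means none of these particular patterns were found.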