Information on web archiving
The National Archives is preserving the online presence of UK central government by capturing its websites and social media in the UK Government Web Archive (UKGWA). The National Archives’ Operational Selection Policy OSP 27: UK Central Government Web Estate provides further detail.
On this page you can access information that may help you in using the UKGWA and the social media archive. There is an overview of the technical limitations of web archives and guidance on how to use the Memento tool to see what a website looked like at a previous point in time. You can also read our statement on the re-use of content in the UKGWA, an account of the development of the service, our takedown policy and details of how to contact us.
Using the UK Government Web Archive
- Finding Content in the UK Government Web Archive (PDF, 0.19Mb)
Users with technical skills may also find this document of interest. For example, it explains how to gain access to our two application programming interfaces (APIs), as used in this blog post.
We have made available a bookmarklet to make checking the web archive easier. To use it, drag the following button into your bookmark or favourites toolbar:
Then, to check if the web archive has captured a particular resource, simply access the URL in question and click on the bookmarklet. This will take you to the star (*) index for that resource, if it has been captured.
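The lookup that the bookmarklet performs can be sketched in Python. This is an illustration only, not the bookmarklet's actual code: it assumes that the star index for a resource lives at the archive host followed by `/*/` and the original URL, a pattern that this page does not confirm. The helper name `star_index_url` is hypothetical.

```python
# Host taken from the timegate address given elsewhere on this page.
ARCHIVE_ROOT = "http://webarchive.nationalarchives.gov.uk"


def star_index_url(resource_url: str) -> str:
    """Return the assumed star (*) index address for a live URL."""
    # Assumption: the index is addressed as <archive root>/*/<original URL>.
    return f"{ARCHIVE_ROOT}/*/{resource_url.strip()}"


if __name__ == "__main__":
    print(star_index_url("http://www.hmrc.gov.uk/"))
```

Opening the printed address in a browser should have the same effect as clicking the bookmarklet while viewing the live page, provided the assumed address pattern holds.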
You may be redirected to the UKGWA if what you're looking for has moved, been removed or has changed. We've put a red banner at the top of each archived page and [ARCHIVED CONTENT] in your browser's title bar to show you're in the web archive. This is important because information on archived websites, such as contact details, may not be current. We recommend that you use contact details found on the live web for organisations which still exist, or contact the department now responsible for the functions of an organisation which has closed.
All the sites in our collection have been archived individually. External links in archived websites are unlikely to work. In that case, you may wish to try another web archive, such as the Internet Archive, as its collection includes a wider range of websites than the UKGWA.
Using the social media archive
The National Archives has developed automated tools to efficiently capture and provide suitable access to social media content. Thousands of videos and over 65,000 tweets originally published online by UK central government organisations were captured during the pilot stages of a two-year project. This collection will continue to grow alongside our wider web archiving activities. We estimate that the capture of new video content will take place on an annual basis, and we plan to monitor the volume of tweets produced by government and archive Twitter accordingly.
The social media archive follows the principles of our approach to web archiving by preserving the open digital record in a way that keeps it accessible, retains its context and makes it available for reuse. The earliest archived content available dates from 2006 and covers some major events in our recent history, including the London 2012 Olympic Games, and gives an insight into how government is using these digital tools to communicate.
Our Twitter archiving activity has been guided by the following rules that have informed our approach to building effective technical solutions that can work at scale:
- In: Tweets made by UK central government organisations and the official London 2012 Olympic and Paralympic Games accounts are captured. Where these tweets link to web content that is included in the UKGWA, users can generally expect the link to behave almost as it would on the live web, as it will resolve in full to an archived version of that website.
- Out: Re-tweets made by these government accounts are excluded. Tweets sent from non-government accounts that form part of a conversation on Twitter but do not appear in the API for the accounts we are collecting (e.g. replies, or tweets directed at the government accounts) have not been preserved. Tweeted links to web domains that are outside the scope of our other web archiving activity (e.g. newspaper websites) will lead to a 404 or 410 error message. Users can still see the destination of the link, either in the address bar of their browser or within the error message itself, so it is possible to locate the material elsewhere.
The beta version of the video archive includes a search function that searches across the video titles, as given by the publishing department. The Twitter content does not have a search option at present but it is possible to use the JSON and XML files we have published to interrogate and analyse the information contained in the tweets.
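As an illustration of how the published JSON might be interrogated, here is a minimal Python sketch. It assumes each file holds a list of tweet objects with `text` and `created_at` fields in the style of the Twitter API; the actual layout of the published files is not described on this page and may differ. The sample records and the helper names are invented for the example.

```python
import json
from collections import Counter


def tweets_matching(tweets, keyword):
    """Return the tweets whose text contains keyword, ignoring case."""
    needle = keyword.lower()
    return [t for t in tweets if needle in t.get("text", "").lower()]


def tweets_per_year(tweets):
    """Count tweets by year, read from the last token of 'created_at'."""
    # Assumes timestamps such as "Fri Jul 27 20:00:00 +0000 2012".
    return Counter(t["created_at"].split()[-1] for t in tweets if t.get("created_at"))


# Invented sample standing in for a downloaded file (field names are assumptions).
SAMPLE = """[
  {"created_at": "Fri Jul 27 20:00:00 +0000 2012", "text": "The Opening Ceremony begins"},
  {"created_at": "Wed Aug 29 19:30:00 +0000 2012", "text": "Paralympic Games opening tonight"},
  {"created_at": "Mon Jan 07 09:00:00 +0000 2013", "text": "New guidance published"}
]"""

if __name__ == "__main__":
    tweets = json.loads(SAMPLE)
    print(len(tweets_matching(tweets, "opening")))
    print(tweets_per_year(tweets))
```

To work with a real file, replace `json.loads(SAMPLE)` with `json.load()` on the opened file and adjust the field names to match what the file actually contains.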
Technical limitations of web archives
Archived websites are often unable to offer the same functionality as original, or "live", websites. We have produced a list of the most common problems encountered by our users.
Web archiving requires continual research into both capture (retrieving and storing data) and access (presenting the archived data to our users). This research should lead to improvements in the future.
To make sure that websites are appropriately designed we also produce technical guidance for website managers.
Memento in the UK Government Web Archive
Memento is a tool which allows users to see a version of a web resource as it existed at a certain point in the past. It was originally developed by researchers at Los Alamos National Laboratory in the USA and has now been made available for use in several web archives.
To use Memento you will need either Firefox or Chrome as your browser. Install the MementoFox Add-on for Firefox or the Memento Time Travel extension for Chrome, then follow the relevant set of instructions below:
To configure MementoFox for Firefox:
- Access the Add-on's "Preferences" menu.
- Select the "Timegates" tab.
- Double-click on the line in the 'Timegate' window which reads 'New' and add http://webarchive.nationalarchives.gov.uk/timegate/.
- Click on the 'Up' button to move the text you have just added to the top of the list. Two timegate URLs should now appear in the list, with the webarchive.nationalarchives.gov.uk URL at the top.
- Click the 'Save' button in the 'Timegate' window.
To configure the Memento Time Travel extension for Chrome:
- Access the extension's "Options" menu.
- Insert http://webarchive.nationalarchives.gov.uk/timegate/ in the text box at the bottom of the page.
- Click "Update".
Memento should now be ready to use. Type the URL of the resource you want to view in the Firefox or Chrome address bar. Then either use the slider or alter the date in the date box to move back and forward through time.
For example, to view all versions of the HM Revenue and Customs website, type the URL of the website (http://www.hmrc.gov.uk/) in the address bar and then move the slider, or enter a specific date, to view the resource closest to that date in the UKGWA.
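What the browser extensions do on your behalf can be sketched in Python. Under the Memento protocol (RFC 7089), a client asks a timegate for a resource and states the desired date in an `Accept-Datetime` request header. The sketch below only builds such a request and does not send it. It assumes that the timegate accepts the original URL appended to the timegate address given above; the helper name `timegate_request` is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Timegate address as given in the configuration steps on this page.
TIMEGATE = "http://webarchive.nationalarchives.gov.uk/timegate/"


def timegate_request(resource_url: str, when: datetime):
    """Return (url, headers) for a Memento datetime-negotiation request.

    `when` should be a timezone-aware datetime.
    """
    # Accept-Datetime carries an HTTP-date expressed in GMT.
    stamp = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    # Assumption: the original URL is appended directly to the timegate address.
    return TIMEGATE + resource_url, {"Accept-Datetime": stamp}


if __name__ == "__main__":
    url, headers = timegate_request(
        "http://www.hmrc.gov.uk/", datetime(2010, 1, 1, tzinfo=timezone.utc)
    )
    print(url)
    print(headers)
```

The returned URL and headers can be passed to any HTTP client. A timegate that implements the protocol would normally answer with a redirect to the capture closest to the requested date.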
Re-use of content accessible through the UK Government Web Archive
Most, but not all, of the websites accessible through the UKGWA were created by Crown bodies and are Crown copyright. Most of the archived content of these websites and services is also Crown copyright. Unless otherwise stated, you may re-use Crown copyright material obtained from the UKGWA freely under the terms of the Open Government Licence.
Where websites have used third-party (non-Crown) material, the copyright status of this material should be clearly stated on the site, either attached to or embedded within the material itself or on the site's copyright page. In such cases the third-party content is not re-usable under the Open Government Licence, and the onus for obtaining the consent of the copyright owner rests with the person or organisation who wishes to re-use it.
Please note that the Open Government Licence does not permit the re-use of personal information and that photographs that depict an identifiable individual can constitute personal data for the purposes of the Data Protection Act.
In addition to the above, further restrictions apply to the re-use of material originally published by government bodies, such as the Ministry of Defence (MoD), that have been granted a delegation of authority by the Controller of Her Majesty's Stationery Office (HMSO). For example, material published on MoD websites may be reproduced for the purposes of non-commercial research or private study and for the purposes of reporting current events only, unless other terms are set out against the respective content. You should check the relevant MoD copyright licensing information before assuming it is acceptable for you to copy and/or re-use the material under the Open Government Licence.
The National Archives does not warrant that all third party content is appropriately marked. The re-use of any copyright material that is not clearly identified as being Crown copyright is not authorised by The National Archives. It is your responsibility to ensure that you have any necessary permission for the re-use of copyright material obtained from the UK Government Web Archive.
Development of the UK Government Web Archive
Web continuity and perpetual access to online documents
Archiving websites helps to ensure continuing access to government's online information. We ask government website managers to help us to provide a web continuity service that enables online access to government information over time. This can be achieved by installing a simple piece of software that will redirect users to the web archive if the information they are seeking has been moved or removed from its original location.
Web continuity means not getting a 'page not found' error message when you click on a web link on a government website, even if the information linked to has been removed, or moved.
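The redirect behaviour described above can be sketched in Python. This is a simplified illustration, not the software supplied to website managers: the archive address pattern (`/*/` followed by the original URL), the choice of a 302 redirect and the helper name `continuity_response` are all assumptions, and the example page address is hypothetical.

```python
# Host taken from the timegate address given elsewhere on this page.
ARCHIVE_ROOT = "http://webarchive.nationalarchives.gov.uk"


def continuity_response(status: int, requested_url: str):
    """Return the (status, location) a site might send under web continuity."""
    if status in (404, 410):
        # Page moved or removed: send the user to the web archive instead of
        # showing 'page not found'. A real component would presumably also
        # confirm that the archive holds a capture before redirecting.
        return 302, f"{ARCHIVE_ROOT}/*/{requested_url}"
    # Page still exists on the live site: serve it unchanged.
    return status, None


if __name__ == "__main__":
    print(continuity_response(404, "http://www.hmrc.gov.uk/old-page"))
    print(continuity_response(200, "http://www.hmrc.gov.uk/"))
```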
The web continuity project, which devised the technical solution now managed by The National Archives, was established in 2007 to address concerns raised by Jack Straw, then Leader of the House of Commons, about broken links in Hansard. Membership included The National Archives, the British Library, the Parliamentary Libraries, the Central Office of Information, and website managers from several government departments.
History of the UK Government Web Archive
Our web archiving programme began in 2003. We originally harvested around 50 selected government websites using a not-for-profit specialist company called the Internet Archive. By doing so, we also gained access to its back catalogue, meaning that you can find some sites dating as far back as 1996.
The National Archives was a founder member of the UK Web Archiving Consortium which, between 2004 and 2009, worked to develop a common shared infrastructure for the selective archiving of websites. Websites selected for archiving by The National Archives during this period are available through the UK Web Archive and the UK Government Web Archive.
Since 2005 archiving has been carried out under contract to the Internet Memory Foundation, a not-for-profit organisation.
We started archiving government websites due to close under the government's website review programme, which started in January 2007. The programme aimed to reduce the increasing number of government sites in order to provide a clearer and more user-friendly service for the public. From November 2008 we began to archive a larger number of sites and to archive some sites at an increased frequency to support our Web Continuity Initiative.
More recently, we supported the creation of the single government domain, GOV.UK, by archiving websites so that users could be appropriately redirected to the UKGWA.
We welcome feedback on the UK Government Web Archive. Please email us at firstname.lastname@example.org