Creating an official inquiry website

Creating and maintaining the official public inquiry website

The official public inquiry website will be a useful tool for sharing information and publishing reports or evidence. This resource is a primary record of any inquiry, and will be captured by The National Archives into the UK Government Web Archive. Therefore, the management of an inquiry website should be part of a wider approach to information and records management. If you would like guidance before the inquiry website is created, please email us: webarchive@nationalarchives.gsi.gov.uk.

Here are some things that can be done to help make websites easier to archive:

Website crawling and technical requirements

  • contact The National Archives as early in the process as possible, so that we are aware of the website and can work with the inquiry team to make preparations for its archiving. Once this contact is established, The National Archives can provide updates on progress and arrange the final crawl of the website once the inquiry is dissolved. You can contact the web archiving team at webarchive@nationalarchives.gsi.gov.uk
  • the initial crawl of the inquiry website will start early in the process, when there may be little content hosted on the website. This is both for posterity and to assess the suitability of the design of the website for web archiving
  • keep all content under one root URL (for example http://www.mydomain.gov.uk/). The web crawl only covers content within this root, so content outside it will not automatically be archived. Keeping to one root makes it easy to confirm that the entire site has been captured, which ensures transparency of process and data during and after the inquiry
  • remember that content posted on externally owned websites, such as Flickr or YouTube, may also need to be managed and preserved either via your own website or other electronic systems
  • only content linked to from a page within scope of the crawl will be archived, as the crawler relies on discovering links in the coding of the page
  • present everything on your website through the HTTP protocol
  • use meaningful URLs. These are good practice for a number of reasons, including usability, security, and search engine optimisation
  • due to the technical architecture of the web archive, we are able to archive, but are currently unable to provide access to, any file of 20MB and over. This affects all file types. The National Archives recommends splitting any large files into ‘parts’ (as with http://7julyinquests.independent.gov.uk/evidence/na-videos.htm)
  • keep navigation as basic as possible, by providing static links, link lists and basic page anchors, rather than JavaScript and dynamically generated URLs. If using scripting (such as JavaScript) on your website, provide plain HTML alternatives – this supports accessibility for users and supports archiving
  • it is not usually possible to crawl databases. Any data held in databases should be published on the website using basic, static links
  • provide an XML sitemap, which lists and links to all of the content on your website. This is useful for users, makes your website more findable by search engines and supports archiving
  • information needs to be ‘machine reachable’, which means that it can be reached by a web crawler. Information that needs a tick box, pick list, drop-down menu or a search box to access it is not machine reachable and so cannot be captured by a web crawler. If this functionality must be a feature of the live website, provide plain HTML alternatives
  • The National Archives can only archive publicly accessible content. Any content that is behind log-ins or in other inaccessible areas should either be published on the website, if appropriate, or accessioned by other means
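
As an illustration of the points above on keeping content under one root URL and providing an XML sitemap, the following is a minimal Python sketch, not an official tool. The function names `in_scope` and `build_sitemap` and the example page paths are hypothetical; the root URL is the placeholder used in this guidance. Treating both `http` and `https` as acceptable schemes is an assumption.

```python
from urllib.parse import urljoin, urlparse
from xml.etree import ElementTree as ET

# Placeholder root URL taken from the guidance; replace with the real site root.
ROOT = "http://www.mydomain.gov.uk/"


def in_scope(url, root=ROOT):
    """Return True if url is on the same host as root and under its path.

    Assumption: both http and https count as acceptable schemes.
    """
    u, r = urlparse(url), urlparse(root)
    return (
        u.scheme in ("http", "https")
        and u.netloc == r.netloc
        and u.path.startswith(r.path)
    )


def build_sitemap(links, root=ROOT):
    """Build a sitemap XML string listing only the links that are in scope."""
    urlset = ET.Element(
        "urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    )
    for link in links:
        loc = urljoin(root, link)  # resolve relative links against the root
        if in_scope(loc, root):
            url_el = ET.SubElement(urlset, "url")
            ET.SubElement(url_el, "loc").text = loc
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical links: two pages on the site and one externally hosted video.
links = [
    "evidence/day1.htm",
    "reports/final-report.htm",
    "http://www.youtube.com/watch?v=example",
]
sitemap = build_sitemap(links)
print(sitemap)
```

The externally hosted link is left out of the sitemap because it falls outside the root, which mirrors how such content would not automatically be archived.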

Multimedia

  • it is not possible to capture and replay streaming audio-visual material. Any content of this sort must be accessible via progressive download instead, over HTTP, where the source URL is not obfuscated
  • audio-visual material should be linked to using absolute URLs (http://www.mydomain.gov.uk/video/video1.mp4) rather than relative URLs (…video/video1.mp4) in the coding of the page
  • consider providing full transcripts of all audio-visual material
  • where the inquiry website includes third party audio-visual material, inquiry staff will need to arrange for copyright to be assigned to The National Archives or, at the least, obtain permission from the copyright owners to reproduce the content from the UK Government Web Archive, both during the inquiry and in perpetuity after it ends
  • capturing of Flash elements in pages presents significant challenges, due to their complexity, and we cannot guarantee these will be archived
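
To illustrate the point above about absolute URLs for audio-visual material, here is a minimal Python sketch that scans a page's HTML for media links written as relative URLs and reports the absolute form of each. The class name `MediaLinkFinder`, the list of file extensions, and the example page address are all hypothetical and should be adapted to the real website.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class MediaLinkFinder(HTMLParser):
    """Collect media links in href/src attributes that are not absolute URLs."""

    # Assumed set of audio-visual file extensions; extend as needed.
    MEDIA_EXT = (".mp4", ".mp3", ".wav", ".mov")

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.relative = []  # list of (original link, absolute equivalent)

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name not in ("href", "src") or not value:
                continue
            if not value.lower().endswith(self.MEDIA_EXT):
                continue
            # A link without a scheme is relative (or protocol-relative).
            if not urlparse(value).scheme:
                absolute = urljoin(self.page_url, value)
                self.relative.append((value, absolute))


# Hypothetical page containing one relative and one absolute media link.
html = (
    '<a href="../video/video1.mp4">Hearing, part 1</a>'
    '<a href="http://www.mydomain.gov.uk/video/video2.mp4">Hearing, part 2</a>'
)
finder = MediaLinkFinder("http://www.mydomain.gov.uk/hearings/index.htm")
finder.feed(html)
for original, absolute in finder.relative:
    print(f"{original} -> {absolute}")
```

Only the first link is reported, since the second is already absolute.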

Quality assurance and maintenance

  • to ensure successful archiving, allow for the time required for crawling, quality assurance, fixing any issues and publishing the crawl in our public index. This takes approximately eight weeks. Please ensure that the website will remain live and unchanged for this period, so that The National Archives can take a final and complete snapshot
  • The National Archives strongly recommends that those who are involved in the release of inquiry records, or are otherwise familiar with the design of the website, are available during the quality assurance stage so that comprehensive capture can be confirmed
  • no content can be inserted into the archived website after the live website has been taken offline. This applies to any content that was not available on the website at the time of the crawl, or was not accessible because the above guidelines had not been met
  • the underlying code of an archived website cannot be altered in the web archive. That means that website managers should confirm that their website is ready to be archived and that the content will remain perpetually unchanged
  • content can only be removed from archived websites in exceptional circumstances, when it meets one or more of the criteria set out in the Takedown Policy
  • retain the domain after the final snapshot of the website has been made. This is essential as it prevents ‘cybersquatting’ and can give users continuity of access to the inquiry’s online records, if a redirect is set up into the web archive. For more information see TG125 (below).

For further information on web archiving, see Cabinet Office web standards Archiving websites (TG105) and Managing URLs (TG125).

Find out more:

Cabinet Office – Archiving websites

Information on web archiving