Tips for searching for an archived website

Finding captures of a URL

Enter the URL of the website or page into the search box.
If the page is in the archive, you will be taken to a Timeline page displaying a calendar of all available captures.
Select a capture date to view the archived webpage as it appeared at that time.

Refining the Timeline page

Show redirects – Displays instances where a webpage redirected to another URL, helping track content movement.
Show one instance per day – Because web crawlers may capture a page several times in one day, this option limits the calendar to one capture per day, giving a clearer overview.
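
As a rough illustration of what "one instance per day" means, the sketch below collapses a list of capture times to a single capture per calendar day. The function name one_capture_per_day is hypothetical, and the choice to keep the earliest capture of each day is an assumption; the archive does not state which capture it keeps.

```python
from datetime import datetime


def one_capture_per_day(captures):
    """Keep one capture per calendar day (assumption: the earliest one)."""
    earliest = {}
    for ts in captures:
        day = ts.date()
        # Remember only the earliest timestamp seen for each day.
        if day not in earliest or ts < earliest[day]:
            earliest[day] = ts
    # Return the kept captures in chronological order.
    return [earliest[day] for day in sorted(earliest)]


captures = [
    datetime(2010, 3, 24, 9, 15),
    datetime(2010, 3, 24, 17, 40),
    datetime(2010, 3, 25, 8, 5),
]
print(one_capture_per_day(captures))
```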

Performing a full-text search

Use the main search box to search across the entire archive. You can refine your queries using:

Search phrases – Use quotes for exact matches, e.g., “Budget 2010”.
Keyword searches – Searching for Budget 2010 (without quotes) returns pages containing both terms, even if they are not together. Multiple terms can be entered using spaces, commas or semicolons as separators.
Combining terms – Use a mix of phrases and individual words, e.g., “Budget 2010” Costings, to find results where both appear.
Automatic Boolean logic – Spaces between words act as an AND operator, so every word must be present. Results in which the terms appear closest together rank higher.
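
The query rules above can be sketched in a few lines of Python. This is only an approximation of the behaviour described, not the archive's actual implementation: parse_query and matches are hypothetical names, matching is assumed to be case-insensitive, bare terms are assumed to match whole words, and proximity ranking is not modelled.

```python
import re

# A token is either a quoted phrase (straight or curly quotes) or a bare term.
# Bare terms end at whitespace, commas, or semicolons.
TOKEN = re.compile(r'["“]([^"”]+)["”]|([^\s,;"“”]+)')


def parse_query(query):
    """Split a query into exact phrases and individual keywords."""
    phrases, terms = [], []
    for phrase, term in TOKEN.findall(query):
        if phrase:
            phrases.append(phrase.strip())
        else:
            terms.append(term)
    return phrases, terms


def matches(text, query):
    """AND semantics: every phrase and every keyword must occur in the text."""
    phrases, terms = parse_query(query)
    haystack = text.lower()
    words = set(re.findall(r"\w+", haystack))
    return (all(p.lower() in haystack for p in phrases)
            and all(t.lower() in words for t in terms))


print(parse_query('“Budget 2010” Costings'))  # (['Budget 2010'], ['Costings'])
print(matches("Budget for 2010", "Budget 2010"))  # True
print(matches("Budget for 2010", '"Budget 2010"'))  # False
```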

Enhancing your search experience

Adjusting search terms

Search terms appear as bubbles below the search bar.
Click the [x] on any bubble to remove a term and refine your search.

Screenshot: location of the bubbles containing the search terms on a search results page.

Website-specific searches

When you refine a search by website, the archive treats www.example.com and example.com as the same site, but the two forms behave differently:

Searching with www. returns results only for that specific subdomain.
Searching without www. triggers a wildcard search, including all subdomains (e.g., example.com, blog.example.com, news.example.com), ensuring broader results.
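
The difference between the two forms can be pictured as a simple host comparison. The function below is a hypothetical sketch of that rule (the name site_matches and the case-insensitive comparison are assumptions), not the archive's own matching code.

```python
def site_matches(site_filter, host):
    """Return True if a result's host falls within the site filter."""
    site_filter = site_filter.lower().strip(".")
    host = host.lower().strip(".")
    if site_filter.startswith("www."):
        # With "www.", only that specific subdomain matches.
        return host == site_filter
    # Without "www.", match the bare domain and every subdomain of it.
    return host == site_filter or host.endswith("." + site_filter)


print(site_matches("www.example.com", "blog.example.com"))  # False
print(site_matches("example.com", "blog.example.com"))      # True
```

Matching on "." + site_filter rather than on the bare suffix keeps an unrelated host such as notexample.com out of the results.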

Filtering search results

Refine your search results by applying filters. The key filtering options include:

Filter by keyword – Exclude specific words or phrases from results. These appear as bubbles prefixed with “Excluding:” and act as a Boolean NOT.
Filter by website – Limit your search to specific websites or exclude sites from results. These appear as bubbles prefixed with “Site:” or “Excluding Site:”.
Filter by year archived – Search based on when content was archived rather than when it was published. Selected years appear as bubbles with the prefix “Year:”.
Social Media Archive exception: For social media content, filtering by year refers to the year published, not archived.
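
To make the filter logic concrete, here is a minimal sketch of how the three filters might combine for a single result. Everything in it is illustrative: the Result class, its field names, and passes_filters are hypothetical; site matching is simplified to an exact host comparison; and treating several selected years as alternatives is an assumption. The part taken from the description above is the year rule: web pages are compared by the year archived, social media content by the year published.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Result:
    host: str
    text: str
    archived_at: datetime
    published_at: datetime
    is_social_media: bool = False


def passes_filters(result, excluding=(), sites=(), excluding_sites=(), years=()):
    """Apply keyword, website, and year filters to one search result."""
    text = result.text.lower()
    # "Excluding:" acts as NOT: any excluded word rejects the result.
    if any(word.lower() in text for word in excluding):
        return False
    # "Site:" limits results to the listed hosts (exact match in this sketch).
    if sites and result.host not in sites:
        return False
    # "Excluding Site:" removes the listed hosts.
    if result.host in excluding_sites:
        return False
    # "Year:" uses the archived year, except for social media content,
    # where it uses the published year.
    date = result.published_at if result.is_social_media else result.archived_at
    if years and date.year not in years:
        return False
    return True


page = Result("www.example.com", "Budget 2010 costings",
              archived_at=datetime(2012, 5, 1), published_at=datetime(2010, 3, 24))
print(passes_filters(page, years=(2012,)))  # True
print(passes_filters(page, years=(2010,)))  # False
```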

Exporting search results

You can export search results for further analysis. Each export is limited to the first 10,000 results of a search.

Available export formats

CSV (Comma-Separated Values) – For spreadsheet applications.
JSON (JavaScript Object Notation) – For structured data processing.
NDJSON (Newline-Delimited JSON) – For handling large datasets.
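
If you want to process an export programmatically, the three formats can be read with Python's standard library. The sketch below assumes that the CSV export has a header row and that the JSON export is a single top-level array; neither detail is confirmed here, so check a real export first. The function name load_export and the sample records are illustrative only.

```python
import csv
import io
import json


def load_export(text, fmt):
    """Parse exported search results into a list of dicts."""
    if fmt == "csv":
        # Assumption: the first row holds the field names.
        return list(csv.DictReader(io.StringIO(text)))
    if fmt == "json":
        # Assumption: the export is one JSON array of result objects.
        data = json.loads(text)
        return data if isinstance(data, list) else [data]
    if fmt == "ndjson":
        # One JSON object per line; blank lines are skipped.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    raise ValueError(f"unsupported format: {fmt}")


sample = (
    '{"url": "http://www.example.com/", "host": "www.example.com"}\n'
    '{"url": "http://blog.example.com/", "host": "blog.example.com"}\n'
)
for record in load_export(sample, "ndjson"):
    print(record["host"], record["url"])
```

Because NDJSON holds one record per line, a large export can also be read line by line from a file instead of being loaded into memory at once.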

Exported data fields

The exported metadata fields differ by archive:

UK Government Web Archive (UKGWA)

urlKey – Normalised URL identifier.
timestamp – Date and time of capture.
url – Full URL of the archived page.
host – Host domain of the website.
mime – MIME type of the archived content.
digest – Unique hash identifier for the content.
length – File size of the archived content.
offset – Storage location offset in the archive.
textTitle – Extracted page title.
language – Detected language of the content.
mimeHuman – Human-readable MIME type descriptor.

UK Government Social Media Archive

_source – Root data.
content_title – Title of the social media post.
account_ident – Public identifier of the account.
account_id – Unique identifier for the account.
archived_at – Timestamp when the post was archived.
content_text – Textual content of the social media post.
platform_id – Unique post identifier on the platform.
created_at – Original timestamp of the post.
platform – Social media platform (e.g., YouTube, Facebook).
local_display_name – Account display name at the time of archiving.
local_account_ident – Local identifier of the account at the time of archiving.