The National Archives
PRONOM

Details: File format summary




Details for:



Other names: ISO 28500-2009, Web ARChive file format
Identifiers:
  MIME: application/warc
  PUID: fmt/289
Classification: Aggregate
Description: “The WARC (Web ARChive) file format offers a convention for concatenating multiple resource records (data objects), each consisting of a set of simple text headers and an arbitrary data block, into one long file. The WARC format is an extension of the ARC file format (ARC) that has traditionally been used to store "web crawls" as sequences of content blocks harvested from the World Wide Web (…) Besides the primary content recorded in ARCs, the extended WARC format accommodates related secondary content, such as assigned metadata, abbreviated duplicate detection events, later-date transformations, and segmentation of large resources”. The WARC format was written by members of the IIPC, grouped within ISO/TC46/SC4/WG12.
Byte order:
Related file formats:
  Has lower priority than WARC (1.1)
  Has lower priority than WARC (1.0)
  Has priority over Extensible Markup Language (1.0)
  Has priority over Hypertext Markup Language (2.0)
  Has priority over Hypertext Markup Language (3.2)
  Has priority over Hypertext Markup Language (4.0)
  Has priority over Hypertext Markup Language (4.01)
  Has priority over Extensible Hypertext Markup Language (1.0)
  Has priority over Extensible Hypertext Markup Language (1.1)
  Has priority over Hypertext Markup Language
  Has priority over Hypertext Markup Language (5)
Technical Environment:
Released: 15 May 2009
Supported until:
Format Risk:
Developed by: None.
Supported by:
  Bibliothèque nationale de France
  ISO Technical Committee - TC 46/SC 4 / ISO/TC46/SC4/WG12
Source date: 02 Nov 2010
Source description: Submitted by Bibliothèque nationale de France, with supplementary description information attributed to the ISO 28500-2009 standard. 11/2023 (v.116): priority over fmt/101 Extensible Markup Language 1.0. Internally researched.
Last updated: 28 Sep 2020
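
The record structure outlined in the Description (text headers followed by an arbitrary data block, with records concatenated into one long file) can be illustrated with a minimal Python sketch. It is not part of the PRONOM record and is not a reference implementation. It assumes the layout given in the WARC specification: a `WARC/x.y` version line, `Name: value` header lines, a blank line, a block of `Content-Length` bytes, and two CRLF pairs closing the record. The function names `read_warc_records` and `make_record` are hypothetical. The sample records are deliberately simplified and omit fields a real record would carry, such as `WARC-Record-ID` and `WARC-Date`.

```python
import io


def read_warc_records(stream):
    """Yield (version, headers, block) for each record in an
    uncompressed WARC byte stream. Sketch only: header continuation
    lines and per-record gzip compression are not handled."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of file
        if line.strip() == b"":
            continue  # blank lines separating two records
        version = line.strip().decode("utf-8")
        if not version.startswith("WARC/"):
            raise ValueError(f"expected a WARC version line, got {version!r}")
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # a blank line ends the header section
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip()] = value.strip()
        # The block is opaque: read exactly Content-Length bytes.
        block = stream.read(int(headers["Content-Length"]))
        yield version, headers, block


def make_record(record_type, payload):
    """Build a simplified record for demonstration purposes only."""
    head = (
        "WARC/1.0\r\n"
        f"WARC-Type: {record_type}\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    ).encode("utf-8")
    return head + payload + b"\r\n\r\n"


# Two records concatenated into one "file", as the Description outlines.
sample = (
    make_record("resource", b"<html>hello</html>")
    + make_record("metadata", b"note: demo")
)
records = list(read_warc_records(io.BytesIO(sample)))
for version, headers, block in records:
    print(version, headers["WARC-Type"], len(block))
```

Because each block is read by its declared length and never scanned for a delimiter, the payload can be arbitrary binary data, which is what allows primary content and secondary content such as metadata to sit side by side in the same file.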