Summary
|
Name |
WARC |
Version |
|
Other names |
ISO 28500-2009, Web ARChive file format |
Identifiers |
MIME: application/warc
PUID: fmt/289
|
Family |
|
Classification |
Aggregate |
Disclosure |
|
Description |
The WARC (Web ARChive) file format offers a convention for concatenating multiple resource records (data objects), each consisting of a set of simple text headers and an arbitrary data block, into one long file. The WARC format is an extension of the ARC file format (ARC) that has traditionally been used to store "web crawls" as sequences of content blocks harvested from the World Wide Web (…)
Besides the primary content recorded in ARCs, the extended WARC format accommodates related secondary content, such as assigned metadata, abbreviated duplicate detection events, later-date transformations, and segmentation of large resources.
The WARC format was written by members of the IIPC (http://www.netpreserve.org/), grouped within ISO/TC46/SC4/WG12. |
Orientation |
|
Byte order |
|
Related file formats |
Has lower priority than WARC (1.1)
Has lower priority than WARC (1.0)
Has priority over Hypertext Markup Language (2.0)
Has priority over Hypertext Markup Language (3.2)
Has priority over Hypertext Markup Language (4.0)
Has priority over Hypertext Markup Language (4.01)
Has priority over Extensible Hypertext Markup Language (1.0)
Has priority over Extensible Hypertext Markup Language (1.1)
Has priority over Hypertext Markup Language
Has priority over Hypertext Markup Language (5)
|
Technical Environment |
|
Released |
15 May 2009 |
Supported until |
|
Format Risk |
|
Developed by |
None.
|
Supported by |
Bibliothèque nationale de France
ISO Technical Committee - TC 46/SC 4 / ISO/TC46/SC4/WG12
|
Source |
|
Source date |
02 Nov 2010 |
Source description |
Submitted by Bibliothèque nationale de France with supplementary description information attributed to ISO 28500-2009 standard. |
Last updated |
28 Sep 2020 |
Note |
|
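Illustrative sketch

The Description above characterises a WARC file as a concatenation of records, each made of simple text headers followed by an arbitrary data block. The Python sketch below illustrates that layout only; it is not a conforming implementation. It assumes the record framing commonly described for WARC 1.0: a `WARC/1.0` version line, `Name: value` header lines, a blank line, a block of `Content-Length` bytes, and two trailing CRLF pairs. These details should be checked against the standard itself. The function names `build_record` and `iter_records` are hypothetical and exist only for this example.

```python
import io
import uuid
from datetime import datetime, timezone

CRLF = b"\r\n"


def build_record(record_type: str, payload: bytes, content_type: str) -> bytes:
    """Serialise one record: version line, text headers, blank line, data block."""
    headers = [
        ("WARC-Type", record_type),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("Content-Type", content_type),
        ("Content-Length", str(len(payload))),
    ]
    head = b"WARC/1.0" + CRLF
    for name, value in headers:
        head += f"{name}: {value}".encode("utf-8") + CRLF
    # Assumed framing: a blank line ends the headers, two CRLFs end the record.
    return head + CRLF + payload + CRLF + CRLF


def iter_records(stream):
    """Yield (version, headers, block) for each record in a binary stream."""
    while True:
        version = stream.readline()
        if not version:
            return  # end of file
        if not version.strip():
            continue  # skip the blank lines that separate records
        headers = {}
        while True:
            line = stream.readline()
            if line in (CRLF, b"\n", b""):
                break  # blank line: end of the header section
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip()] = value.strip()
        # The block is read by length, so it may itself contain CRLF sequences.
        block = stream.read(int(headers["Content-Length"]))
        yield version.strip().decode("ascii"), headers, block


if __name__ == "__main__":
    # Two records concatenated into "one long file", then read back.
    data = build_record("resource", b"<html>hello</html>", "text/html")
    data += build_record("metadata", b"note: example", "application/warc-fields")
    for version, headers, block in iter_records(io.BytesIO(data)):
        print(version, headers["WARC-Type"], len(block))
```

Because each block is read by its declared length instead of being scanned for a delimiter, the data block can hold arbitrary bytes, which is what allows heterogeneous harvested content to sit in a single aggregate file.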