The Webarchive of the National Library of the Czech Republic has been regularly harvesting and archiving the Czech internet since 2000, which makes it one of the first web archives in the world. It uses standard technologies for harvesting and preserving data, is a member of the International Internet Preservation Consortium (IIPC), and participates in research projects in this area. It continuously improves its technologies and methods. It currently manages over 200 TB of data harvested from Czech websites.
Access to archived websites is regulated by the Copyright Law (No. 121/2000 Coll.). In practice, this means that the full archived content of the Webarchive of the National Library of the Czech Republic can be accessed only from computers in the building of the National Library of the Czech Republic.
The Archive of Charles University and the Webarchive of the National Library of the Czech Republic are investigating the possibility of making the full content of the Webarchive accessible, in accordance with the law, also through a computer located at Charles University.
Archived content is accessible via the Wayback Machine application.
The number of people interested in data mining the Webarchive content is likely to grow in the future. Such access for research purposes must be arranged directly with the Webarchive of the National Library of the Czech Republic.