Records Toolkit

Archives New Zealand’s guidance on information and records management

The Importance of Checksums

Posted by Archives New Zealand on 22 June 2017


In light of the recent interest in checksums, here is a revised version of the blog post originally published last year.

As we increasingly work in the digital world, there are new areas of work for information and records managers, and we need to understand our organisation’s capability to participate in them. Archives New Zealand is working towards being able to accept born-digital records, and fundamental to this is the ability to produce checksums.

A checksum is a string of numbers and letters that acts as a fingerprint for a file, against which later comparisons can be made to detect errors in the data. Checksums are important because we use them to check files for integrity.
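To make the idea of a fingerprint concrete, here is a minimal sketch in Python of how a checksum could be generated for a file. It uses the standard `hashlib` module; the function name `file_checksum` and the default choice of SHA-256 are our assumptions for illustration, not a requirement stated in this post.

```python
import hashlib
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hexadecimal checksum of the file at `path`.

    The file is read in chunks so that large files do not need to fit
    in memory. `algorithm` is any name hashlib accepts, e.g. "md5",
    "sha1" or "sha256" (an illustrative default, not a mandated one).
    """
    digest = hashlib.new(algorithm)
    with Path(path).open("rb") as handle:
        # Read until an empty bytes object signals the end of the file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `file_checksum("report.pdf")` (a hypothetical file name) returns a 64-character string of numbers and letters. Running it again later on the same, unchanged file returns the same string.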

Our digital preservation policy uses the UNESCO definition of integrity.

“Digital content is information encapsulated in one or more digital objects. Within this context, integrity of a digital object is the quality of its content remaining ‘uncorrupted and free of unauthorized and undocumented changes’” (National Library of Australia/UNESCO. (2003). Guidelines for the Preservation of Digital Heritage. Retrieved from http://unesdoc.unesco.org/images/0013/001300/130071e.pdf).

Checksums are useful when moving files from one environment to another (e.g. validation after migration); for regularly checking the integrity of files managed in a system (where you expect the file content to remain unchanged over time); and also when working with files to uniquely identify what we are working with.

Checksums will bridge the gap between your organisation and permanent preservation at Archives New Zealand during transfer or deposit. A file extracted from your Content Management System must remain identical to the copy held there. We will attempt to prove that unchanged state when we store the file in the Archives New Zealand digital repository. An exception procedure is triggered if anything unexpected has happened. The use of checksums is also relevant for local authorities managing digital protected records.

The procedure that yields the checksum is called checksum generation. Generation uses one of a collection of checksum functions, or algorithms. These algorithms usually output a significantly different value even for the tiniest change to the data. Comparing checksums therefore lets us confirm that a transmission was free of corruption. Checksums also indicate when a file has been tampered with; an important by-product of integrity is security.
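The effect of a tiny change can be seen in a short sketch. The two byte strings below are made-up sample data that differ only in their final character, and SHA-256 is again just an illustrative choice of algorithm.

```python
import hashlib

# Two made-up inputs that differ only in the final character.
original = b"Archives New Zealand"
altered = b"Archives New Zealanc"

checksum_a = hashlib.sha256(original).hexdigest()
checksum_b = hashlib.sha256(altered).hexdigest()

# Count the positions at which the two 64-character checksums differ.
differing = sum(1 for a, b in zip(checksum_a, checksum_b) if a != b)

print(checksum_a)
print(checksum_b)
print(f"{differing} of {len(checksum_a)} characters differ")
```

Although only one character of the input changed, the two printed checksums bear no obvious resemblance to each other, which is what makes a simple comparison enough to detect the change.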

We need to monitor checksums throughout the transfer or deposit lifecycle. There are two important points where we must confirm integrity. The first is when we receive the files (including checksums) from your organisation and compare those checksums with new ones that we generate. The second is when we deposit the files into the permanent repository and check them against the original transfer sent to us by your organisation. Once the files are in the Archives New Zealand repository, we will continue to monitor the checksums to ensure the files remain unchanged in perpetuity.
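The comparison made on receipt can be sketched as follows. This is not a description of the Archives New Zealand workflow: the function names `sha256_of` and `verify_transfer`, the use of SHA-256, and the idea of passing the supplied checksums as a dictionary are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(folder, expected):
    """Compare received files against the checksums supplied with them.

    `expected` maps a relative file name to the checksum recorded by the
    sending organisation (hypothetical structure). Returns a dict of
    problems; an empty dict means every listed file arrived unchanged.
    """
    problems = {}
    for name, recorded in expected.items():
        path = Path(folder) / name
        if not path.is_file():
            problems[name] = "missing"
        elif sha256_of(path) != recorded.lower():
            problems[name] = "checksum mismatch"
    return problems
```

Any entry in the returned dictionary is the kind of unexpected result that would need to be followed up before the files go any further.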

Checksums can be generated and validated with many tools. Below is a list of some free or open source tools for your convenience:

| Tool | Operating System | Generate | Validate | URL |
| --- | --- | --- | --- | --- |
| Free Commander | Win | Yes | Yes | http://freecommander.com/en/downloads/ |
| Double Commander | Win, Linux, MacOS | Yes | Yes | https://sourceforge.net/p/doublecmd/wiki/Download/ |
| DROID | Win, Linux, MacOS | Yes | No | http://www.nationalarchives.gov.uk/information-management/manage-information/preserving-digital-records/droid/ |
| AVPreserve Fixity | Win, MacOS | Yes | Yes | https://www.avpreserve.com/tools/fixity/ |
| Checksum-comparator | Win, Linux | No | Yes | https://github.com/exponential-decay/checksum-comparator |
| Spreadsheet (LibreOffice) | Win, Linux, MacOS | No | Yes | https://www.libreoffice.org/download/download/ |
| SHA1SUM, MD5SUM commands | Linux | Yes | Yes | Use in a command line |
| Online MD5 generator | Win, Linux, MacOS | Yes | No | http://www.md5.cz/ |

Further reading

Digital Preservation Coalition (UK) – Fixity and Checksums. Contains further reading and links to other tools. http://www.dpconline.org/handbook/technical-solutions-and-tools/fixity-and-checksums

There are many other tools out there and many internet links!

To assess your own capability, here are some questions for you and/or your organisation:

Does your organisation use checksums and if so what type?

Has your organisation used checksums in any other scenario e.g. for de-duplication?

Would your organisation be able to create a checksum comparison list like the one described?
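For organisations that want to experiment with these questions, here is a minimal sketch of producing a checksum list for a folder and using the same list to spot duplicate files. The two-column CSV layout (file name, checksum), the use of SHA-256, and the function names are our assumptions for illustration; this post does not prescribe a list format.

```python
import csv
import hashlib
from collections import defaultdict
from pathlib import Path


def checksum_list(folder):
    """Return (relative path, SHA-256 checksum) pairs for every file under `folder`."""
    root = Path(folder)
    rows = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # read_bytes loads the whole file; fine for a sketch with small files.
            checksum = hashlib.sha256(path.read_bytes()).hexdigest()
            rows.append((path.relative_to(root).as_posix(), checksum))
    return rows


def write_checksum_list(rows, csv_path):
    """Write the pairs to a CSV file with a header row (hypothetical layout)."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["file", "sha256"])
        writer.writerows(rows)


def find_duplicates(rows):
    """Group file names by checksum and keep only checksums seen more than once."""
    by_checksum = defaultdict(list)
    for name, checksum in rows:
        by_checksum[checksum].append(name)
    return {c: names for c, names in by_checksum.items() if len(names) > 1}
```

Files that share a checksum have the same content even when their names differ, which is why the same list can support both integrity checks and de-duplication.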

We are very interested to hear any questions about or practices of working with checksums and will use these to produce further relevant information.

As usual, please use the rkadvice@dia.govt.nz email to contact us at Archives New Zealand.   
