Since we last left off, we’ve been incredibly busy here at the archives!
For starters, we have been working on capturing the 46 miniDV tapes that were part of the Legacy Digital project’s media selection. You might remember miniDV as the small cassette tape widely used in the late 1990s and early 2000s. Depending on your age, you may have shot your first amateur horror film or skate video on one of these. DV (for Digital Video) is an international standard for consumer digital video created by a consortium of 10 companies, which included Sony, Hitachi, and Panasonic, among other electronics giants. The DV standard uses digital technology to record picture and sound on a high-density, metal evaporated tape that is enclosed in a plastic (mini!) cassette. Because miniDV is a tape-based format, it is subject to the same sorts of degradation commonly found on analog videotape, including binder deterioration and mold. In addition, since the tape is so narrow, it is particularly prone to drop-outs, head clog banding, and data loss. To make matters trickier, you’ll need a format-specific camcorder or video tape player/recorder to reformat these tapes.
Even though miniDV is often mocked for its picture quality (DV uses lossy compression for video, while the audio is stored uncompressed), DV was highly prominent and commercially favored because it had comparatively better image quality than the contemporaneous Hi8 or Video8 formats. MiniDV was broadly used by artists, community organizers, independent media productions, and non-profits. Its extensive use gave way to the mass popularization of non-linear editing (NLE) systems, such as Final Cut Pro, Adobe Premiere, and Avid Media Composer, which have become, and continue to be, benchmark software for editing media.
The tapes we’re working on here capture a variety of activities, including the One World, One Health Congress in Brazil (2007), an interdisciplinary forum for animal and human health care and environmental science; the AHEAD (Animal Health for the Environment And Development) Forum; and video recordings of WCS’ Veterinarian programs in the field.
In order to get the content off these tapes, we needed the necessary equipment to set up a modest miniDV transfer workstation. So, we ended up enlisting the collaboration of WCS Video Producer Jeff Morey, who, as luck would have it, had a couple of miniDV camcorders in his studio that he was willing to lend to the archives. We set up a time to meet, and voilà, we were in business. Jeff was ready with a 4-pin to 9-pin IEEE 1394 FireWire cable and a FireWire to Thunderbolt adapter that could connect to our workstation. Using Adobe Premiere Pro, we are able to capture each tape’s native DV stream. According to moving image archivists Dave Rice and Chris Lacinak in their paper, Digital Tape Preservation Strategy: Preserving Data or Video?, a DV stream contains:
- Video (NTSC or PAL)
- Audio (48, 44.1, or 32 kHz; 2 or 4 track)
- Metadata from the Source Tape
- Time Code
- Closed Captioning (as auxiliary data, not video line 21)
- Camera Metadata (iris, gain, white balance, etc.)
- Original Recording Date and Time
- Metadata from Device Read (Occurrences during playback)
When the video is captured onto a computer, it is stored as a raw DV stream that we later wrap into a QuickTime container file. Although QuickTime is a proprietary container format (not a codec), it is still widely used. This way, we ensure that we retain the original metadata gathered from the source tape, wrapped in a format that can be played back in virtually any contemporary video player. After transferring our tapes and naming the files according to our file-naming convention, we checksum them using hashdeep, a program that will “compute, match, and audit hash sets”, and off they’ll go into storage.
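For readers curious about what “compute” and “audit” mean in practice, here is a minimal Python sketch of the general idea. It is not hashdeep itself and not our exact workflow: the function names, the choice of SHA-256, and the manifest layout are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash one file in chunks so large video files never load fully into memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def compute_manifest(root, algorithm="sha256"):
    """Map each file's path (relative to root) to its hex digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): file_digest(path, algorithm)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def audit(root, manifest, algorithm="sha256"):
    """Compare the files under root against a previously saved manifest."""
    current = compute_manifest(root, algorithm)
    shared = set(manifest) & set(current)
    return {
        "missing": sorted(set(manifest) - set(current)),
        "new": sorted(set(current) - set(manifest)),
        "changed": sorted(n for n in shared if manifest[n] != current[n]),
    }
```

The manifest would be computed once, right after transfer, and saved alongside the files. Running the audit later against the same folder should return three empty lists if nothing has been lost, added, or altered.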
Onward!
Meanwhile, we’ve also successfully upgraded our Archivematica installation from version 1.4 to its current 1.6 incarnation. As I mentioned in our last post, we’re using Archivematica, a standards-based open-source digital preservation system, to process the archives’ digital objects. We have also been testing Archivematica’s processing configuration and our format policy registry (FPR), which will be ready to receive, verify, extract, and normalize the disk images we’ve been creating over the past months, along with the files contained in them. After testing seemingly endless configurations, our default processing configuration looks like this right now:
Send transfer to quarantine | No |
Remove from quarantine after (days) | N/A |
Generate transfer structure report | Yes |
Select file format ID command (transfer) | None (This selection lets you choose from your ID tool options. We’ve been mostly selecting ID by extension) |
Extract packages | Yes |
Delete packages after extraction | No |
Examine contents | Examine contents |
Create SIP | Send to backlog |
Select file format ID command (ingest) | Identify using Siegfried |
Normalize | Normalize for preservation |
Approve normalization | Yes |
Reminder: add metadata if desired | Continue |
Transcribe files (OCR) | Yes |
Select file format ID command (submission documentation and metadata) | Identify using Siegfried |
Select compression algorithm | 7z using bzip2 |
Select compression level | Normal (you have to select something here, even though no compression is applied) |
Store AIP | None (last step where you can choose whether to continue or reject AIP) |
Store AIP location | DuraCloud aip-storage |
Store DIP location | DuraCloud dip-storage |
We’ve also been using Archivematica’s appraisal tab for the first time. The appraisal tab is a new addition to Archivematica 1.6. Its development, captured in great detail in this blog, was coordinated with the University of Michigan’s Bentley Historical Library by way of a Mellon grant sponsorship.
The appraisal tab allows the archivist to decide, midway through transfer, which files become part of the SIP and how they are arranged. It also lets the archivist preview files, list the number of files per format by their PRONOM PUID (Persistent Unique Identifier), visualize file formats in a tasty pie chart, search for PII (Personally Identifiable Information) and credit card numbers (using Bulk Extractor), and apply tags to objects. In addition to all of this sweet-sounding appraisal business, the tab includes an ArchivesSpace pane which allows an archivist to connect the digital content processed in Archivematica to an ArchivesSpace finding aid. Here at the archives, we were in the process of migrating our Archivists’ Toolkit finding aids to ArchivesSpace, so this feature is very welcome, and exciting in terms of what we can now include in our finding aids (now with an item-level digital object!).
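To give a sense of what sits behind a per-format count and pie chart like the one described above, here is a small Python sketch of tallying files by PUID. It is only an illustration and not Archivematica’s code: the function names are made up, and the file names and PUID strings in the sample are hypothetical placeholders.

```python
from collections import Counter


def count_by_puid(identified_files):
    """Tally files per PUID; files with no match are grouped as 'unidentified'."""
    return Counter(puid or "unidentified" for _name, puid in identified_files)


def format_shares(counts):
    """Turn raw counts into percentages, the numbers a pie chart would display."""
    total = sum(counts.values())
    return {puid: round(100 * n / total, 1) for puid, n in counts.most_common()}


# Hypothetical identification results: (file name, PUID or None if unidentified).
sample = [
    ("photo_01.jpg", "fmt/43"),
    ("photo_02.jpg", "fmt/43"),
    ("report.pdf", "fmt/18"),
    ("mystery.bin", None),
]

counts = count_by_puid(sample)
shares = format_shares(counts)
```

Grouping unidentified files under their own label keeps them visible in the tally, which is handy when deciding what needs a closer look before it goes into a SIP.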
Trevor Owens, digital archivist and author of the upcoming book The Theory and Craft of Digital Preservation, recently published fifteen guiding digital preservation axioms on his blog. The very first axiom, “A repository is not a piece of software”, has been one of the more challenging lessons learned during this project. Owens continues:
Software cannot preserve anything. Software cannot be a repository in itself. A repository is the sum of financial resources, hardware, staff time, and ongoing implementation of policies and planning to ensure long-term access to content.
Next time I’m in the weeds, feeling conscience-stricken that I’ve spent 3 hours trying to figure out the cause of that nagging normalization error, I’ll remember this: it is part of the process and, ultimately, it familiarizes you with your system components.
In our next blog post, we’ll discuss the changes we’ve made to our FPR, working with HFS-formatted disk images, and finally entering the production phase of our project!
This post highlights work that is being completed as part of our Legacy Digital Removable Media Project, which has been generously funded by the Leon Levy Foundation. For more about the project, please see this October 2016 post.
It’s a delight to know those tapes are finally getting transferred. My sincerest apologies in advance for any typos I may have put on the labels. Making those labels (and frankly working off of miniDV) was a constant headache for me, so I blame the format!