From Investigation to Implementation

Building a Program for the Large-Scale Digitization of Manuscripts

Reprocessing

Reprocessing a collection can be time-consuming and for this reason will generally be omitted from a large-scale digitization program of SHC collections. In some cases, however, reprocessing will be necessary because the current state of the collection would make it difficult to use once digitized. For the Thomas E. Watson Papers Digital Collection, we decided to reprocess the archival collection because we wanted our pilot large-scale digitization project to be as straightforward as possible, and a high volume of additions over the years had left the collection confusingly organized. The grant-funded nature of this project allowed for this level of attention, but in future large-scale digitization efforts reprocessing will be avoided.

Statistics

It took the full-time project manager and half-time graduate student assistant four weeks to reprocess the 27.5-linear-foot collection. The work included integrating many additions, rearranging the series in the collection to better follow current archival standards, and updating the collection finding aid to reflect these changes. At roughly 200 total staff hours, reprocessing proceeded at a rate of about 7.3 hours (roughly 436 minutes) per linear foot of archival materials.

Total Linear Feet    Total Time    Time Per Linear Foot
27.5                 ~200 hrs      ~7.3 hrs (about 436 minutes)
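
The per-foot rate is total staff time divided by the extent of the collection. The sketch below works through that arithmetic with the totals reported in the table (about 200 hours and 27.5 linear feet). The function names are illustrative assumptions and are not part of any project tooling.

```python
def minutes_per_linear_foot(total_hours: float, linear_feet: float) -> float:
    """Return the reprocessing rate in minutes per linear foot."""
    if linear_feet <= 0:
        raise ValueError("linear_feet must be positive")
    return total_hours * 60 / linear_feet


def estimate_hours(linear_feet: float, rate_minutes_per_foot: float) -> float:
    """Estimate total staff hours for a collection at a given rate."""
    return linear_feet * rate_minutes_per_foot / 60


# Totals reported for the Watson Papers reprocessing (approximate).
rate = minutes_per_linear_foot(total_hours=200, linear_feet=27.5)
print(f"{rate:.0f} minutes (~{rate / 60:.1f} hours) per linear foot")
# prints "436 minutes (~7.3 hours) per linear foot"
```

Applied to a hypothetical 10-linear-foot collection, `estimate_hours(10, rate)` comes to about 73 staff hours. This assumes the collection needs a comparable level of rearrangement, which may not hold for collections in better shape.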

Reprocessing for Digitization and More Product, Less Process: At Odds?

One potential area of conflict lies in reconciling the need to digitize archival collections quickly, and with little attention from archivists, with the current trend in archival processing toward "light processing." A volume of materials that seems manageable with aggregate description in its original physical form (a box of receipts, for example) could result in hundreds or even thousands of scanned images for a researcher to wade through. When considering archival collections for digitization, ease of use of the digital collection for the end user will need to be weighed against the time and resources necessary for reprocessing.