Digitizing Historical Photographs

The Municipal Archives is noted for its vast collection of photographs, currently totaling more than two million images. The photographs date back to the early 1900s, when City agencies first adopted photography to document their work, and extend to the present day. Most are traditional prints and negatives in various sizes and formats. About fifteen years ago, the Archives started digitizing these images and making them available online. This has made our patrons very happy (and left them begging for more!), but it also means that the Archives is now responsible for preserving all the digital surrogates as well as the original images.

Two sides of a trifold brochure created for Open House NY.

Matt Minor, Municipal Archives photographer, writes this week’s blog post, explaining the considerable work that goes into creating, storing, and describing digital images. And he might just provide some good answers for those who ask, “Why don’t we just digitize everything?”

Matt Minor: I joined the Municipal Archives in 2011. One of my earliest assignments was to digitize the Municipal Archives Photograph Collection (MAC). Archival records, including photographs, are usually organized in collections named for the creating office or individual, e.g., the Mayor LaGuardia Photograph Collection. MAC is the exception. These are photographs acquired over several decades from numerous and, in many cases, unknown sources, and assembled by the archivists as an “artificial” collection. They are truly eclectic in subject, and many are quite evocative. Below is mac_1689, a photo of lions in the Bronx Zoo.

mac_1689: The Bronx Zoo: The African Plains exhibit. Six uncaged lions face viewers across an invisible moat.

One of the goals in creating a digital copy is to convey the photographer’s original intention. In this instance, the original negative did not exist, but I had a vintage print to help guide me on what the photographer wanted the viewer to see. I saved the photo as a color file rather than black and white to show the warm tone and give the viewer a sense of its age. I saved it as a JPEG because almost any browser or photo viewer can read JPEGs.

Making a web-ready JPEG in Photoshop

Digitizing colonial records using an 80-megapixel overhead camera system.

But how do photographers like me, in an archival institution, produce high-quality scans? We start with many of the same questions other photographers ask at the beginning of a job. How can I light my subject so that I don’t get shadows where I don’t want them? How should I set my aperture and shutter for the best exposure? Is the subject in focus? Am I getting the resolution I need? And throughout the process, we rely on our eyes, asking each time we take a photo, “Does everything look right?” But this is only how it starts.

The Back End

In today’s world, most people take the storage of their photos for granted. Your iPhone pictures are automatically backed up. But things get complicated if you want people a hundred years from now to find a specific photo, know what it is and where it came from, and be able to look at it on whatever system they’re using.

The archival master file viewed in Photoshop

In the archives world, we create what is called a “preservation master” image. This often appears very different from the picture the viewer sees in the digital gallery. Here are some differences in the archival master above. There’s less contrast overall, even though the original image still has deep shadows. I had to make sure no information was lost, so the highlights and shadows both had to be within the range that the camera’s sensor would capture. I composed the shot so the entire print would fit in the frame, not just the image; that way, a researcher will know this is the whole item. Finally, to the left of the image you’ll see a Golden Thread target, which is used to make sure color, exposure, and focus meet laboratory standards. The target is retained in the picture, so those standards can be checked later. The target also has inches and centimeters marked, for scale.
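To make the idea of keeping highlights and shadows "within range" concrete, here is a minimal Python sketch. It assumes 8-bit values from 0 to 255 and a plain list of pixel values; the function name and the sample numbers are made up for illustration. It is not how Capture One or any capture software does its checks, which are normally shown as a histogram or exposure warning.

```python
from typing import Iterable, Tuple


def clipped_fraction(values: Iterable[int], black: int = 0, white: int = 255) -> Tuple[float, float]:
    """Return the share of pixels stuck at pure black and at pure white.

    A pixel at either extreme carries no recoverable detail, so a
    preservation capture aims to keep both shares as close to zero as possible.
    """
    pixels = list(values)
    if not pixels:
        raise ValueError("no pixel values supplied")
    shadows = sum(1 for v in pixels if v <= black)
    highlights = sum(1 for v in pixels if v >= white)
    return shadows / len(pixels), highlights / len(pixels)


# Made-up sample: one pixel at pure black, two at pure white, five in between.
sample = [0, 10, 128, 255, 255, 200, 50, 90]
shadow_share, highlight_share = clipped_fraction(sample)
print(shadow_share, highlight_share)
```

For this sample, one in eight values is clipped to black and two in eight to white, which would be a sign to adjust the exposure and shoot again.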

Using high-grade color targets allows us to accurately represent color and ensure high resolution and focus. Here the scan is shown in Capture One CH, our main software tool for digital capture.

File naming in action. In this instance, the filenames let us know that the item is from the Municipal Library, its call number, the date it was published, and which page each photo shows.

Then the file needed a name so it would be findable among millions of others. The MAC is small, so naming was easy, but now that we are digitizing many non-photographic items, we’ve had to come up with a stricter set of file naming rules so that everything stays organized and findable.
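As a rough illustration of what a stricter naming rule can look like, here is a small Python sketch. The pattern (collection, call number, year, zero-padded page, joined by underscores) is an assumption modeled loosely on the caption above and on identifiers like mac_1689. It is not the Archives' actual rule set, and the call number and year in the example are invented.

```python
import re


def _clean(part: str) -> str:
    """Lowercase a label and collapse anything that is not a letter or digit into a hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", part.lower()).strip("-")


def build_filename(collection: str, call_number: str, year: int, page: int, ext: str = "tif") -> str:
    """Build a predictable, sortable filename (hypothetical scheme)."""
    return f"{_clean(collection)}_{_clean(call_number)}_{year:04d}_{page:04d}.{ext}"


def parse_filename(name: str) -> dict:
    """Recover the parts from a name produced by build_filename."""
    stem, ext = name.rsplit(".", 1)
    collection, call_number, year, page = stem.split("_")
    return {
        "collection": collection,
        "call_number": call_number,
        "year": int(year),
        "page": int(page),
        "ext": ext,
    }


# Invented example values, for illustration only.
name = build_filename("Municipal Library", "ABC 123.4", 1934, 7)
print(name)
print(parse_filename(name))
```

Zero-padding the page number keeps files in page order when a folder is sorted alphabetically, and reserving the underscore as the only separator means the name can be split back into its parts without guesswork.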

You might also notice some things about the file itself. The web-ready photo at the beginning of this blog is about one megabyte—a good size for social media. The archival master, though, is a whopping 198 megabytes—far too big for Facebook or Instagram. Why would we want such a big file taking up storage space? The answer is simple: we want to do this digitization work only one time. We scan at the highest possible resolution and then create derivative files at a lower resolution, as appropriate for the desired purpose, e.g. social media, duplication for publication, etc.
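The size gap is easier to see with some arithmetic. The Python sketch below estimates the uncompressed size of an image from its pixel dimensions, channel count, and bit depth, and works out the dimensions of a smaller derivative. The pixel dimensions and the 16-bit depth are assumptions chosen for illustration; the post does not state the actual values for the master file.

```python
from typing import Tuple


def uncompressed_size_mb(width_px: int, height_px: int, channels: int = 3, bits_per_channel: int = 16) -> float:
    """Estimate uncompressed image size in megabytes (1 MB = 1,000,000 bytes)."""
    bytes_total = width_px * height_px * channels * (bits_per_channel // 8)
    return bytes_total / 1_000_000


def derivative_dimensions(width_px: int, height_px: int, long_edge_px: int) -> Tuple[int, int]:
    """Scale dimensions so the longer side equals long_edge_px, keeping the aspect ratio."""
    scale = long_edge_px / max(width_px, height_px)
    return round(width_px * scale), round(height_px * scale)


# Hypothetical master: 8000 x 4000 pixels, three channels, 16 bits per channel.
print(uncompressed_size_mb(8000, 4000))

# Hypothetical web derivative: 2000 pixels on the long edge, 8 bits per channel.
web_w, web_h = derivative_dimensions(8000, 4000, 2000)
print(web_w, web_h, uncompressed_size_mb(web_w, web_h, bits_per_channel=8))
```

Under these assumptions the master comes to 192 MB before any compression, while the derivative is only 6 MB even before JPEG compression shrinks it further. Going from the large file to the small one is easy; going the other way is impossible, which is why the capture happens once at full resolution.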

We also have to create the metadata for the image, e.g. information about the original item, letting researchers know who created it, what it shows, which collection it belongs in, copyright status, etc.
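One simple way to picture such a record is as a small structured file that travels with the image. The Python sketch below is a minimal example using mac_1689; the field names and the JSON "sidecar" layout are assumptions for illustration, not the Archives' actual schema, and the creator and rights fields are left empty because the post does not give them.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ImageRecord:
    """Descriptive metadata for one digitized item (hypothetical field names)."""
    identifier: str
    title: str
    description: str
    collection: str
    creator: Optional[str] = None  # left empty when the source is unknown
    rights: Optional[str] = None   # copyright status, to be filled in by an archivist


def to_sidecar_json(record: ImageRecord) -> str:
    """Serialize the record as JSON that can be stored next to the image file."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


record = ImageRecord(
    identifier="mac_1689",
    title="The Bronx Zoo: The African Plains exhibit",
    description="Six uncaged lions face viewers across an invisible moat.",
    collection="Municipal Archives Photograph Collection",
)
print(to_sidecar_json(record))
```

Keeping the identifier in the record identical to the filename stem is what ties the description back to the right image.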

Finally, we have to store the digital file for the long term—preferably on a server that will be automatically backed up to other servers in different locations, so that nothing will get lost or corrupted.
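A common way to confirm that a stored copy has not been corrupted is to compare checksums. The Python sketch below shows the general technique with SHA-256 from the standard library. It is an illustration only, not a description of the Archives' storage system; the site names are hypothetical, and in-memory bytes stand in for real image files.

```python
import hashlib
import io
from typing import BinaryIO, Dict, List


def sha256_of_stream(stream: BinaryIO, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file-like object in chunks so a large master never sits fully in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(expected: str, copy_digests: Dict[str, str]) -> List[str]:
    """Return the names of copies whose checksum differs from the expected one."""
    return sorted(name for name, digest in copy_digests.items() if digest != expected)


# Stand-in bytes instead of real image files, so the sketch runs anywhere.
master = io.BytesIO(b"stand-in for archival master bytes")
good_copy = io.BytesIO(b"stand-in for archival master bytes")
damaged_copy = io.BytesIO(b"stand-in for archival master bytez")

expected = sha256_of_stream(master)
digests = {
    "site_a": sha256_of_stream(good_copy),
    "site_b": sha256_of_stream(damaged_copy),
}
print(find_mismatches(expected, digests))
```

With real files, the same function works on a handle opened with `open(path, "rb")`. Recording the checksum when the master is created and rechecking it on a schedule makes silent damage to any copy visible, here the single changed byte at site_b.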

This brings us back to that question—why don’t we just digitize everything? As you can see, there’s a lot more that goes into archival digitization than what you see on the front end. On the front end a digital photograph is visual. It’s a picture with detail, contrast, color, tonality, sharpness, depth of field, and focus. On the back end, it’s a huge packet of information—it’s three channels, millions of pixels, and a precise value for each. Even a blog post like this one is just a brief glimpse into the world of cultural heritage photography.