Newspaper Navigator

Search historic newspaper photos!

Explore the visual and textual content within the Chronicling America digitized newspaper collection in new ways using machine learning.

Dataset and Search Application now live!

The Newspaper Navigator Search Application

This experimental web application lets you browse over 1.56 million images extracted from the Chronicling America database of digitized historic newspapers using machine learning. You can also train your own machine learning classifiers to search the images by visual similarity.

The Newspaper Navigator Dataset

The Newspaper Navigator dataset is the outcome of the first phase of Ben Lee's project. It consists of extracted visual content for 16,358,041 historic newspaper pages in Chronicling America.

More about the Newspaper Navigator project

Newspaper Navigator is a project by Ben Lee currently under development during his time as an Innovator-in-Residence at the Library of Congress. The first stage of Newspaper Navigator is to extract content such as photographs, illustrations, cartoons, and news topics from the Chronicling America newspaper scans and corresponding OCR using emerging machine learning techniques. The second stage is to reimagine an exploratory search interface over the collection in order to enable a wide range of people to navigate the collection according to their interests. With Newspaper Navigator, Ben hopes to engage the American public, enable new digital humanities and cultural heritage research, and advance computer science research.
For more information on Newspaper Navigator, consult Ben's repo, which includes code, a whitepaper, and demos, all of which are being regularly updated. You can also check out the archived recording of the Newspaper Navigator Data Jam, an event hosted by LC Labs in Spring 2020 to provide a sneak peek of the millions of images contained in the Newspaper Navigator dataset.

About Ben

Ben Lee is a 2020 Innovator-in-Residence at the Library of Congress, as well as a second-year Ph.D. student in the Paul G. Allen School for Computer Science & Engineering at the University of Washington, where he studies human-AI interaction with his advisor, Professor Daniel Weld. Ben graduated from Harvard College in 2017 and has served as the inaugural Digital Humanities Associate Fellow at the United States Holocaust Memorial Museum, as well as a Visiting Fellow in Harvard's History Department. He is currently a National Science Foundation Graduate Research Fellow.

Do you have ideas for digital humanities or public history projects with Newspaper Navigator?

Reach out to Ben Lee by email at belee@loc.gov
