An experimental search and browse application to find historic newspaper images by visual similarity (retired 2025)
Newspaper Navigator was an experimental web application that used machine learning to enable visual similarity search and browsing of over 1.56 million images extracted from the Chronicling America database of digitized historic newspapers. The experiment was conducted in 2020 by Benjamin Lee as part of the Library of Congress’ Innovator in Residence program, and was the first in-house machine learning application developed at the Library. The Newspaper Navigator application was hosted by the Library of Congress from 2020 to 2025. In addition to the search application, the Newspaper Navigator experiment resulted in a dataset consisting of extracted visual content for 16,358,041 historic newspaper pages in Chronicling America, as well as a machine learning pipeline and other content listed below.
See Newspaper Navigator in Action
Newspaper Navigator's Impact
Newspaper Navigator was awarded Best Digital Humanities Dataset at the 2020 DH Awards, and Best Resource Paper Runner-up at CIKM 2020.
Over its lifespan, Newspaper Navigator saw:
- 113,815 visits
- 1,500 visits monthly
- 235 Stars on GitHub
- 5,216 downloads
Learn More About Newspaper Navigator
Official products of the experiment:
- Newspaper Navigator GitHub repository
- Newspaper Navigator Data Jam recording
- Compounded Mediation: A Data Archaeology of the Newspaper Navigator Dataset
- The Newspaper Navigator Dataset: Extracting And Analyzing Visual Content from 16 Million Historic Newspaper Pages in Chronicling America
- Blog posts on The Signal
Other presentations of Newspaper Navigator by Benjamin Lee:
- Talk at the Association of College and Research Libraries in 2020: "ACRL ULS TULC Newspaper Navigator: Re-imagining Library Search and Discovery with Machine Learning"
- AI4LAM Community Call: "Newspaper Navigator. B Lee"
- Digital History colloquium at Humboldt University of Berlin: "Benjamin Lee (Universität Washington): Newspaper Navigator"
About the Creator
Ben Lee was the 2020 Innovator in Residence at the Library of Congress. At the time, he was a second-year Ph.D. student in the Paul G. Allen School of Computer Science & Engineering at the University of Washington, where he studied human-AI interaction with his advisor, Professor Daniel Weld. Ben graduated from Harvard College in 2017 and has served as the inaugural Digital Humanities Associate Fellow at the United States Holocaust Memorial Museum, as well as a Visiting Fellow in Harvard's History Department. After his time as an Innovator in Residence, Ben became an Assistant Professor in the University of Washington's Information School.