AI at LC

Three men standing in a room of computers. From left to right: George R. Perreault, head of the Library of Congress Data Processing Office, standing at the computer storage unit; Ernest Acosta Jr., digital computer programmer, working at the card reader unit; and Joseph B. Murphy, digital computer programmer, inserting a new tape in one of the tape units. Jan. 20, 1964. Item 1333, box 69, Photographs, Illustrations & Objects, Library of Congress Archives, Manuscript Division, Library of Congress, Washington, D.C. Read more about the history of computing at the Library in this blog post.

Experimenting with artificial intelligence and machine learning at the Library of Congress.

Since 2018, the Library of Congress has been researching and experimenting with artificial intelligence and machine learning, focusing on ethical uses of these technologies, and addressing the challenges of their adoption in libraries and cultural memory organizations.

Below are resources from expert consultations, explorations with Library staff and users, and experiments demonstrating how automated technology can enhance collections, operations, and services.

LC Labs Artificial Intelligence Planning Framework

LC Labs has been developing a planning framework to support the responsible exploration and potential adoption of AI at the Library. At a high level, the framework includes three planning phases: 1) Understand; 2) Experiment; and 3) Implement. Each phase supports the evaluation of three elements of ML: 1) Data; 2) Models; and 3) People. We’ve developed a set of worksheets, questionnaires, and workshops to engage stakeholders and staff and identify priorities for future AI enhancements and services.

Access the Framework directly on GitHub and read more about it in this post on the Signal Blog.
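
As a rough illustration only, the three phases and three elements named above can be read as a three-by-three planning grid. The short Python sketch below enumerates that grid. The phase and element names come from the description above; the grid layout, the `planning_grid` function, and the empty `notes` field are hypothetical conveniences and are not part of the published Framework.

```python
from itertools import product

# Phase and element names are taken from the framework description above.
# The grid structure and the "notes" field are illustrative assumptions.
PHASES = ("Understand", "Experiment", "Implement")
ELEMENTS = ("Data", "Models", "People")


def planning_grid():
    """Return one planning cell per (phase, element) pair, in phase order."""
    return [
        {"phase": phase, "element": element, "notes": []}
        for phase, element in product(PHASES, ELEMENTS)
    ]


if __name__ == "__main__":
    # Print each cell so a team could see which combinations need attention.
    for cell in planning_grid():
        print(f"{cell['phase']:<10} x {cell['element']}")
```

A team could attach worksheet answers or open questions to each cell's `notes` list as a project moves from one phase to the next.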

Worldwide Engagement

The Library of Congress has been engaging in international, federal, and sector-wide community conversations about AI for years, including participation in AI conferences as part of the International Federation of Library Associations (IFLA), and as a member of the Secretariat of the international AI4LAMs community alongside other National Libraries and major research institutions.

Within the federal community, Library staff have joined several communities of practice, including the Equitable Data Community of Practice, the Congressional AI Advisory Group, and the AI Community of Practice hosted by GSA. This participation includes a subgroup, chaired by a Library staff member, dedicated to developing requirements for vendors to adhere to when using Natural Language Processing (NLP).

Active Artificial Intelligence Use Cases

Use cases for artificial intelligence at varying stages include:

Creating machine-readable text from digitized documents using Optical Character Recognition (OCR) to support search and discovery of collections and content online.
Creating standardized catalog records from eBooks and other digital material - testing different machine learning (ML) models to generate data for bibliographic records, measuring the quality of outcomes, and understanding the use of ML in the cataloging process.
Extracting historic copyright data - experiment to test multiple machine learning (ML) models with a humans-in-the-loop approach in hopes of producing machine-readable data from historic Copyright records.
Parsing legislative data - experiment to test machine learning (ML) models in creating geographic place and subject terms for legislative data, with an emphasis on measuring the quality of outcomes and analyzing the use of ML in the larger legislative data workflow that supports analysts in delivering efficient and accurate services.
The National Library Service for the Blind and Print Disabled is experimenting with available machine learning (ML) models to synthesize and compress lengthy book descriptions into succinct and engaging content for patron discovery.

Experiments to Date

Speech to Text Viewer: proof of concept tool testing off-the-shelf transcription tools
Exploring ML with the Project Aida team: six explorations of how machine learning could be applied to the Library's digital collections
Experimental Access: exploring experimental ways of providing access to the Library's digital collections
Humans in the Loop: an experimental humans-in-the-loop workflow for pairing human decision-making with automated processes
Newspaper Navigator by 2020 Innovator in Residence Ben Lee
Citizen DJ by 2020 Innovator in Residence Brian Foo
America’s Public Bible: Machine-Learning Detection of Biblical Quotations Across LOC Collections via Cloud Computing by CCHC Research Expert Lincoln Mullen
Access & Discovery of Documentary Images by CCHC Research Expert Lauren Tilton
Situating Ourselves in Cultural Heritage: Using Neural Nets to Expand the Reach of Metadata and See Cultural Data on Our Own Terms by CCHC Research Expert Andromeda Yelton
Exploring Computational Description: investigating how machine learning can help with cataloging

Reports and Presentations

The Machine Learning + Libraries Summit: Event Summary includes more detailed information about the Machine Learning + Libraries Summit hosted by LC Labs in September 2019.
Digital Libraries, Intelligent Data Analytics, and Augmented Description: Final Report details exploratory projects conducted by the Project Aida Team at the University of Nebraska-Lincoln in collaboration with LC Labs and addresses social and technical challenges that are critical context for the development of machine learning in the cultural heritage sector.
Machine Learning + Libraries: A Report on the State of the Field - LC Labs commissioned Ryan Cordell, Associate Professor of English at Northeastern University, to write this report on the “state of the field in machine learning and libraries.”
"Feasible, Adaptable and Shared: A Call for a Community Framework for Implementing ML and AI" (2022) by Abigail Potter, Meghan Ferriter, Eileen J. Manchester, and Jaime Mears. Proceedings of the 18th International Conference on Digital Preservation 2022, p. 145
Exploring Computational Description (ECD1) Executive Summary

Artificial Intelligence Governance

Our current policies guide the use of technology to meet the agency’s mission, encouraging the adoption of tools and technology that will improve our ability to meet the information needs of Congress and the American people. As experiments with AI move to additional planning and implementation, policies and governance frameworks will be updated to reflect the particular challenges and opportunities presented by these tools. This includes Library policy reviews and updates and new policy evaluations.

Over the past several years, we have been developing a framework for AI decision making that aligns closely with the AI Risk Management Framework from the National Institute of Standards and Technology (NIST) and with the recommendations of the Office of Management and Budget in Memorandum M-21-06, titled "Guidance for Regulation of Artificial Intelligence Applications," to work towards voluntary sector-based standards, to make data available, and to communicate with the public.