Speech to Text Viewer

Image of a sound wave. Image courtesy of Jon Gold.

This proof-of-concept tool shows how users might browse online collections from the American Folklife Center, play the audio, and read time-synced, auto-generated transcripts. The selected materials span a wide range of dialects and time periods to test the accuracy of off-the-shelf transcription tools on a variety of cultural heritage recordings.

Published June 2020

Documentation

This is an experiment to test, document, and refine ways to make American Folklife Center collections more accessible. Our initial tests focused on readily available spoken-word collections, which provide a baseline for potentially incorporating speech-to-text services into existing digital processing workflows to enhance search or accessibility.

The test corpus includes samples of migrated audio from the 1940s, as well as contemporary born-digital oral histories created in 2010. You can test the tool yourself, read more about it in this blog post from The Signal, or access the source code at the Library of Congress's GitHub repository.

About the Team Behind the Speech-to-Text Viewer

Chris Adams is a Solutions Architect in the Office of the Chief Information Officer/IT Design & Development Directorate. He has worked at the Library since 2010, supporting public websites, digital preservation efforts, and the crowdsourcing initiative By the People along with Concordia, its open-source transcription and tagging tool.

Julia Kim is a Digital Projects Coordinator at the National Library Service for the Blind and Print Disabled at the Library of Congress. During this experiment, she was the Digital Assets Specialist at the American Folklife Center, supporting digitized and born-digital multi-format content for digital preservation and access workflows. Check out The Signal blog for more posts authored by or mentioning Julia.