Get Started
- Accessing images for image analysis on Loc.gov
- Using the Loc.gov JSON to grab WWI Sheet Music
- Cats or dogs? An example of exploring the Chronicling America API
- Search the Library of Congress from your browser in one step
- Change image URLs to rotate and resize images on loc.gov
- Extracting location data from the loc.gov API for geovisualization with the Historic American Engineering Record
- Exploring the Meme Generator Metadata demonstrates some basic things that can be done with the data set from memegenerator.
- Exploring the GIPHY.com Metadata demonstrates an intermediate approach to exploring the GIPHY.com data set produced by the Library of Congress.
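As a rough illustration of the "Change image URLs" idea above, the sketch below rewrites the trailing segments of an image URL. It assumes the URL follows the IIIF Image API pattern (`.../{identifier}/{region}/{size}/{rotation}/{quality}.{format}`); the helper name `iiif_variant` and the identifier in the sample URL are hypothetical placeholders, not items taken from the tutorials listed here.

```python
from urllib.parse import urlsplit, urlunsplit


def iiif_variant(url, region="full", size="full", rotation=0,
                 quality="default", fmt="jpg"):
    """Return a copy of an IIIF-style image URL with new image parameters.

    Assumption: the last four path segments of `url` are
    region/size/rotation/quality.format, as in the IIIF Image API.
    """
    parts = urlsplit(url)
    segments = parts.path.split("/")
    if len(segments) < 5:
        raise ValueError("URL does not look like an IIIF image request")
    # Keep everything up to and including the identifier, then swap in
    # the requested region, size, rotation, and quality.format.
    base = segments[:-4]
    new_path = "/".join(base + [region, size, str(rotation), f"{quality}.{fmt}"])
    return urlunsplit((parts.scheme, parts.netloc, new_path,
                       parts.query, parts.fragment))


# Hypothetical sample URL: the identifier "service:example:item" is a placeholder.
sample = ("https://tile.loc.gov/image-services/iiif/"
          "service:example:item/full/pct:50/0/default.jpg")

# Ask for a quarter-size image rotated 90 degrees.
rotated = iiif_variant(sample, size="pct:25", rotation=90)
print(rotated)
```

Only the URL string is changed here; no request is made, so the sketch can be tried without network access.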
APIs
- Loc.gov JSON API - provides data about Library of Congress digital collections. The API is a work in progress and subject to change.
- Congress.gov API - provides a method for Congress and the public to view, retrieve, and re-use machine-readable data from collections available on Congress.gov, such as bills, amendments, and committee reports.
- Chronicling America APIs - over 12 million (and growing) digitized historic newspaper pages from almost every U.S. state and territory.
- American Archive of Public Broadcasting APIs - tens of thousands of historic public radio and television programs are available for streaming, and more content is added periodically. The website also provides data records for approximately 2.5 million items inventoried by public broadcasting stations for this project. Scholars may request access to JSON and text transcripts for items in the AAPB's Online Reading Room through the AAPB Transcripts Research Access service. Credentials are required to access the transcripts API and can be obtained by contacting aapb_notifications@wgbh.org. Contact AAPB for more information about accessing the collection for digital humanities or research projects.
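For the Loc.gov JSON API listed above, a minimal sketch of building a request URL might look like the following. It assumes the commonly documented query parameters `fo=json` (JSON output), `q` (search terms), and `c` (results per page); the helper names `loc_json_url` and `fetch_json` are hypothetical, and because the API is described as a work in progress, the response layout should be checked before relying on any particular key.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def loc_json_url(path, **params):
    """Build a loc.gov URL that asks for JSON instead of HTML.

    Assumption: appending fo=json to a loc.gov URL returns the JSON
    representation of that page.
    """
    query = {"fo": "json", **params}
    return f"https://www.loc.gov/{path.strip('/')}/?{urlencode(query)}"


def fetch_json(url, timeout=30):
    """Fetch a URL and decode the body as JSON (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Build (but do not send) a search request for 25 results.
search_url = loc_json_url("search", q="sheet music", c=25)
print(search_url)

# To actually run the query, uncomment the lines below.
# The "results" key is an assumption about the response layout.
# data = fetch_json(search_url)
# for item in data.get("results", []):
#     print(item.get("title"))
```

Keeping URL construction separate from the network call makes it easy to inspect a request before sending it, which helps when sending many queries in a loop.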
Get Data
- Bulk data for Congress.gov bills, bill status, and bill summaries
- MARC records - bibliographic information for most of the Library’s collections. 25 million records are available for exploration in UTF-8, MARC8, and XML formats.
- Sample MARC data set and ReadMe file
- Chronicling America Bulk OCR Data - text only
- Chronicling America Bulk Data - image, metadata, and OCR text batches
- Selected Datasets collection on loc.gov - datasets acquired by the Library for the permanent collection
- Web Archive Datasets - derivative datasets from the Library's web archives
- Computing Cultural Heritage in the Cloud Data Sandbox - The CCHC grant team devised data.labs.loc.gov as an experimental sandbox for sharing data packages compiled as part of the initiative.
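Since the MARC records above are offered in XML among other formats, here is a small sketch of reading titles from MARCXML using only the Python standard library. It assumes the file uses the standard MARCXML namespace and that the title sits in field 245, subfield `a`; the sample record is made up for illustration and is not taken from the Library's data, and the helper name `marc_titles` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard MARCXML namespace.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# A tiny made-up record, for illustration only.
SAMPLE_MARCXML = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <controlfield tag="001">example-0001</controlfield>
    <datafield tag="245" ind1="1" ind2="0">
      <subfield code="a">An example title /</subfield>
      <subfield code="c">by an example author.</subfield>
    </datafield>
  </record>
</collection>"""


def marc_titles(xml_text):
    """Return the 245$a (title) value of every record in a MARCXML string."""
    root = ET.fromstring(xml_text)
    titles = []
    for record in root.iter("{http://www.loc.gov/MARC21/slim}record"):
        subfield = record.find(
            "marc:datafield[@tag='245']/marc:subfield[@code='a']", MARC_NS
        )
        if subfield is not None and subfield.text:
            titles.append(subfield.text.strip())
    return titles


print(marc_titles(SAMPLE_MARCXML))
```

For the full multi-million-record files, a streaming parser such as `xml.etree.ElementTree.iterparse` would likely be a better fit than loading everything into memory at once.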