By the People Data Sets

Branch Rickey and son look at list of minor league teams on blackboard

Download and use transcriptions of LC digital collections by crowdsourcing volunteers.

Data Sets

Transcription data created from completed By the People campaigns is available here in bulk as zipped .csv files. The text in this data set was created by volunteers and can be used in many different ways. All contributions to the By the People application are released into the public domain as they are created. Anyone is free to use and reuse this data set in any way they want.

  • Branch Rickey Scouting Reports Download - The data available here is from the Branch Rickey scouting reports, part of legendary baseball scout Rickey’s Baseball Files, which are available in his archival papers. In a career spanning nearly 60 years, Rickey was a player, manager, executive, and part-owner. He is perhaps best known as the man responsible for bringing Jackie Robinson into major league baseball in 1947, breaking baseball’s long-established color barrier. The Branch Rickey data includes transcriptions of 1,926 pages of scouting reports. The zip file includes .csv files of the original “raw” data as well as versions of the data sorted and augmented by the LC Labs team. You’ll also find a README file that explains the data creation and field names. We wrote about this data release and some initial explorations in this Library of Congress blog post.
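
Because each data set is distributed as a zip archive that holds several .csv files alongside a README, one way to start exploring it is to read every .csv member directly from the archive. The following is a minimal sketch in Python using only the standard library. The function name `read_csvs_from_zip` is hypothetical, and the text encoding is an assumption; the README in the download is the authoritative description of the actual file and field names.

```python
import csv
import io
import zipfile


def read_csvs_from_zip(zip_source):
    """Return {member_name: list of row dicts} for every .csv file in a zip archive.

    zip_source may be a file path or a binary file-like object.
    """
    tables = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip the README and any other non-CSV members
            with archive.open(name) as raw:
                # Assumption: the files are UTF-8; "utf-8-sig" also tolerates
                # a leading byte-order mark if one is present.
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                # DictReader takes the column names from the first row,
                # so no field names need to be hard-coded here.
                tables[name] = list(csv.DictReader(text))
    return tables
```

Calling `read_csvs_from_zip` with the path to the downloaded archive returns one list of rows per .csv file, keyed by the file's name inside the archive, so the "raw" and augmented versions can be compared side by side.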

About By the People

By the People invites students and lifelong learners to contribute to the Library of Congress as virtual volunteers -- transcribing, reviewing, and tagging historical texts to improve search and accessibility of Library of Congress digital collections.

By the People is an instance of Concordia, an open source platform developed by the Library of Congress. Development is ongoing and iterative, and it leverages user-centered design following the key principles of trust and approachability.

We need you

We are interested in hearing from users of these data sets – what you did with them, feedback on the documentation, what other formats or transcribed collections might be of interest, as well as any other feedback or comments you have for us. Please write to us via our Contact Us form - we'd love to hear from you!
