Librarians look to store social media data for future research

GW librarians are expanding a tool they created to archive posts from across social media platforms after initial success in exporting tweets.

A team of librarians created a social feed manager to collect and store large amounts of Twitter data for researchers to use. A $130,000 grant from the National Historical Publications and Records Commission will allow them to expand the application's scope from Twitter to other social media sites.

The application pulls data that faculty can use to study, for example, how congressional candidates use Twitter and how potential voters respond. It can collect up to 3,200 tweets at a time and export them to accessible, searchable Word documents, creating new ways to track social media trends.

The grant will allow the team to archive information from Tumblr and YouTube in special collections for the first time.

“With this project and the service we are providing with it, we are working to empower researchers on campus – including students – to move more quickly from a research idea to a full investigation,” said Daniel Chudnov, Gelman Library’s director of scholarly technology.

Previously, researchers had to sift through thousands of tweets by hand, copying and pasting each one into a Word document to use the information in research projects. The sheer volume of data, limited storage and a lack of coding experience presented challenges to researchers looking to use Twitter, Chudnov said.

“We are working to alleviate the information drudgery of schlepping data around, just like libraries have always done, so researchers on campus can jump more quickly into applying the tools and methods of their disciplines and asking a wide range of questions,” Chudnov said.

The team will hire two additional part-time employees: a programmer and an archive policy specialist. It will also create an advisory board composed of technical experts, librarians and archivists from a number of institutions.

Aside from facilitating research, the program also focuses on preserving current social media for the future. As new networking platforms become popular, content on older ones, such as MySpace, is lost as people delete their profiles and related information.

“If we don’t start collecting this media with this long-term focus now, there’s a good chance it’ll be lost to us. That includes documenting life at GW as it appears in social media for future historians,” Chudnov said.
