Thousands use GW Libraries’ social media archive tool six years after debut


Geneva Henry, the dean of libraries and academic innovation, said more than 3,000 users have employed different search strategies to pore over more than 15 million tweets.

Over the past six years, more than 3,000 users have taken advantage of a tool GW Libraries officials created to help researchers archive and analyze social media posts.

Developed in 2013 by a group of software developers, archivists and librarians, the Social Feed Manager collects data from Twitter, Tumblr, Flickr and Sina Weibo – a Chinese blogging website – and exports and organizes the data for researcher use. The tool has helped researchers obtain hundreds of thousands of social media posts for study, according to the Social Feed Manager website.

Geneva Henry, the dean of libraries and academic innovation, said users can query the archived posts by characteristics like account and keyword. Library officials received a $130,000 grant from the National Historical Publications and Records Commission in 2014 to expand the tool’s functionality from Twitter alone to several websites.

“GW Libraries and Academic Innovation developed and maintains this software project as part of our commitment to preserving the human record,” Henry said in an email.

Since the tool’s inception, more than 300 members of the GW community have gathered more than 1 million tweets through the tool, according to Social Feed Manager’s website. More than 2,600 researchers in total have analyzed 15 million tweets over the same period, the website states.

The 2014 grant funded the project through 2017, according to the feed manager’s website. Henry said the project is now financially supported by the Laura Bush 21st Century Librarian Program.

Library staff have updated the tool several times since its creation, including adding a feature last month that lets researchers search the database for tweets in a specific language, the website states.

Henry said members of GW Libraries’ “scholarly technology team” have taught staff at other academic institutions, like Virginia Tech and Stanford University, how to use the Social Feed Manager for research.

She added that staff members work with researchers to find posts for individual projects and pursue posts on topics of “strong interest,” like the 2016 and 2020 presidential elections. The tool includes an archive of 283 million tweets related to the 2016 presidential election and 88 million tweets so far related to the 2020 presidential election, Henry said.

“Researchers throughout GW, the country and the world rely on SFM to support their research using social media as data,” she said.

She said library staff have also developed ethical and privacy guidelines to inform researchers on their projects.

In the past year, researchers have used the Social Feed Manager to gather data for projects examining how elite voices shape politics and how social media influences politics in Iran.

Henry declined to say what feedback officials have heard about the tool.

Research experts said the tool is a way to provide researchers and historians access to data cataloging millions of reactions and opinions.

Panagiotis Takis Metaxas, a professor of computer science at Wellesley College, said collecting social media data helps researchers identify fake accounts and study the spread of misinformation, which is especially useful ahead of the 2020 presidential election.

He said an early example of online political disinformation came during the 2010 Massachusetts special election to fill the U.S. Senate seat left vacant by the death of Sen. Ted Kennedy, D-Mass., adding that social media data helped him discover what led to the surprise victory of a Republican candidate in a liberal state.

“Because we happened to be collecting social media data, we could find some partial answer about how this might have happened, and we discovered that there were groups of people who were launching automatic social media bots, trying to propagate lies,” Metaxas said.

Metaxas said GW Libraries should present visualizations alongside the information it collects so the data are easily digestible for readers.

“You will see that we have developed already, visualizations that will give people an understanding of what was the dialogue,” Metaxas said. “Sometimes a visualization could be useful for the use of the data that the library is collecting.”

Zachary Brodt, an archivist at the University of Pittsburgh, said researchers can use the data to analyze the development of political and social movements and reactions to those movements.

“Archiving the use of specific hashtags can show us the origin of an idea or movement, track its dissemination throughout a social media platform and allow us to better understand how that movement was received,” Brodt said in an email.

He said time will further validate social media’s value in research because significant social movements like Black Lives Matter and #MeToo arose from social media platforms.

“When websites started to become a source of information for historical research, I think they were looked at skeptically at first, but now it is not uncommon to pick up a book and find websites cited in footnotes or bibliographies,” he said. “I think the same will hold true for social media archives.”
