The State of Digital Media Data Research, 2024



This report reflects on the state of digital media data research in 2024. It is the second in a series of reports on the state of digital media research, the first of which we published in 2023. Here we reflect on how the field has changed since that first report. Specifically, we highlight the following trends:

1. From 2023 to 2024, access to digital media data changed drastically. Researchers were largely priced out of the Twitter API, and Pushshift, a commonly used archive for Reddit data, went private to comply with Reddit’s API policies. Meta also announced the imminent sunsetting of CrowdTangle, a transparency tool popular amongst researchers and journalists alike. At the same time, however, many platforms announced academic programs for data access, including the YouTube researcher program, TikTok’s Research API, and the Meta Content Library.

2. Federated social media platforms became more popular. Following Elon Musk’s purchase of Twitter, Twitter users flocked to Mastodon, Threads, Bluesky, and other federated (or soon-to-be federated) platforms. This shift presents unique challenges for researchers studying digital media data: as new platforms are created, researchers must build new tools to analyze them or wait for third parties or the platforms themselves to make data available.

3. The explosion of generative AI may change how we study digital media. First, researchers using computational methods to measure social media content have turned to OpenAI’s ChatGPT and other large language models (LLMs) to classify content. Second, researchers and civil society groups are increasingly concerned that generative AI could flood the information environment with fake content.

4. In February 2024, the EU Digital Services Act (DSA) went into effect, mandating that large platforms give researchers near real-time access to public data. We don’t yet know how these policies will affect data access in the United States, and it remains unclear what this data access will look like in practice. In the United States itself, legislative efforts to mandate researcher access stalled.

While the last year brought many welcome and unwelcome changes to digital media data research, the findings in this report renew our recommendation that digital media data research be guided by collaboration, transparency, preparation, and consistency.
