Why would we, immersed in the quantification of animal welfare, engage in a discussion that seems more an ivory-tower academic exercise than a practical effort to improve the lives of billions of animals? As it turns out, access to information plays a pivotal role in our ability to gather the data and metrics that directly impact animal welfare.
We are now witnessing a third revolution in information access—the first being the advent of scientific publishing, and the second, the rise of the internet. This current revolution is driven by astonishing advances in Large Language Models (LLMs), AI systems that can process vast amounts of information in seconds—tasks that previously required weeks of painstaking effort, even with the help of the internet. However, for LLMs to perform effectively, they need direct access to scientific studies. This is where the legacy system of paywalled publishing becomes a serious barrier, and its shortcomings deserve revisiting.
The Case Against Paywalls
The paywall system surrounding scientific articles has been a subject of intense criticism (Raman 2021; Mayoni 2022; Yup 2023). It raises a fundamental question: why is publicly funded research—reviewed by scientists who volunteer their time and often edited by unpaid academic professionals—not freely accessible to the global community?
This issue is particularly perplexing given the drastically reduced costs of digital distribution compared to the historical expenses of printing and mailing physical journals. In the digital era, maintaining paywalls serves as an outdated gatekeeping mechanism rather than a practical need.
Critics of open access often point to the financial sustainability of high-quality journals and the role of publishers in maintaining rigorous peer-review processes. These concerns are valid and deserve consideration. Nevertheless, the magnitude of the problems caused by paywalls compels the need for urgent solutions.
A Barrier to Training LLMs
Large Language Models (LLMs), such as OpenAI’s GPT models, are trained on vast datasets containing various types of information, including books, freely available research papers, news articles, blogs, and online forums. However, when high-quality, peer-reviewed scientific literature is locked behind paywalls, it limits the quality and comprehensiveness of the training data available to these models. Instead, LLMs must rely disproportionately on secondary or alternative sources. While these can be valuable, they often lack the rigor, depth, and reliability of peer-reviewed studies.
This reliance on incomplete datasets creates a cascade of challenges. Without direct access to many primary scientific studies, LLMs may lack critical insights into emerging or specialized topics, such as advancements relevant to animal welfare. A strong reliance on secondary sources may introduce bias or misinformation that LLMs inadvertently replicate in their outputs, as blogs and forums often reflect opinion rather than scientific debate. Furthermore, with limited access to cutting-edge research, LLMs are restricted in their ability to support decision-making effectively, whether in policy-making, education, research, or practical interventions to improve animal welfare.
The Need for Action
To address these issues, it is necessary to break down barriers to knowledge and ensure LLMs can rely more heavily on a foundation of reliable, evidence-based data. One approach is to advocate for unrestricted access to scientific literature by AI models. This is not just about improving AI; it is about promoting humanity’s fundamental right to freely access the wealth of knowledge it has collectively produced. Solutions might involve compensating publishers for access, as evidenced by recent precedents. For example, in 2024, Wiley, a prominent scientific publisher, entered into a $23 million contract with a major technology company to grant access to its academic and professional book content for training LLMs. Agreements like these could provide a model for broader implementation. Regardless of the specific mechanism, ensuring that AI responses are grounded in the best available evidence is essential for fostering trust and reliability in these systems.
At the same time, it is imperative to encourage open and free access to animal welfare studies. While broader access to scientific literature remains a work in progress, prioritizing the open publication of studies relevant to pressing global issues, such as animal welfare, is a critical step. Research in fields like veterinary science, neuroscience, pain science, and animal production should be made readily accessible through open-access journals or other freely available platforms. This ensures that such studies are immediately available for AI training and contribute to the broader dissemination of knowledge.
The Broader Implications
The stakes are high. By limiting access to scientific knowledge, we not only stifle innovation and informed decision-making but also risk reinforcing existing inequities in education and research. This limitation creates a ripple effect: AI systems trained on incomplete or biased datasets replicate and perpetuate these gaps, leaving them less capable of addressing the complex challenges facing our world.
In the context of animal welfare, such restrictions can mean missed opportunities to develop better insights and solutions to improve the lives of billions of animals. For those committed to aligning AI with animal welfare values, advocating for the unrestricted availability of scientific literature and ensuring open access to animal welfare research are essential steps to achieving this critical goal.
I understand where you're coming from, but I wonder whether this would also have negative consequences. Perhaps it would increase the pace of AI development. It would make LLMs more useful, which might increase investment in AI even more. And maybe it would also make LLMs generally smarter, which could further accelerate AI progress (this is not my area, I'm just speculating). Some EA folks are protesting to pause AI, and increased progress might not be great. It would help all research, but not all research makes the world better. For example, it could benefit research into more efficient animal farming, which could be bad for animals. Considerations like these would make me too unsure about the sign of the impact to eagerly support such a cause, unfortunately.
Indeed, as with any technology, we must be vigilant about potential negative consequences. For example, back in the day, we were among those who signed a statement against experiments involving the creation of potential pandemic pathogens—a stance that history has since validated, as we now know all too well. However, I do not view large language models (LLMs) in the same light. I believe LLMs will inevitably become a primary source of information for society, and this can be a very positive development. One way to guide this technology toward beneficial outcomes is by feeding it original scientific sources that have already been published.
Regarding the impact of AI on animal welfare, this is, of course, a critically important topic. We wrote a piece on our position on this some time ago but hadn’t published it until now. Motivated by your comment, we plan to do so in the coming days, and I would appreciate your thoughts on it once it’s available.