
If you want to explore the final product of this project, please visit: https://pinktaty.github.io/EpidemiologicLibraryWeb/

This project was carried out as part of the “Carreras con Impacto” program during the 14-week mentorship phase. You can find more information about the program in this entry.

 

Summary
A website called "Epidemiologic Library" was developed to gather information on infectious diseases in Mexico over the past 25 years. The main objective is to facilitate the analysis of epidemiological data to identify trends and, based on them, to create risk management plans for future health crises.

 

Problem Description
Pandemics are considered a Global Catastrophic Risk due to their potential to cause devastating consequences for society, the economy, and, most critically, public health, with the loss of life being the most severe impact. A recent and significant example of this impact is the COVID-19 pandemic, which, by 2021, had resulted in more than 2 million deaths worldwide. Moreover, in the past decade, the frequency of new epidemiological outbreaks has increased due to factors such as environmental changes and technological capabilities, suggesting that the likelihood of future pandemics could rise in the coming years.

To manage future pandemics effectively, it is crucial to analyze data from recent decades to identify potential trends and prepare adequately for future health crises. However, although this information is vital, there are currently no adequate structures for collecting it. The available data are often in formats that do not facilitate large-scale analysis, such as PDF reports rather than structured databases. This significantly slows down the analysis process and, consequently, the ability to respond in real time or create risk management plans.

For this reason, the "Epidemiologic Library" project aims to use epidemiological data from Mexico as a pilot, collecting the available information from the past 25 years and making it accessible in formats suitable for databases. This will save time and improve the efficiency of analysis by leveraging computational techniques such as web scraping.


Project Description

The project addresses the area of biological risks by recording responses to diseases in Mexico over the last 25 years. Organizing the data makes it easier to analyze and to identify patterns. By understanding how diseases emerge and spread, we can strengthen our ability to anticipate and address the challenges they pose, and so prepare a more effective and adaptive response to future public health emergencies.

Mexico was chosen for its considerable population and diverse epidemiological history. The collected data will support future analysis of events, helping not only Mexico's response to these risks but also that of other countries: comparing Mexico's response with their own will help them adapt strategies to their particular cases. In addition, the collected data will be accessible through an interactive web page aimed at biosafety professionals, so that they can use it for their research, data analysis, and decision-making.

 

Information Sources

All the information was collected from PDF documents published by the Undersecretary of Prevention and Health Promotion of Mexico and the General Directorate of Epidemiology of Mexico.

 

Motivation

Since the issuing organizations publish epidemiological information in PDF format, this project addresses the need for a presentation that is conducive to effective data analysis. Researchers can then generate strategies and knowledge faster, spending their time on the main goals of their projects instead of on collecting data.

Project goals

  • Create a solid data source, in the form of a website, on epidemic management in Mexico over the last 25 years for future analyses.
  • Increase the availability and transparency of epidemiological information in Mexico.

Personal goals

  • Demonstrate my problem-solving ability through strategic and logical thinking as well as programming.
  • Demonstrate my ability to plan and develop a project within a limited time.

 

Methodology

Following the proposed theory of change, an initial survey was conducted with biosafety professionals to identify their information needs and preferences for features and design of the web page. Based on the survey results, a sketch of the final product was created.

After this, Carreras con Impacto mentored me in selecting the diseases with the most epidemiological value and gathering information about them. For the PDF scraping, Python's pdfplumber library was used, along with the ChatGPT API to process and interpret the scraped information, and the Google Sheets API to collect and organize the data. The code can be accessed through this link: https://github.com/pinktaty/EpidemiologicLibrary
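As a rough illustration of the PDF scraping step, the sketch below shows one way tables could be pulled from a PDF with pdfplumber and turned into structured rows. It is not the code from the repository: the function names (extract_tables, clean_cell, table_to_records, records_to_csv) are hypothetical, it assumes the first row of each table is a header, and it leaves out the ChatGPT API and Google Sheets API steps entirely.

```python
import csv
import io
from typing import Dict, Iterator, List, Optional

Table = List[List[Optional[str]]]


def extract_tables(pdf_path: str) -> Iterator[Table]:
    """Yield every table pdfplumber detects in the PDF, page by page."""
    import pdfplumber  # third-party dependency: pip install pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            for table in page.extract_tables():
                yield table


def clean_cell(cell: Optional[str]) -> str:
    """Collapse line breaks and repeated spaces; empty cells become ''."""
    return " ".join((cell or "").split())


def table_to_records(table: Table) -> List[Dict[str, str]]:
    """Treat the first row as the header (an assumption) and map each
    remaining non-empty row to a dict keyed by column name."""
    header = [clean_cell(c) for c in table[0]]
    records = []
    for row in table[1:]:
        cells = [clean_cell(c) for c in row]
        if not any(cells):  # skip rows that are entirely empty
            continue
        records.append(dict(zip(header, cells)))
    return records


def records_to_csv(records: List[Dict[str, str]]) -> str:
    """Serialize the records as CSV text, ready to load into a spreadsheet."""
    if not records:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Placeholder table in the shape pdfplumber returns (values are made up).
sample_table = [
    ["Disease", "Year", "Cases"],
    ["Example\ndisease", "2019", " 12 "],
    [None, None, None],
]
sample_records = table_to_records(sample_table)
sample_csv = records_to_csv(sample_records)
```

For a real document, each table yielded by extract_tables("report.pdf") (a placeholder path) would be passed through table_to_records in the same way. Tables in real reports often have merged or multi-line headers, so the header assumption would need to be checked per document.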

 

Results

The final product is the Epidemiologic Library website, accessible through this link: https://pinktaty.github.io/EpidemiologicLibraryWeb/. The website serves as the repository for the information collected during the project, provided in XLSM format.

This project was concluded successfully, providing valuable insights into the importance of well-structured information and how seemingly simple details can significantly impact the time and resources required for an investigation or project. Through this experience, I developed skills in data scraping, information organization, and user-centered design.

Limitations

Since this project was executed by a single person, its scope was limited to what could be accomplished within three months. Additionally, tracking the origin of information presented outside of government platforms proved challenging, as the links often led to non-existent pages despite supposedly originating from government sources. It is essential to ensure transparency and reliable access to information regardless of when it was published: there is a noticeable trend that the older the information is, the harder it is to find a working link to it on government platforms.

Perspectives

The project still has potential to grow further: it can be expanded with information on more diseases and kept up-to-date with upcoming outbreaks. Additionally, interactive graphs could be created and more features added, but this requires additional work beyond the initially established development timeframe.
