What 21,000 WhatsApp messages reveal about AI utility in extreme poverty contexts

GiveDirectly

2 min read · Jun 16

Comments 13


Clara Torres Latorre 🔸

What do you mean by "tested"? What outcomes are you measuring / or do you plan to do surveys...?

GiveDirectly

Hi! We're taking a phased approach to our learning on digital innovation. These results are from the first phase of work - small pilots assessing uptake and usability through qualitative research (focus groups) and data from the chatbots. We are now running A/B tests of different prompts to see which ones respond most effectively to recipients. Once we know which prompts perform best, we will move on to A/B testing of cash versus cash plus chatbot to evaluate economic indicators that we believe from the micro-data of past studies are leading indicators of longer-term income.
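For readers curious what an A/B test of prompts can look like in practice, here is a minimal sketch. It is not GiveDirectly's method: the variant names, the hash-based assignment, the choice of a "helpful reply" rate as the outcome, and the function names are all assumptions made for illustration. It assigns each recipient to a prompt variant deterministically and compares two outcome rates with a standard two-proportion z-statistic.

```python
import hashlib
import math


def assign_variant(recipient_id: str, variants=("prompt_a", "prompt_b")) -> str:
    """Deterministically map a recipient to one prompt variant.

    Hashing the (hypothetical) recipient ID keeps each person on the same
    variant across sessions without storing an assignment table.
    """
    digest = hashlib.sha256(recipient_id.encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def two_proportion_z(success_a: int, n_a: int, success_b: int, n_b: int) -> float:
    """Return the z-statistic for the difference in success rates (B minus A).

    "Success" is whatever binary outcome the test tracks, for example a
    reply that the recipient rated as helpful (an assumed metric).
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if standard_error == 0:
        return 0.0  # identical, degenerate rates: no measurable difference
    return (p_b - p_a) / standard_error


if __name__ == "__main__":
    # Illustrative IDs and counts only; these are not data from the pilot.
    print(assign_variant("recipient-001"))
    print(round(two_proportion_z(40, 100, 60, 100), 2))
```

A |z| above roughly 1.96 corresponds to the conventional 5% significance threshold for a two-sided test, though a real evaluation would also need a pre-specified sample size and outcome definition.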

NickLaing

4w*

This is super interesting stuff thanks for posting!

The first thing that jumped out at me was that you are reading and analysing people's messages that come through the chatbot. I'm sure they consented (as much as it is possible to truly consent given the level of education and the cash incentive) and it's all anonymised, but it still seems weird.

I have so many ethical questions about this. I don't think any of them necessarily mean something like this isn't worth trying, but I think they're worth discussing. Here are just a couple off the top of my head:

What do you do if they have a conversation about harming themselves or others? Do you react and do something about it or do you leave it be? Would people then be aware that what they type could elicit some kind of external response?
When they ask something like "What business has quick profits", for which there is obviously no good answer, what does the bot do? I hope it doesn't try to give business advice. When I asked Claude Sonnet, the first answer it gave was:

"Poultry farming is one of the most cited options. It has high demand for eggs and chicken meat, with startup capital of around 1–2.5 million RWF and potential returns in 2–3 months if well managed."

In many rural contexts this might be a decent idea, but without proper disease treatment, housing, protection from theft etc. this advice could be a huge liability.

Also, "Who are you that you answer me?" is pretty haunting. I concur.

I think there's potentially a huge amount to be gained by trying chatbots in these settings, but it's a bit of an ethical minefield and it's a new frontier for sure.

GiveDirectly

Hi NickLaing, you raise excellent points - this does raise ethical issues, and it's something that we have thought a lot about and have put systems in place to address. We do gather consent from recipients when they are initially enrolled, and we ensure that data is anonymised when it is analysed. We have built additional guardrails into the chatbots (we tested exactly the types of questions you raised to check how the chatbot responded and then ensured that any questions that raised a red flag receive responses that link them to our call centre). Our safeguarding teams also monitor messages for anything problematic and address these with the appropriate level of engagement depending on the issues that surface. Our approach to safeguarding and guardrails is evolving as we learn - if you have additional suggestions of things we should be thinking about, we'd welcome them.
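To make the "red flag, then link to the call centre" pattern concrete, here is a minimal sketch. It is an assumption-laden illustration, not GiveDirectly's implementation: the phrase list, the referral text, and the `route_message` and `generate_reply` names are hypothetical, and a real deployment would need Kinyarwanda coverage and something more robust than substring matching.

```python
from typing import Callable, Dict, Union

# Hypothetical English phrases; a real list would be multilingual and
# maintained by a safeguarding team.
RED_FLAG_PHRASES = ("hurt myself", "kill myself", "hurt someone")

# Hypothetical referral text; the actual wording is not given in the thread.
CALL_CENTRE_REPLY = (
    "Thank you for telling us. Please contact our call centre so that "
    "a person can help you."
)


def route_message(
    text: str, generate_reply: Callable[[str], str]
) -> Dict[str, Union[str, bool]]:
    """Return a reply and an escalation flag for one incoming message.

    Flagged messages bypass the chatbot entirely and receive the fixed
    call-centre referral; everything else goes to `generate_reply`, a
    stand-in for whatever model call the chatbot uses.
    """
    lowered = text.lower()
    if any(phrase in lowered for phrase in RED_FLAG_PHRASES):
        return {"reply": CALL_CENTRE_REPLY, "escalate": True}
    return {"reply": generate_reply(text), "escalate": False}


if __name__ == "__main__":
    # A trivial stand-in for the model, for demonstration only.
    echo_model = lambda message: f"(model answer to: {message})"
    print(route_message("What is the price of beans?", echo_model))
    print(route_message("I want to hurt myself", echo_model))
```

Routing flagged messages before the model is called means the referral text does not depend on how the model behaves, and the `escalate` flag gives a human safeguarding team a queue to review.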

huw

4w*

One thing which is unclear to me is, why aren’t these users counterfactually using free commercial offerings? Price is clearly not a barrier, is it just language? And why, then, wouldn’t a frontier lab be well-positioned to capture that market?

GiveDirectly

Good question. Digital literacy is fairly low and many people did not have smartphones until we ran this pilot. Language offerings are limited in free commercial offerings, particularly for voicenotes, which are essential for engagement in this group.

David T

It would be interesting to see a more detailed and systematic report on the activity and findings so far.

In some respects, it seems like a strange thing for GiveDirectly to be piloting. On the one hand, GiveDirectly has expertise in systematic studies of behavioural change in LDCs, and the chatbot possibly also performed programmatic functions in a cost-effective manner. On the other hand, it involves a charity known for its "let local people decide how to use money spent on their behalf, Western aid agencies doing it can be disempowering and often wrong" ethos asking "which parameters should we use to fine tune this [adaptation of a commercial] product we've designed to give them the most suitable answers before scaling up its deployment"... which seems like a very different ethos and approach.^[1]

The conclusions highlighted from the research so far - both that if you give poor Rwandans access to ChatGPT they have a similar range of interactions to other humans^[2] and that responses generated by an LLM with no meaningful local training dataset were often inadequate - seem unsurprising. I am sympathetic to arguments that people make better decisions with access to information, but I am also sympathetic to arguments that a ChatGPT derivative is not the most valuable information Rwandans could receive (and may have minimal or even negative value).

I'm not actually sure what the costs of acquiring relevant local data, training a chatbot to achieve greater fluency in spoken Kinyarwanda dialects, and safeguarding against advice that is very bad in a local context are,^[3] but they seem like a pretty relevant benchmark, since they might actually be considerable on a per-user basis, and the alternative for critical information like "what is the nearest health centre" might be something like signing people up to email lists, or a small number of human agents in Kigali costing surprisingly little.^[4] I guess there's also the "who's paying?" question, especially when the current implementation appears to involve providing training data for one of the world's most valuable companies (and obscure languages may or may not add value to their model).

I feel one relevant benchmark for GiveDirectly specifically might be "what is the estimated cost per person reached to improve it: would locals rather have a better chatbot or the cash?". It's possible the insights they're getting are extremely valuable, particularly in the context of limited or no web access, but it's possible they're not...

^[1] The relevant comparator might be the One Laptop Per Child project. Well intentioned, theory of change centred on the idea that people in LEDCs can be empowered by interacting with modern technology and better information too, but perhaps actual educational benefits didn't really stack up with the costs and the participants would have chosen to have something other than a computer.

^[2] I must admit, I am curious about the extent to which Rwandans engaged in "witty banter" or attempts to manipulate the chatbot into saying something silly...

^[3] I don't know how bad the speaking and dataset is, and whether an adequate "solution" looks like a finetuning prompt with some info or developing a corpus of services data and synthetic idiosyncratic Kinyarwanda to fix the model, but the latter option could be very expensive compared with the people it would actually reach...

^[4] I suspect you get many person years of Rwandan human call centre time for a month or two of a mid-level AI engineer's time...

GiveDirectly

Hi David T., you are right to ask whether people would rather have had the cash instead of a better chatbot. This is why we are starting with small pilots and using these to inform our next steps. So far the feedback has been very positive - people say that they wish they had had the chatbot earlier. Next we want to know if adding a chatbot boosts the impact of cash - if so, then rolling this out at larger scale will involve minimal costs but the benefits could be high. We don't know of other non-profits that are offering unrestricted versions of chatbots at this point (many focus only on agricultural advice) - but we believe this offers the kind of information access that should be available to people and allows people to make decisions based on their own needs.

NickLaing

And one quick query from the main article

"Shelton, Constance (CJ), and Latifah in Latifah’s farm where she now grows Irish potatoes and cabbage. She invested $300 in her farm, and has been able to increase her profits from $5/week to $30/week. She plans to expand into mushroom farming next."

Profits (not gross sales) of $30 a week from subsistence farming in rural Rwanda seem close to impossible. I imagine this is what she told you? Perhaps this was just during harvest season or something, in which case it would make more sense.

Perhaps not completely impossible though if she's unlocked a particular market!

GiveDirectly

Hi, yes this would be from self-reported data. We aren't currently collecting detailed M&E data on economic indicators during this phase of work.

Alex N.

I suggest you look at the work being done on multi-user agents and collective memory. Not as a replacement for individual AI access, not because the poor should somehow have their identities absorbed by the collective, but because multi-user agents could address exactly the local knowledge gap you are describing. Through sustained group interactions and self-learning the agent should be able to accumulate local knowledge and increase its usefulness (local markets, services, and conditions).

I have not been able to find much active discussion or experimentation on multi-user agents and collective memory in the ICT4D context; the focus is more on enterprise applications. That work is still relevant, though, and maybe its development application is exactly where GiveDirectly will be well positioned to lead.

A basic implementation stack could be an openclaw- or hermes-style agent connected to a local WhatsApp or Telegram group; both have the functionality. WhatsApp and Telegram are already popular and widespread, so the technology can meet people where they already are.

GiveDirectly

Thanks Alex N.! We'll have a look into this.

AndrewBredenkamp

I've been involved in building AI in low-resource settings for many years. Happy to have a conversation.

