(COI note: I work at OpenAI. These are my personal views, though.)
My quick take on the "AI pause debate", framed in terms of two scenarios for how the AI safety community might evolve over the coming years:
I think it would be helpful for you to mention and highlight your conflict-of-interest here.
I remember becoming much more positive about ads after starting work at Google. After I left, I slowly became more cynical about them again, and now I'm back down to ~2018 levels.
EDIT: I don't think this comment should get more than say 10-20 karma. I think it was a quick suggestion/correction that Richard ended up following, not too insightful or useful.
I appreciate you drawing attention to the downside risks of public advocacy, and I broadly agree that they exist, but I also think the (admittedly) exaggerated framings here are doing a lot of work (basically just intuition pumping, for better or worse). The argument would be just as strong in the opposite direction if we swap the valence and optimism/pessimism of the passages: what if, in scenario one, the AI safety community continues making incremental progress on specific topics in interpretability and scalable oversight but achieves too little too slowly and fails to avert the risk of unforeseen emergent capabilities in large models driven by race dynamics, or even worse, accelerates those dynamics by drawing more talent to capabilities work? Whereas in scenario two, what if the AI safety movement becomes similar to the environmental movement by using public advocacy to build coalitions among diverse interest groups, becoming a major focus of national legislation and international cooperation, moving hundreds of billions of $ into clean tech research, etc.
Don't get me wrong — there's a place for intuition pumps like this, and I use them often. But I also think that both techni... (read more)
Yepp, I agree that I am doing an intuition pump to convey my point. I think this is a reasonable approach to take because I actually think there's much more disagreement on vibes and culture than there is on substance (I too would like AI development to go more slowly). E.g. AI safety researchers paying for ChatGPT obviously brings in a negligible amount of money for OpenAI, and so when people think about that stuff the actual cognitive process is more like "what will my purchase signal and how will it influence norms?" But that's precisely the sort of thing that has an effect on AI safety culture independent of whether people agree or disagree on specific policies—can you imagine hacker culture developing amongst people who were boycotting computers? Hence why my takeaway at the end of the post is not "stop advocating for pauses" but rather "please consider how to have positive effects on community culture and epistemics, which might not happen by default".
I would be keen to hear more fleshed-out versions of the passages with the valences swapped! I like the one you've done; although I'd note that you're focusing on the outcomes achieved by those groups, whereas I'm focusing also ... (read more)
This kind of reads as saying that 1 would be good because it's fun (it's also kind of your job, right?) and 2 would be bad because it's depressing.
I don't think this is a coincidence—in general I think it's much easier for people to do great research and actually figure stuff out when they're viscerally interested in the problems they're tackling, and excited about the process of doing that work.
Like, all else equal, work being fun and invigorating is obviously a good thing? I'm open to people arguing that the benefits of creating a depressing environment are greater (even if just in the form of vignettes like I did above), e.g. because it spurs people to do better policy work. But falling into unsustainable depressing environments which cause harmful side effects seems like a common trap, so I'm pretty cautious about it.
"hesitate to pay for ChatGPT because it feels like they're contributing to the problem"
Yep that's me right now and I would hardly call myself a Luddite (maybe I am tho?)
Can you explain why you frame this as an obviously bad thing to do? Refusing to help fund the most cutting-edge AI company, which has been credited by multiple people with spurring on the AI race and attracting billions of dollars to AI capabilities, seems not-unreasonable at the very least, even if that approach does happen to be wrong.
Sure, there are decent arguments against not paying for ChatGPT, like the LLM not being dangerous in and of itself, and the small amount of money we pay not making a significant difference, but it doesn't seem to be prima-facie-obviously-net-bad Luddite behavior, which is what you seem to paint it as in the post.
Obviously if individual people want to use or not use a given product, that's their business. I'm calling it out not as a criticism of individuals, but in the context of setting the broader AI safety culture, for two broad reasons:
"show integrity with their lifestyles" is a nicer way of saying "virtue signalling",
I would describe it more as a spectrum. On the more pure "virtue signaling" end, you might choose one relatively unimportant thing, like signing a petition, then blast it all over the internet while not taking other, more important actions for the cause.
Whereas on the other end of the spectrum, "showing integrity with lifestyle" to me means something like making a range of lifestyle choices which might make only a small difference to your cause, while making you feel like you are doing what you can on a personal level. You might not talk about these very much at all.
Obviously there are a lot of blurry lines in between.
Maybe my friends are different from yours, but climate activists I know often don't fly, don't drive, and don't eat meat. And they don't talk about it much or "signal" this either. But when they are asked about it, they explain why. This means that when they get challenged in the public sphere, both neutral people and their detractors lack personal ammunition to cast aspersions on their arguments, so their position becomes more convincing.
I don't call that virtue signaling, but I suppose it's partly semantics.
One exchange that makes me feel particularly worried about Scenario 2 is this one here, which focuses on the concern that there's:
No rigorous basis for that the use of mechanistic interpretability would "open up possibilities" to long-term safety. And plenty of possibilities for corporate marketers – to chime in on mechint's hypothetical big breakthroughs. In practice, we may help AI labs again – accidentally – to safety-wash their AI products.
I would like to point to this as a central example of the type of thing I'm worried about in scenario 2: the sort of doom spiral where people end up actively opposed to the most productive lines of research we have, because they're conceiving of the problem as being arbitrarily hard. This feels very reminiscent of the environmentalists who oppose carbon capture or nuclear energy because it might make people feel better without solving the "real problem".
It looks like, on net, people disagree with my take in the original post. So I'd like to ask the people who disagree: do you have reasons to think that the sort of position I've quoted here won't become much more common as AI safety becomes much more activism-focused? Or do you think it would be good if it did?
history is full of cases where people dramatically underestimated the growth of scientific knowledge, and its ability to solve big problems.
There are 2 concurrent research programs, and if one program (capability) completes before the other one (alignment), we all die, but the capability program is an easier technical problem than the alignment program. Do you disagree with that framing? If not, then how does "research might proceed faster than we expect" give you hope rather than dread?
Also, I'm guessing you would oppose a worldwide ban starting today on all "experimental" AI research (i.e., all use of computing resources to run AIs) till the scholars of the world settle on how to keep an AI aligned through the transition to superintelligence. That's my guess, but please confirm. In your answer, please imagine that the ban is feasible and in fact can be effective ("leak-proof"?) enough to give the AI theorists all the time they need to settle on a plan, even if that takes many decades. In other words, please indulge me this hypothetical question, because I suspect it is a crux.
"Settled" here means that a majority of non-senile scholars / researchers who've worked full-time on th... (read more)
There are 2 concurrent research programs, and if one program (capability) completes before the other one (alignment), we all die, but the capability program is an easier technical problem than the alignment program. Do you disagree with that framing?
Yepp, I disagree on a bunch of counts.
a) I dislike the phrase "we all die": nobody has justifiable confidence high enough to make that claim. Even if ASI is misaligned enough to seize power, there's a pretty wide range of options for the future of humans, including some really good ones (just like there's a pretty wide range of options for the future of gorillas, if humans remain in charge).
b) Same for "the capability program is an easier technical problem than the alignment program". You don't know that; nobody knows that; Lord Kelvin/Einstein/Ehrlich/etc would all have said "X is an easier technical problem than flight/nuclear energy/feeding the world/etc" for a wide range of X, a few years before each of those actually happened.
c) The distinction between capabilities and alignment is a useful concept when choosing research on an individual level; but it's far from robust enough to be a good organizing principle on a societal level. Th... (read more)
AI Pause generally means a global, indefinite pause on frontier development. I'm not talking about a unilateral pause and I don't think any country would consider that feasible.
It currently seems likely to me that we're going to look back on the EA promotion of bednets as a major distraction from focusing on scientific and technological work against malaria, such as malaria vaccines and gene drives.
I don't know very much about the details of either. But it seems important to highlight how even very thoughtful people trying very hard to address a serious problem still almost always dramatically underrate the scale of technological progress.
I feel somewhat mournful about our failure on this front; and concerned about whether the same is happening in other areas, like animal welfare, climate change, and AI risk. (I may also be missing a bunch of context on what actually happened, though—please fill me in if so.)
I understand the sentiment, but there's a lot here I disagree with. I'll discuss mainly one.
In the case of global health, I disagree that "thoughtful people trying very hard to address a serious problem still almost always dramatically underrate the scale of technological progress."
This doesn't fit with the history of malaria and other infectious diseases, where the opposite has happened: optimism about technological progress has often exceeded reality.
About 60 years ago, humanity was optimistic about eradicating malaria through technological progress. We had used (non-political) swamp draining and DDT spraying to massively reduce the global burden of malaria, wiping it out from countries like the USA and India. If you had run a prediction market in 1970, many malaria experts would have predicted we would have eradicated malaria by now, including potentially with vaccines; in fact, it was a vibrant topic of conversation at the time, with many in the 60s believing a malaria vaccine would arrive before now.
Again in 1979 after smallpox was eradicated, if you asked global health people how many human diseases we would eradicate by 2023, I'm sure the answer would have been higher th... (read more)
I don't think it makes sense to think of EA as a monolith which both promoted bednets and is enthusiastic about engaging with the kind of reasoning you're advocating here. My oversimplified model of the situation is more like:
(I think the EAs in the latter category have their own failure modes and wouldn't obviously have gotten the malaria thing right (assuming you're right that a mistake was made) if they had really tried to get it right, tbc.)
Thanks a lot, that makes sense. This comment no longer stands after the edits, so I have retracted it. Really appreciate the clarification!
(I'm not sure it's intentional, but this comes across as patronizing to global health folks. Saying folks "don't want to do this kind of thinking" is both harsh and wrong. It seems like you suggest that "more thinking" automatically leads people down the path of "more important" things than global health, which is absurd.
Plenty of people have done plenty of thinking through an EA lens and decided that bed nets are a great place to spend lots of money which is great.
Plenty of people have done plenty of thinking through an EA lens and decided to focus on other things which is great.
One group might be right and the other might be wrong, but it is far from obvious or clear, and the differences of opinion certainly don't come from a lack of thought.
I think it helps to be kind and give folks the benefit of the doubt.)
I think you're right that my original comment was rude; I apologize. I edited my comment a bit.
I didn't mean to say that the global poverty EAs aren't interested in detailed thinking about how to do good; they definitely are, as demonstrated e.g. by GiveWell's meticulous reasoning. I've edited my comment to make it sound less like I'm saying that the global poverty EAs are dumb or uninterested in thinking.
But I do stand by the claim that you'll understand EA better if you think of "promote AMF" and "try to reduce AI x-risk" as results of two fairly different reasoning processes, rather than as results of the same reasoning process. Like, if you ask someone why they're promoting AMF rather than e.g. insect suffering prevention, the answer usually isn't "I thought really hard about insect suffering and decided that the math doesn't work out", it's "I decided to (at least substantially) reject the reasoning process which leads to seriously considering prioritizing insect suffering over bednets".
(Another example of this is the "curse of cryonics".)
I think this has been thought about a few times since EA started.
In 2015 Max Dalton wrote about medical research and said the below.
"GiveWell note that most funders of medical research more generally have large budgets, and claim that ‘It’s reasonable to ask how much value a new funder – even a relatively large one – can add in this context’. Whilst the field of tropical disease research is, as I argued above, more neglected, there are still a number of large foundations, and funding for several diseases is on the scale of hundreds of millions of dollars. Additionally, funding the development of a new drug may cost close to a billion dollars .
For these reasons, it is difficult to imagine a marginal dollar having any impact. However, as Macaskill argues at several points in Doing Good Better, this appears to only increase the riskiness of the donation, rather than reducing its expected impact.
In 2018 Peter Wildeford and Marcus A. Davis wrote about the cost effectiveness of vaccines and suggested that a malaria vaccine is competitive with other global health opportunities.
Do you think that if GiveWell hadn't recommended bednets/effective altruists hadn't endorsed bednets it would have led to more investment in vaccine development/gene drives etc.? That doesn't seem intuitive to me.
To me GiveWell fit a particular demand, which was for charitable donations that would have reliably high marginal impact. Or maybe to be more precise, for charitable donations recommended by an entity that made a good faith effort without obvious mistakes to find the highest reliable marginal impact donation. Scientific research does not have that structure since the outcomes are unpredictable.
I think I'd be more convinced if you backed your claim up with some numbers, even loose ones. Maybe I'm missing something, but imo there just aren't enough zeros for this to be a massive fuckup.
Fairly simple BOTEC:
Maybe I'm misunderstanding your point, but the two malaria vaccines that were recently approved (RTS,S and R21/Matrix-M) are not mRNA vaccines. They're both protein-based.
I think part of my disagreement is that I'm not sure what counts as "incremental." Like, bednets are an intervention that, broadly speaking, can solve ~half the malaria problem forever at ~20-40 billion dollars, with substantial co-benefits. And attempts at "non-incremental" malaria solutions have already cost mid-to-high single-digit billions. So it's not like the ratios are massively off. Importantly, "non-incremental" solutions like vaccines likely still require fairly expensive development, distribution, and ongoing maintenance. So small mistakes might be there, but I don't see enough room left for us to be making large mistakes in this space.
That's what I mean by "not enough zeroes."
To be clear, my argument is not insensitive to numbers. If the incremental solutions to the problem have a price tag of >$1T (e.g., global poverty or aging-related deaths), and non-incremental solutions have had a total price tag of <$1B, then I'm much more sympathetic to the "the EV for trying to identify more scalable interventions is likely higher than incremental solutions now, even without looking at details"-style arguments.
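To make the "not enough zeroes" point concrete, here's a minimal back-of-the-envelope sketch in Python. The figures are the rough ones quoted in the comment above, treated as illustrative assumptions rather than careful estimates:

```python
# Rough order-of-magnitude comparison (illustrative figures from the thread above).
incremental_cost = 30e9      # assumed midpoint of the ~$20-40B bednet price tag for ~half the problem
non_incremental_spend = 7e9  # assumed "mid-to-high single-digit billions" already spent on vaccines etc.

ratio = incremental_cost / non_incremental_spend
print(f"Incremental vs. non-incremental spending ratio: ~{ratio:.0f}x")  # ~4x

# ~4x is well under one order of magnitude, so there isn't much room for the
# bednets-vs-research allocation to be off by "zeroes". Contrast a case like
# global poverty: a >$1e12 incremental price tag vs. <$1e9 spent on a speculative
# alternative is a gap of more than three orders of magnitude.
```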
I recently had a very interesting conversation about master morality and slave morality, inspired by the recent AstralCodexTen posts.
The position I eventually landed on was:
Empirically, it seems like the world is not improved the most by people whose primary motivation is helping others, but rather by people whose primary motivation is achieving something amazing. If this is true, that's a strong argument against slave morality.
This seems very wrong to me on a historical basis. When I think of the individuals who have done the most good for the world, I think of people who made medical advances like the smallpox vaccine, scientists who discovered new technologies like electricity, and social movements like abolitionism that defeated a great and widespread harm. These people might want to "achieve something amazing", but they also have communitarian goals: to spread knowledge, help people or avert widespread suffering.
Also, it's super weird to take the Nietzschean master and slave morality framework at face value. It does not seem to be an accurate representation of the morality systems of people today.
I'm leaning towards the view that "don't follow your passion" and "try to do really high-leverage intellectual work" are both good pieces of advice in isolation, but that they work badly in combination. I suspect that there are very few people doing world-class research who aren't deeply passionate about it, and also that EA needs world-class research in more fields than it may often seem.
What is the strongest argument, or the best existing analysis, that Givewell top charities actually do more good per dollar than good mainstream charities focusing on big-picture issues (e.g. a typical climate change charity, or the US Democratic party)?
If the answer is "no compelling case has been made", then does the typical person who hears about and donates to Givewell top charities via EA understand that?
If the case hasn't been made [edit: by which I mean, if the arguments that have been made are not compelling enough to justify the claims being made], and most donors don't understand that, then the way EAs talk about those charities is actively misleading, and we should apologise and try hard to fix that.
I think the strongest high-level argument for Givewell charities vs. most developed-world charity is the 100x multiplier.
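For intuition, here's a minimal sketch of the kind of reasoning usually given for that multiplier, assuming roughly logarithmic utility of income; the income figures below are my own illustrative assumptions, not numbers from the linked argument:

```python
# Illustrative sketch (assumed figures): with log utility of income u(c) = ln(c),
# the marginal value of an extra dollar scales as 1/income.
donor_income = 50_000    # assumed annual income of a typical developed-world donor, $/yr
recipient_income = 500   # assumed annual income of a typical GiveWell-recipient household, $/yr

# Ratio of marginal utilities: (1/recipient_income) / (1/donor_income)
multiplier = donor_income / recipient_income
print(f"A marginal dollar is worth roughly {multiplier:.0f}x more to the recipient")  # ~100x
```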
That's a strong reason to suspect the best opportunities to improve the lives of current humanity lie in the developing world, but it's not decisive, and so analyses have usually been done, particularly of 'fan-favourite' causes like the ones you mention.
I'd also note that both the examples you gave are not what I would consider 'Mainstream charity'; both have prima facie plausible paths for high leverage (even if 100x feels a stretch), and if I had to guess right now my gut instinct is that both are in the top 25% for effectiveness. 'Mainstream charity' in my mind looks more like 'your local church', 'the arts', or 'your local homeless shelter'. Some quantified insight into what people in the UK actually give to is here.
At any rate, climate change has had a few of these analyses over the years. Off the top of my head, here's a recent one on the forum looking at the area in general; there's also an older, more specific analysis of Cool Earth by GWWC, which, after running through a bunch of numbers, concludes:
Even with the most generous assumptions possible, this is s... (read more)
After chatting with Alex Gordon-Brown, I updated significantly towards his position, which I've attempted to summarise below. Many thanks to him for taking the time to talk; I've done my best to accurately represent the conversation, but there may be mistakes. All of the following are conditional on focusing on near-term, human-centric charities.
Three key things I changed my mind on:
Hi Richard, I just wanted to say that I appreciate you asking these questions! Based on the number of upvotes you have received, other people might be wondering the same, and it's always useful to propagate knowledge like Alex has written up further.
I would have appreciated it even more if you had not directly jumped to accusing EA of being misleading (without any references) before waiting for any answers to your question.
Disproportionately many of the most agentic and entrepreneurial young EAs I know are community-builders. I think this is because a) EA community-building currently seems neglected compared to other cause areas, but b) there's currently no standard community-building career pathway, so to work on it they had to invent their own jobs.
Hopefully the efforts of the people I'm talking about to change the latter will lead to the resolution of the former.
There's an old EA forum post called Effective Altruism is a question (not an ideology) by Helen Toner, which I think has been pretty influential.*
But I was recently thinking about how the post rings false for me personally. I know that many people in EA are strongly motivated by the idea of doing the most good. But I was personally first attracted to an underlying worldview composed of stories about humanity's origins, the rapid progress we've made, the potential for the world to be much better, and the power of individuals to contribute to that; from there, given potentially astronomical stakes, altruism is a natural corollary.
I think that leaders in EA organisations are more likely to belong to the former category, of people inspired by EA as a question. But as I discussed in this post, there can be a tradeoff between interest in EA itself versus interest in the things EA deems important. Personally I prioritise making others care about the worldview more than making them care about the question: caring about the question pushes you to do the right thing in the abstract, but caring about the worldview seems better at pushing you towards its most productive frontiers. This seems a... (read more)
In the same way that covid was a huge opportunity to highlight biorisk, the current Ukraine situation may be a huge opportunity to highlight nuclear risks and possible solutions to them. What would it look like for this to work really well?
The concept of cluelessness seems like it's pointing at something interesting (radical uncertainty about the future) but has largely been derailed by being interpreted in the context of formal epistemology. Whether or not we can technically "take the expected value" even under radical uncertainty is both a confused question (human cognition doesn't fit any of these formalisms!), and also much less interesting than the question of how to escape from radical uncertainty. In order to address the latter, I'd love to see more work that starts from Bostrom's framing in terms of crucial considerations.
One use case of the EA forum which we may not be focusing on enough:
There are some very influential people who are aware of and somewhat interested in EA. Suppose one of those people checks in on the EA forum every couple of months. Would they be able to find content which is interesting, relevant, and causes them to have a higher opinion of EA? Or if not, what other mechanisms might promote the best EA content to their attention?
The "Forum Favourites" partly plays this role, I guess. Although because it's forum regulars who are most likely to highly upvote posts, I wonder whether there's some divergence between what's most valuable for them and what's most valuable for infrequent browsers.
There was a lot of discussion in the early days of EA about replacement effects in jobs, and also about giving now vs giving later (for a taste of how much, see my list here, and Julia Wise's disjoint list here).
The latter debate is still fairly prominent now. But I think that arguments about replacement effects became largely redundant when we started considering the value of becoming excellent in high-leverage domains like altruistically-focused research (for which the number of jobs isn't fixed like it is in, say, medicine).
One claim that I haven't seen... (read more)