I'm currently facing a career choice between a role working on AI safety directly and a role at 80,000 Hours. I don't want to go into the details too much publicly, but one really key component is how to think about the basic leverage argument in favour of 80k. This is the claim that's like: well, in fact I heard about the AIS job from 80k. If I ensure even two (additional) people hear about AIS jobs by working at 80k, isn't it possible going to 80k could be even better for AIS than doing the job could be?
In that form, the argument is naive and implausible. But I don't think I know what the "sophisticated" argument that replaces it is. Here are some thoughts:
I think that this:
> but the intuition that calls this model naive is driven by a sense that it's going to turn out to not "actually" be 2 additional people, that additionality is going to be lower than you think, that the costs of getting that result are higher than you think, etc. etc.
is most of the answer. Getting a fully counterfactual career shift (that person's expected career value without your intervention is ~0, but instead they're now going to work at [job you would otherwise have taken, for at least as long as you would have]) is a really high bar to meet. If you did expect to get 2 of those, at equal skill levels to you, then I think the argument for 'going meta' basically goes through.
In practice, though:
- People who fill [valuable role] after your intervention probably had a significant chance of finding out about it anyway.
- They also probably had a significant chance of ending up in a different high-value role had they not taken the one you intervened on.
How much of a discount you want to apply for these things is going to depend a lot on how efficiently you expect the [AI safety] job market to allocate talent. In general, I find it easier to arrive at reasonable-see...
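To make the discounting concrete, here is a minimal sketch of the arithmetic, not a real estimate. Everything in it is an assumption for illustration: the function name `counterfactual_value`, the choice to treat the two discounts as independent and multiplicative, the simplification that an alternative high-value role is worth as much as the one you intervened on, and the example probabilities. Value is measured in units of "you doing the direct job yourself".

```python
def counterfactual_value(people_shifted, p_found_anyway, p_other_high_value_role,
                         relative_skill=1.0, relative_tenure=1.0):
    """Rough counterfactual impact of a meta role, in units of one direct hire.

    All parameters are hypothetical inputs, not measured quantities:
    - p_found_anyway: chance the person would have found the role without you
    - p_other_high_value_role: chance they would otherwise have ended up in a
      comparably valuable role (assumed here to be equally valuable)
    - relative_skill / relative_tenure: how they compare to you in the role
    """
    # Assumption: the two discounts are independent, so they multiply.
    additionality = (1 - p_found_anyway) * (1 - p_other_high_value_role)
    return people_shifted * additionality * relative_skill * relative_tenure


# The naive version of the argument: two fully counterfactual shifts.
naive = counterfactual_value(2, p_found_anyway=0.0, p_other_high_value_role=0.0)

# The same two people after applying two 50% discounts.
discounted = counterfactual_value(2, p_found_anyway=0.5, p_other_high_value_role=0.5)

print(naive, discounted)  # 2.0 0.5
```

With these made-up numbers the naive case beats taking the direct job (2.0 vs 1.0), while the discounted case is worth half of it. The conclusion is driven almost entirely by the discounts you pick, which is why the efficiency of the job market matters so much.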
AI Safety Needs To Get Serious About Chinese Political Culture
I worry that Leopold Aschenbrenner's "China will use AI to install a global dystopia" take is based on crudely analogising the CCP to the USSR, or perhaps even to American cultural imperialism / expansionism, and isn't based on an even superficially informed analysis of either how China is currently actually thinking about AI, or what China's long term political goals or values are.
I'm no more of an expert myself, but my impression is that China is much more interested in its own national security interests and its own ideological notions of the ethnic Chinese people and Chinese territory, so that beyond e.g. Taiwan there isn't an interest in global domination except to the extent that it prevents them being threatened by other expansionist powers.
This or a number of other heuristics / judgements / perspectives could change substantially how we think about whether China would race for AGI, and/or be receptive to an argument that AGI development is dangerous and should be suppressed. China clearly has a lot to gain from harnessing AGI, but they have a lot to lose too, just like the West.
Currently, this is a pretty superfi...
I recommend the China sections of this recent CNAS report as a starting point for discussion (it's definitely from a relatively hawkish perspective, and I don't think of myself as having enough expertise to endorse it, but I did move in this direction after reading).
From the executive summary:
Taken together, perhaps the most underappreciated feature of emerging catastrophic AI risks from this exploration is the outsized likelihood of AI catastrophes originating from China. There, a combination of the Chinese Communist Party’s efforts to accelerate AI development, its track record of authoritarian crisis mismanagement, and its censorship of information on accidents all make catastrophic risks related to AI more acute.
From the "Deficient Safety Cultures" section:
While such an analysis is of relevance in a range of industry- and application-specific cultures, China’s AI sector is particularly worthy of attention and uniquely predisposed to exacerbate catastrophic AI risks [footnote]. China’s funding incentives around scientific and technological advancement generally lend themselves to risky approaches to new technologies, and AI leaders in China have long prided themselves on t
I have a broad sense that AI safety thinking has evolved a bunch over the years, and I think it would be cool to have a retrospective of "here are some concrete things that used to be pretty central that we now think are either incorrect or at least incorrectly focused"
Of course it's hard enough to get a broad overview of what everyone thinks now, let alone what they used to think but discarded.
(this is probably also useful outside of AI safety, but I think it would be most useful there)
I wonder how the recent turn for the worse at OpenAI should make us feel about e.g. Anthropic and Conjecture and other organizations with a similar structure, or whether we should change our behaviour towards those orgs.
On (1), these issues seem to be structural in nature, but exploited by idiosyncrasies. In theory, both OpenAI's non-profit board & Anthropic's LTBT should perform roughly the same oversight function. In reality, a combination of Sam's rebellion, Microsoft's financial domination, and the collective power of the workers shifted the decision to being about whether OpenAI would continue independently with a new board or re-form under Microsoft. Anthropic is just as susceptible to this kind of coup (led by Amazon), but only if their leadership and their workers collectively want it, which, in all fairness, I think they're a lot less likely to.
But in some sense, no corporate structure can protect against all of the key employees organising to direct their productivity somewhere else. Only a state-backed legal structure really has that power. If you're worried about some bad outcome, I think you either have to trust that the Anthropic people have good intentions and won't sell themselves to Amazon, or advocate for legal restrictions on AI work.
something I persistently struggle with is that it's near-impossible to know everything that has been said about a topic, and that makes it really hard to know when an additional contribution is adding something or just repeating what's already been said, or worse, repeating things that have already been refuted
to an extent this seems inevitable and I just have to do my best and sometimes live with having contributed more noise than signal in a particular case, but I feel like I have an internal tuning knob for "say more" vs. "listen more" and I find it really hard to know which direction is overall best
As weird as it sounds, I think the downvote button should make you a bit less concerned with contribution quality. If it's obviously bad, people will downvote and read it less. If it's wrong without being obviously bad, then others likely share the same misconception, and hopefully someone steps in to correct it.
In practice, the failure mode for the forum seems to be devoting too much attention to topics that don't deserve it. If your topic deserves more attention, I wouldn't worry a ton about accidentally repeating known info? For one thing, it could be valuable spaced repetition. For another, discussions over time can help turn something over and look at it from various angles. So I suppose the main risk is making subject matter experts bored?
In some sense you could consider the signal/noise question separate from the epistemic hygiene question. If you express uncertainty properly, then in theory, you can avoid harming collective epistemics even for a topic you know very little about.
On the current margin, I actually suspect EAs should be deferring less and asking dumb questions more. Specific example: In a world where EA was more willing to entertain dumb questions, perhaps we could've discovered AI Pause without Katja Grace having to write a megapost. We don't want to create "emperor has no clothes" type situations. Right now, "EA is a cult" seems to be a more common outsider critique than "EAs are ignorant and uneducated".
People often propose HR departments as antidotes to some of the harm that's done by inappropriate working practices in EA. The usual response is that small organisations often have quite informal HR arrangements even outside of EA, which does seem kinda true.
Another response is that it sometimes seems like people have an overly rosy picture of HR departments. If your corporate culture sucks then your HR department will defend and uphold your sucky corporate culture. Abusive employers will use their HR departments as an instrument of their abuse.
Perhaps the idea is to bring more mainstream HR practices or expertise into EA employers, rather than merely going through the motions of creating the department. But I think mainstream HR comes primarily from the private sector and is primarily about protecting the employer, often against the employee. They often cast themselves in a role of being there to help you, but a common piece of folk wisdom is "HR is not your friend". I think frankly that a lot of mainstream HR culture is at worst dishonest and manipulative, and I'd be really sad to see us uncritically importing more of that.
For Pause AI or Stop AI to succeed, pausing / stopping needs to be a viable solution. I think some AI capabilities people who believe in existential risk may (perhaps?) be motivated by the thought that the risk of civilisational collapse is high without AI, so it's worth taking the risk of misaligned AI to prevent that outcome.
If this really is cruxy for some people, it's possible this doesn't get noticed because people take it as a background assumption and don't tend to discuss it directly, so they don't realize how much they disagree and how crucial that disagreement is.
credit to AGB for (in this comment) reminding me where to find the Scott Alexander remarks that pushed me a lot in this direction:
Second, if we never get AI, I expect the future to be short and grim. Most likely we kill ourselves with synthetic biology. If not, some combination of technological and economic stagnation, rising totalitarianism + illiberalism + mobocracy, fertility collapse and dysgenics will impoverish the world and accelerate its decaying institutional quality. I don’t spend much time worrying about any of these, because I think they’ll take a few generations to reach crisis level, and I expect technology to flip the gameboard well before then. But if we ban all gameboard-flipping technologies (the only other one I know is genetic enhancement, which is even more bannable), then we do end up with bioweapon catastrophe or social collapse. I’ve said before I think there’s a ~20% chance of AI destroying the world. But if we don’t get AI, I think there’s a 50%+ chance in the next 100 years we end up dead or careening towards Venezuela. That doesn’t mean I have to support AI accelerationism because 20% is smaller than 50%. Short, carefully-tailored pauses could improve
Gathering some notes on private COVID vaccine availability in the UK.
News coverage:
It sounds like there's been a licensing change allowing provision of the vaccine outside the NHS as of March 2024 (ish). Pharmadoctor is a company that supplies pharmacies and has been putting about the word that they'll soon be able to supply them with vaccine doses for private sale -- most media coverage I found names them specifically. However, the pharmacies themselves are responsible for setting the price and managing bookings or whatever. All Pharmadoctor does for the end user is tell you which pharmacies they are supplying and give you the following pricing guidance:
Comirnaty Omicron XBB.1.5 (Pfizer/BioNTech) £75-£85
Nuvaxovid XBB.1.5 (Novavax) £45-£55 (update: estimated availability from w/c 22/04/2024)
Some places offering bookings:
Dustin Moskovitz claims "Tesla has committed consumer fraud on a massive scale", and "people are going to jail at the end"
Not super EA relevant, but I guess relevant inasmuch as Moskovitz funds us and Musk has in the past too. I think if this were just some random commentator I wouldn't take it seriously at all, but I'm a bit more inclined to believe Dustin will take some concrete action. Not sure I've read everything he's said about it, I'm not used to how Threads works
Something I'm trying to do in my comments recently is "hedge only once"; e.g. instead of "I think X seems like it's Y", you pick either one of "I think X is Y" or "X seems like it's Y". There is a difference in meaning, but often one of the latter feels sufficient to convey what I wanted to say anyway.
This is part of a broader sense I have that hedging serves an important purpose but is also obstructive to good writing, especially concision, and the fact that it's a particular feature of EA/rat writing can be alienating to other audiences, even though I think it comes from a self-aware / self-critical instinct that is a positive feature of the community.
I've been reviewing some old Forum posts for an upcoming post I'm writing, and incidentally came across this by Howie Lempel for noticing in what spirit you're engaging with someone's ideas:
"Did I ask this question because I think they will have a good answer or because I think they will not have a good answer?"
I felt pretty called out :P
To be fair, I think the latter is sometimes a reasonable persuasive tactic, and it's fine to put yourself in a teaching role rather than a learning role if that's your endorsed intention and the other party is on board. But the value of this quote to me is that it successfully highlights how easily we can tell ourselves we're being intellectually curious, when we're actually doing something else.
unfortunately when you are inspired by everyone else's April Fool's posts, it is already too late to post your own
I will comfort myself by posting my unseasonal ideas as comments on this post
I've been working in software development and management for about 10 years, but I'm currently on a break while I unwind a little and try some directions out before immersing myself in full time work again. I'm open to people using my technical skills:
I think at previous EAGs I always had the sense that I had a "budget" of 1-on-1s I could schedule before I'd be too exhausted. I'd often feel very tired towards the end of the second day, which I took as validation that I indeed needed to moderate.
This EAG, I:
I think it's very possible this is a coincidence, that this is because of other ways I've happened to ...
Ideas of posts I could write in comments. Agreevote with things I should write. Don't upvote them unless you think I should have karma just for having the idea, instead upvote the post when I write it :P
Feel encouraged also to comment with prior art in cases where someone's already written about something. Feel free also to write (your version of) one of these posts, but give me a heads-up to avoid duplication :)
(some comments are upvoted because I wrote this thread before we had agreevotes on every comment; I was previously removing my own upvotes on these but then I learned that your own upvotes don't affect your karma score)
Edit: This is now The illusion of consensus about EA celebrities
Something to try to dispel the notion that every EA thinker is respected / thought highly of by every EA community member. Like, you tend to hear strong positive feedback, weak positive feedback, and strong negative feedback, but weak negative feedback is kind of awkward and only comes out sometimes
something about the role of emotions in rationality and why the implicit / perceived Forum norm against emotions is unhelpful, or at least not precisely aimed
(there's a lot of nuance here, I'll put it in dw)
edit: I feel like the "notice your confusion" meme is arguably an example of emotional responses providing rational value.
The convention in a lot of public writing is to mirror the style of writing for profit, optimized for attention. In a co-operative environment, you instead want to optimize to convey your point quickly, to only the people who benefit from hearing it. We should identify ways in which these goals conflict; the most valuable pieces might look different from what we think of when we think of successful writing.
Writing to persuade might still be best done discursively, but if you anticipate your audience already being sold on the value of your information, just present the information as you would if you were presenting it to a colleague on a project you're both working on.
Though betting money is a useful way to make epistemics concrete, sometimes it introduces considerations that tease apart the bet from the outcome and probabilities you actually wanted to discuss. Here are some circumstances when it can be a lot more difficult to get the outcomes you want from a bet:
As an example, I saw someone claim that the US was facing civil war. Someone else thought this was extremely unlikely, and offered to bet on it. You can't make bets on this! The value of the payout varies wildly depending on the exact scenario (are dollars lifesaving or worthless?), and more to the point the last thing on anyone's minds will be internet bets with strangers.
In general, you can't make bets about major catastrophes (leaving aside the question of whether you'd want to), and even with non-catastrophic geopolitical events, the bet you're making may not be the one you intended to make, if the value of money depends on the result.
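A toy calculation shows how this plays out. It is a sketch under stated assumptions rather than a model of any real bet: the function name `expected_bet_value`, the idea of folding collection risk and the changed value of money into two multipliers, and all the numbers are hypothetical.

```python
def expected_bet_value(p_event, win_amount, lose_amount,
                       p_collect=1.0, value_per_dollar_if_event=1.0):
    """Expected value (in today's dollars) of betting that an event happens.

    Hypothetical parameters for illustration:
    - p_event: your credence that the event occurs
    - p_collect: chance you can actually collect your winnings in that world
    - value_per_dollar_if_event: what a dollar is worth to you in that world,
      relative to a dollar today (above 1 if lifesaving, near 0 if worthless)
    """
    # Winnings only matter if they can be collected and still have value.
    expected_win = p_event * p_collect * value_per_dollar_if_event * win_amount
    # Losses are paid in the ordinary world, at ordinary value.
    expected_loss = (1 - p_event) * lose_amount
    return expected_win - expected_loss


# Someone who sincerely believes a catastrophe is 50% likely, at even odds:
ordinary = expected_bet_value(0.5, 100, 100)
catastrophe = expected_bet_value(0.5, 100, 100, p_collect=0.25)

print(ordinary, catastrophe)  # 0.0 -37.5
```

Under these made-up numbers, a person with a sincere 50% credence is indifferent to the even-odds bet in the ordinary case but should refuse it once collection becomes unlikely. So declining the bet tells you little about what they actually believe.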
A related idea is that you can't sell (or buy) insurance against scenarios in which insurance contracts don't pay out, including most civilizational catastrophes, which can make it harder to use traditional market methods to capture the potential gains from (say) averting nuclear war. (Not impossible, but harder!)
Even if it's legal, some people may think it's unethical to lobby against an industry that you've shorted.
It could provide that industry with an argument to undermine the arguments against them. They might claim that their critics have ulterior motives.
I'm going to make a quick take thread of EA-relevant software projects I could work on. Agree / disagree vote if you think I should / should not do some particular project.
[edit: this is now https://forum.effectivealtruism.org/posts/gxmfAbwksBpnwMG8m/can-the-ai-afford-to-wait]
People talk about AI resisting correction because successful goal-seekers "should" resist their goals being changed. I wonder if this also acts as an incentive for AI to attempt takeover as soon as it's powerful enough to have a chance of success, instead of (as many people fear) waiting until it's powerful enough to guarantee it.
Hopefully the first AI powerful enough to potentially figure out that it wants to seize power and has a chance of succeeding ...
Fair, I'm grumpy about Leopold's position but my above comment wasn't careful to target the real problems and doesn't give a good general rule here.