Quick takes


I had a question. Why do all the AI safety companies seem to do the opposite of AI safety? Anthropic keeps publicly releasing models (which means they can be accessed by billions of people), same for OpenAI, and while these models are unlikely to cause major problems, if you're releasing a product that is going to be used by billions of people you should make sure the product is around 99.9999% failure proof. Anthropic themselves have said "AI models have reached a level of coding capability where they can surpass all but the most skilled humans at finding... (read more)

A recurring sub-theme across multiple of my research interests this year has been various forms of deception checking, particularly automated deception checking.

I've gotten pretty disappointed in the space. Not all the time (eg Pangram is great), but consistently they can be bad, and bad in ways that are not obvious to outsiders or low-information buyers.

If you're a deception checking company, there's a consistent tradeoff for what you can invest your resources in:

  1. You can invest in better deception checking
  2. You can invest in better deception. Specifically,
... (read more)

My first Fable benchmark was to one-shot turning `emacs -batch -l dunnet` into a graphical adventure game and it hit the safety guardrails bc one of the puzzles involves nitroglycerine 😭

Invitation for bets

I’m willing to bet that Anthropic’s revenue growth over the next year will be slower than its revenue growth over the last 3 years. I proposed a specific bet here. Anyone who wants can offer to take the other side of that bet. Or you can make a counteroffer.

I’m also willing to make a longer-term bet that the AI industry is in a bubble. I proposed a specific bet for that, too, here. Feel free to offer to take the other side of that bet or make a counteroffer.

I’d also be open to other bets. It seems pointless to bet about whether AGI or tr... (read more)


I often hear the suggestion that people should short stock when they don't believe in a company. I don't think that it is a very good piece of advice.

Shorting is notoriously difficult and carries the possibility of unlimited loss. Even if you believe that a stock will crash, small errors such as timing the crash one year too early or using a miscalculated stop order to stop shorting too early can lead to massive losses. Determining the actual risk involved is not very straightforward. Shorting is often accompanied by risk-managing tactics such as hedging t... (read more)

4
Noah Birnbaum
Post this on LW and you'll prob get more offers
4
Yarrow Bouchard 🔸
Feel free to cross-post it! You have my permission! (My LessWrong account has been deactivated for years and I’m not going to reactivate it for this.)

Lots of people in the Bay seem to be thinking about/preparing for/making funding decisions based on the idea that lots of philanthropy will be given to AIS/EA cause areas very soon (i.e. end of year-ish). I would love for someone to write the comprehensive steel man case against this, as I think it’s probably underrated (some reasons to think they won’t give the money/it won’t be as much as some assume). Happy to comment/speak to whoever is interested in doing this.

8
NickLaing
This is a great idea, although it would be a big call for many of us who might hope that our org could get some of that funding. Even someone like myself who fires shots pretty freely might think twice. We wouldn't want to rule our organisation out by writing too strong a steelman against this. Any steelman would involve arguing that previous promises from founders and employees don't carry much weight, which could be seen as a personal slight against someone who could conceivably give you money. Sometimes I wonder if I've already said too much! @Marcus Abramovitch 🔸 has written a bit on this; he might have a comment here?

I have so much intended writing on my plate and, unfortunately, little time to do it.

In a nutshell, I think some employees are dedicated EAs who will give a lot of money to causes I deeply care about. Some of them have a binding agreement to donate a portion of their stock (and a match).

I don't trust the founders. There is no legal mechanism I know of that binds them. The base rate of people effectively giving away large sums of wealth, even when they said they would, is shockingly low. They have said nothing EA-related in years apart from distancing thems... (read more)

I find it icky/disappointing when things like this end up on the 80k job board. Can someone make a case why this is a high-impact job worth advertising on the job board? 

 

6
CBiddulph
I don't think it's icky. (Some might even say it would be more icky to only value fancy research roles?) But it is somewhat surprising to me that this role ended up on the job board, as I would've assumed that Constellation sources this kind of role via normal job boards, like Indeed or something. I wonder how many blue-collar workers at Constellation found their role due to EA motivations. My impression is that this number is very low, although I did hear that the chef is EA-motivated. It seems like it would be quite nice for "low-skill" people who are worried about AI to be able to contribute. And plausibly a janitor or dishwasher who feels a great sense of purpose in their work would have noticeably more impact. But I feel like EA appeals mainly to "elites," for better or for worse...
33
NickLaing
I couldn't disagree more here; I think we need more of these kinds of jobs on the job board. Imagine you are a kitchenhand and you are good at and like your job. Perhaps you don't have any tertiary education but you'd love to be more impactful than your current work at Starbucks. What a rare and great opportunity.

Can you explain how it would be more impactful? I understand impact as meaning counterfactual impact, so I can only imagine this being the case if it's hard to hire a dishwasher for $21-25 an hour in San Francisco. 

Hey everyone, my name is Jacques, I'm an independent technical alignment researcher (primarily focused on evaluations, interpretability, and scalable oversight). I'm now focusing more of my attention on building an Alignment Research Assistant. I'm looking for people who would like to contribute to the project. This project will be private unless I say otherwise.

Side note: I helped build the Alignment Research Dataset ~2 years ago. It has been used at OpenAI (by someone on the alignment team), (as far as I know) at Anthropic for evals, and is now used as t... (read more)


Hi Jacques, what sort of mechanisms have you researched so far that may be effective?

Interested to hear your conclusions so far, because AI Alignment is often spoken of by way of external legislative enforcement or corporate change, but rarely addressed at the fundamental base (mathematical layer in which reasoning is tracked), as many of the well-known models hire employees to work on linear regression and some KNN and then let them go once their job is done.

7
jacquesthibs
As an update to the Alignment Research Assistant I'm building, here is a set of shovel-ready tasks I would like people to contribute to (please DM if you'd like to contribute!):

Core Features

  1. Setup the Continue extension for research: https://www.continue.dev/
     • Design prompts in Continue that are suitable for a variety of alignment research tasks and make it easy to switch between these prompts
     • Figure out how to scaffold LLMs with Continue (instead of just prompting one LLM with additional context)
     • Can include agents, search, and more
     • Test out models to quickly help with paper-writing
  2. Data sourcing and management
     • Integrate with the Alignment Research Dataset (pulling from either the SQL database or Pinecone vector database): https://github.com/StampyAI/alignment-research-dataset
     • Integrate with other apps (Google Docs, Obsidian, Roam Research, Twitter, LessWrong)
     • Make it easy to look and edit long prompts for project context
  3. Extract answers to questions across multiple papers/posts (feeds into Continue)
     • Develop high-quality chunking and scaffolding techniques
     • Implement multi-step interaction between researcher and LLM
  4. Design Autoprompts for alignment research
     • Creates lengthy, high-quality prompts for researchers that get better responses from LLMs
  5. Simulated Paper Reviewer
     • Fine-tune or prompt LLM to behave like an academic reviewer
     • Use OpenReview data for training
  6. Jargon and Prerequisite Explainer
     • Design a sidebar feature to extract and explain important jargon
     • Could maybe integrate with some interface similar to https://delve.a9.io/
  7. Setup automated "suggestion-LLM"
     • An LLM periodically looks through the project you are working on and tries to suggest *actually useful* things in the side-chat. It will be a delicate balance to make sure not to share too much and cause a loss of focus. This could be custom for the research with an option only to give automated suggestions post-research
5
jacquesthibs
We're doing a hackathon with Apart Research on the 26th. I created a list of problem statements for people to brainstorm off of.

Pro-active insight extraction from new research

Reading papers can take a long time and is often not worthwhile. As a result, researchers might read too many papers or almost none. However, there are still valuable nuggets in papers and posts. The issue is finding them. So, how might we design an AI research assistant that proactively looks at new papers (and old) and shares valuable information with researchers in a naturally consumable way? Part of this work involves presenting individual researchers with what they would personally find valuable and not overwhelming them with things they are less interested in.

How can we improve the LLM experience for researchers?

Many alignment researchers will use language models much less than they would like to because they don't know how to prompt the models, it takes time to create a valuable prompt, the model doesn't have enough context for their project, the model is not up-to-date on the latest techniques, etc. How might we make LLMs more useful for researchers by relieving them of those bottlenecks?

Simple experiments can be done quickly, but turning it into a full project can take a lot of time

One key bottleneck for alignment research is transitioning from an initial 24-hour simple experiment in a notebook to a set of complete experiments tested with different models, datasets, interventions, etc. How can we help researchers move through that second research phase much faster?

How might we use AI agents to automate alignment research?

As AI agents become more capable, we can use them to automate parts of alignment research. The paper "A Multimodal Automated Interpretability Agent" serves as an initial attempt at this. How might we use AI agents to help either speed up alignment research or unlock paths that were previously inaccessible?

How can we nudge research toward better objectives (age

@Fran and I are co-hosting a podcast, The World Can Be Better! (hosted on Substack, also on Spotify, Apple Podcasts, YouTube)

In our first co-hosted episode we interview @finm about better futures, the intelligence explosion, and Fin's underrated post 'No ghost in the machine'. 

We'll be recording more episodes soon. We're aiming to a) keep it accessible to an interested but non-specialist audience, and b) to talk to cool people who are working on making the world better. This is on top of full time jobs for both of us, so we're not promising weekly up... (read more)

2
James Herbert
Oh perfect, I’ve been looking for a podcast that fills exactly this niche :)  I listened to your most recent episode and really enjoyed it, thanks for producing it! 

Thanks James, that's really cool to hear :)

Book Review: Never Let Me Go by Kazuo Ishiguro

TLDR: 

(spoilers follow, you can also read here)

This book is the equivalent of seeing someone wear a lovely, thick, warm, knit sweater on a sunny beach. The sweater plot is so nice, but I keep thinking it would really be more at home in a different location genre.

bffr

This book makes no sense as a sci-fi(-ish) novel. For context, it is about a group of clones, created so they can eventually become organ donors. Unfortunately, this premise falls apart if you think about it for even a minute.

How were... (read more)

Linch

I sometimes hear complaints from non-native English speakers about how banning undisclosed LLM use in writing is unfair.

Possible pro-tip for non-native English speakers who want to write well but don't want to sound like AI: Just write an article you want to write in your native language, polish it until you're proud of it in your native language, and then ask a frontier LLM (Opus 4.8, Gemini 3.1 Pro, ChatGPT 5.5 Pro) to translate it to English, while reasonably adhering to your original intent and writing.

In my experience and tests, the LLMs are sufficien... (read more)

3
Linch
10 years ago I'd have wholeheartedly agreed but these days AI translation is good enough that I wouldn't recommend people bothering, especially for professional reasons.
2
david_reinstein
I’ve been saying this would happen for years and I’ve maybe been wrong for years but maybe we are finally there

I don't think we're quite there yet but I think we're close enough, and language learning is probably not the most valuable for professional reasons (some people might still want to do it for social reasons or literature-appreciation reasons).

Did any of the boosters of real-money prediction markets correctly predict that prediction market platforms would be quickly dominated by thinly disguised sports gambling?

(I mean this question literally and earnestly, not as a snide takedown of prediction markets or their proponents)

1
Pat Myron 🔸
It also expanded access to teenagers while most states that allowed it had restricted it to 21+

Wow good point.

2
Nathan Young
I am pretty sure I thought this, yes. That's how it is in the UK. And all prediction markets push in this direction. I thought that the benefits would outweigh the costs, but I am less confident of that now. (Though I think the benefits are huge, really really large.) I weakly support regulation of huge sports gambling losses, which seems very possible to do.

Why isn’t there a shrine to the Unknown Donor at Roland-Garros?

Shot: “Rafa gave absolutely everything. He gave us everything he had.”

Chaser: “I gave my life to one of the most individual sports that exists.”

Rafa Nadal isn’t the GOAT, and neither are Federer or Djokovic.

Suggestion: Leverage Research deep dive

Someone (other than me) should write a deep-dive post about the cult Leverage Research and its infiltration of effective altruism.

The story, in brief:

  • Leverage Research is a cult.
  • Leverage Research organized the first EA Summit in 2013 and the second EA Summit in 2014. The EA Summits were the first effective altruism conferences of any kind.
  • Leverage Research also helped to organize the first EA Global conferences, which began in 2015 and continue to this day.
  • In 2016, a major EA program, the Pareto Fellowship, was run la
... (read more)

I think it’s worth noting that Larissa and Kerry have denied being involved with Leverage until after they departed CEA.

There is a thread here where Kerry (now deleted) makes claims on his side of this story.

4
Jonathan Mannhart
Yeah, I probably just slightly disagree with the word “takeover” in Oliver’s comment to some extent, but that seems like a reasonable linguistic disagreement. (If it’s not taken over for a significant amount of time, because then the other people kicked you out, it wasn’t much of a takeover. Maybe Oliver and I would arrive at “long/mid-term-unsuccessful-takeover” as the concept we’d both agree on. Also acknowledging that I wasn’t there at the time, and he was.) Doesn’t change the fundamental point that it seems important to have some transparent documentation on this. Seems good.
3
Yarrow Bouchard 🔸
Calling it a “year-long takeover” would resolve the ambiguity.

An AI that is to us as we are to other species does not go well for us. It needs to have better values!

Hi Community, here's a quick take I have been thinking about for a while:

Animal Sentience

The question of whether animals are sentient or conscious remains controversial, largely because it is an epistemological challenge: we cannot directly access the subjective experiences of other beings. The term consciousness is particularly loaded, carrying strong anthropocentric assumptions that often limit meaningful discussion outside the human context. In contrast, sentience provides a more useful and flexible framework, as it allows for different forms and degrees of s... (read more)

Snopes did pretty detailed secondary reporting on my analysis of AI use in the recent encyclical. 

I think it's pretty good. Covers some stuff I didn't include in my original analysis, and their conclusion was similar to mine, maybe slightly less strong.

Less technical than my post, and imo not as funny, but also significantly shorter (1600 words), includes some replications and also added some details I didn't know as of time of writing.

Overall a good piece, potentially worth reading/skimming either in addition to or instead of my original analysis.

The origins of the rumor

On May 26, tech news website The Verge published a brief article with the headline, "Did the Pope use AI to write about the dangers of AI?" The Verge's X post (archived) promoting its article received millions of views.

It annoys me that they cite the Verge article as the "origin", rather than your article that the Verge article was based on.

One reflection I've had in the whole "AI use in the encyclical" affair is to slightly increase my trust in traditional media, especially non-American traditional media, and slightly decrease my trust in social media/new media.

I tried my best to promote my analysis as legibly and reasonably as I could and focused on logos rather than ethos: I didn't frame my article with institutional affiliations and intentionally chose not to include obvious, flashy, but irrelevant signaling. Stuff I could've done but explicitly chose not to: get an ML professor to cosign... (read more)


Small update: Life Site News covered it in nontrivial detail. Moderately faithful. Never heard of them before but Wikipedia classifies it as a "far-right pro-life Catholic publication," so  I'd count it as closer to the "alt-media" side of the "traditional media <> alt-media" spectrum than Russia Today, which was previously the most "alt-media" I've seen of the big coverage so far.

2
Guy Raveh
Being featured on Snopes is sort of a major achievement IMO :)
2
Linch
And positive lean too! As opposed to a takedown haha.

I started research into farmed animal welfare in Muslim countries and I think this is a useful way to share little updates along the way, and also to track any ideas I come up with so I can refer back to them when I need to compile my findings. Because I'm also working on a grant looking into effective Zakat, and I think I'll end up doing the same thing for that, I'm going to be numbering farmed animal welfare quick takes with FAW# and Effective Zakat quick takes with EZ#.

so.

FAW#1.

Before starting with this project, I was operating under the assumption that... (read more)

6
Michael St Jules 🔸
Any updates on this, or promising interventions?
15
Kaleem
Hi Michael. Since writing this I finished the paper for OP which I can share with you if you'd like to read it. I'd say that the research found almost no existing FAW interventions in Muslim countries that leverage Islamic principles: the handful that exist (cage-free campaigns in Turkey, the Gulf, Indonesia, Malaysia) are mostly EA/welfare driven and don't really engage with Islamic theology. The most promising intervention identified is working within the halaal certification ecosystem, since there are ~400 certification bodies globally with huge variation in standards and significant profit incentives that could be redirected toward welfare. Layer hen welfare was flagged as the most immediately tractable target. Longer term, getting ahead of lab-grown meat's halaal status could be valuable. Overall the biggest surprise was how little information there was about industries or consumer attitudes in this space. Afterwards I ran a n~6000 person survey of Muslims from 15 countries to fill in some gaps regarding our understanding of what Muslims think about industrial agriculture and slaughter in relation to their religious and moral beliefs - that survey is finished and I'm working on sharing the results publicly within the next month or so (as well as open-sourcing the dataset).

Ah, influencing certifiers sounds interesting, would love to take a look at the paper! :)

Also looking forward to your survey when it's ready!

I've been curious lately about Muslims' attitudes towards stunning and slaughter practices, and especially what kinds of stunning would be acceptable, in case we wanted to promote more stunning or even do more R&D for halal stunning.

Lots of EA orgs say they struggle to hire for ops, marketing and comms. But when they post listings for these roles the salaries are often much lower than what they offer for research and engineering, which they generally find easier to fill. My guess is that orgs are just benchmarking against normal market rates for these roles. But the EA labour market is very different to the normal labour market. If these roles are undersupplied inside EA, I think orgs should be willing to pay more for them.

2
Lorenzo Buonanno🔸
I would interpret all three as signals that orgs find it harder to fill research roles, right?
1
benrmatthews
@Oscar Sykes Can I check for a source/reference for lots of EA orgs saying they struggle to hire for ops, marketing and comms roles? As @SiobhanBall said, lots of people apply for these roles. Is it that the candidates aren't good enough or have higher salary expectations? That there are lots of applicants to some EA orgs but not others? That some orgs are willing to pay higher salaries than others? Geographical differences, e.g. higher pay in the US compared to, for example, the UK?

One example 

My impression is that they often have a high bar and usually would like candidates to have both past experience in those fields and a lot of context on EA/AI Safety, while also being mission-driven.

SMBC by Zach Weinersmith is doing a great job of conveying AI Safety memes more widely.

Relevant comics: https://www.smbc-comics.com/comic/speech https://www.smbc-comics.com/comic/safe https://www.smbc-comics.com/comic/ai-17 https://www.smbc-comics.com/comic/ai-15

I would love to see his take on an illustrated AI Safety book, like 'Open Borders' meets 'If anyone builds it, everyone dies'.
