I had a question. Why do all the AI safety companies seem to do the opposite of AI safety? Anthropic keeps publicly releasing models (which means they can be accessed by billions of people), same for OpenAI, and while these models are unlikely to cause major problems, if you're releasing a product that is going to be used by billions of people you should make sure the product is around 99.9999% failure proof. Anthropic themselves have said "AI models have reached a level of coding capability where they can surpass all but the most skilled humans at finding...
A recurring sub-theme across several of my research interests this year has been various forms of deception checking, particularly automated deception checking.
I've gotten pretty disappointed in the space. Not across the board (e.g. Pangram is great), but the tools are consistently capable of being bad, and bad in ways that are not obvious to outsiders or low-information buyers.
If you're a deception checking company, there's a consistent tradeoff for what you can invest your resources in:
I’m willing to bet that Anthropic’s revenue growth over the next year will be slower than its revenue growth over the last 3 years. I proposed a specific bet here. Anyone who wants can offer to take the other side of that bet. Or you can make a counteroffer.
I’m also willing to make a longer-term bet that the AI industry is in a bubble. I proposed a specific bet for that, too, here. Feel free to offer to take the other side of that bet or make a counteroffer.
I’d also be open to other bets. It seems pointless to bet about whether AGI or tr...
I often hear the suggestion that people should short stock when they don't believe in a company. I don't think that it is a very good piece of advice.
Shorting is notoriously difficult and carries the possibility of unlimited loss. Even if you believe that a stock will crash, small errors such as timing the crash one year too early or using a miscalculated stop order to stop shorting too early can lead to massive losses. Determining the actual risk involved is not very straightforward. Shorting is often accompanied by risk-managing tactics such as hedging t...
Lots of people in the Bay seem to be thinking about/preparing for/making funding decisions based on the idea that lots of philanthropy will be given to AIS/EA cause areas very soon (i.e. end of year-ish). I would love for someone to write the comprehensive steelman case against this, as I think it’s probably underrated (there are some reasons to think they won’t give the money, or that it won’t be as much as some assume). Happy to comment/speak to whoever is interested in doing this.
I have so much intended writing on my plate and, unfortunately, little time to do it.
In a nutshell, I think some employees are dedicated EAs who will give a lot of money to causes I deeply care about. Some of them have a binding agreement to donate a portion of their stock (and a match).
I don't trust the founders. There is no legal mechanism I know of that binds them. The base rate of people effectively giving away large sums of wealth, even when they said they would, is shockingly low. They have said nothing EA-related in years apart from distancing thems...
Hey everyone, my name is Jacques, I'm an independent technical alignment researcher (primarily focused on evaluations, interpretability, and scalable oversight). I'm now focusing more of my attention on building an Alignment Research Assistant. I'm looking for people who would like to contribute to the project. This project will be private unless I say otherwise.
Side note: I helped build the Alignment Research Dataset ~2 years ago. It has been used at OpenAI (by someone on the alignment team), (as far as I know) at Anthropic for evals, and is now used as t...
Hi Jacques, what sort of mechanisms have you researched so far that may be effective?
Interested to hear your conclusions so far, because AI alignment is often spoken of in terms of external legislative enforcement or corporate change, but rarely addressed at the fundamental base (the mathematical layer in which reasoning is tracked), as many of the well-known models hire employees to work on linear regression and some KNN and then let them go once their job is done.
Me and @Fran are co-hosting a podcast, The World Can Be Better! (hosted on Substack, also on Spotify, Apple Podcasts, YouTube)
In our first co-hosted episode we interview @finm about better futures, the intelligence explosion, and Fin's underrated post 'No ghost in the machine'.
We'll be recording more episodes soon. We're aiming to a) keep it accessible to an interested but non-specialist audience, and b) to talk to cool people who are working on making the world better. This is on top of full time jobs for both of us, so we're not promising weekly up...
TLDR:
(spoilers follow, you can also read here)
This book is the equivalent of seeing someone wear a lovely, thick, warm, knit sweater on a sunny beach. The sweater (plot) is so nice, but I keep thinking it would really be more at home in a different location (genre).
This book makes no sense as a sci-fi(-ish) novel. For context, it is about a group of clones, created so they can eventually become organ donors. Unfortunately, this premise falls apart if you think about it for even a minute.
How were...
I sometimes hear complaints from non-native English speakers about how banning undisclosed LLM use in writing is unfair.
Possible pro-tip for non-native English speakers who want to write well but don't want to sound like AI: Just write an article you want to write in your native language, polish it until you're proud of it in your native language, and then ask a frontier LLM (Opus 4.8, Gemini 3.1 Pro, ChatGPT 5.5 Pro) to translate it to English, while reasonably adhering to your original intent and writing.
In my experience and tests, the LLMs are sufficien...
Why isn’t there a shrine to the Unknown Donor at Roland-Garros?
Shot: “Rafa gave absolutely everything. He gave us everything he had.”
Chaser: “I gave my life to one of the most individual sports that exists.”
Rafa Nadal isn’t the GOAT, and neither are Federer or Djokovic.
Someone (other than me) should write a deep-dive post about the cult Leverage Research and its infiltration of effective altruism.
The story, in brief:
I think it’s worth noting that Larissa and Kerry have said they were not involved with Leverage until after they departed CEA.
There is a thread here where Kerry (now deleted) makes claims on his side of this story.
Hi Community, here's a quick take I have been thinking about for a while:
Animal Sentience
The question of whether animals are sentient or conscious remains controversial, largely because it is an epistemological challenge: we cannot directly access the subjective experiences of other beings. The term consciousness is particularly loaded, carrying strong anthropocentric assumptions that often limit meaningful discussion outside the human context. In contrast, sentience provides a more useful and flexible framework, as it allows for different forms and degrees of s...
Snopes did pretty detailed secondary reporting on my analysis of AI use in the recent encyclical.
I think it's pretty good. Covers some stuff I didn't include in my original analysis, and their conclusion was similar to mine, maybe slightly less strong.
Less technical than my post, and imo not as funny, but also significantly shorter (1600 words). It includes some replications and adds some details I didn't know at the time of writing.
Overall a good piece, potentially worth reading/skimming either in addition to or instead of my original analysis.
The origins of the rumor
On May 26, tech news website The Verge published a brief article with the headline, "Did the Pope use AI to write about the dangers of AI?" The Verge's X post (archived) promoting its article received millions of views.
It annoys me that they cite the Verge article as the "origin", rather than your article that the Verge article was based on.
One reflection I've had in the whole "AI use in the encyclical" affair is to slightly increase my trust in traditional media, especially non-American traditional media, and slightly decrease my trust in social media/new media.
I tried my best to promote my analysis as legibly and reasonably as I could and focused on logos rather than ethos: I didn't frame my article with institutional affiliations and intentionally chose not to include obvious, flashy, but irrelevant signaling. Stuff I could've done but explicitly chose not to: get an ML professor to cosign...
Small update: LifeSiteNews covered it in nontrivial detail. Moderately faithful. Never heard of them before but Wikipedia classifies it as a "far-right pro-life Catholic publication," so I'd count it as closer to the "alt-media" side of the "traditional media <> alt-media" spectrum than Russia Today, which was previously the most "alt-media" outlet I'd seen among the big coverage so far.
I started research into farmed animal welfare in Muslim countries and I think this is a useful way to share little updates along the way, and also to track any ideas I come up with so I can refer back to them when I need to compile my findings. Because I'm also working on a grant looking into effective Zakat, and I think I'll end up doing the same thing for that, I'm going to be numbering farmed animal welfare quick takes with FAW# and Effective Zakat quick takes with EZ#.
so.
FAW#1.
Before starting with this project, I was operating under the assumption that...
Ah, influencing certifiers sounds interesting, would love to take a look at the paper! :)
Also looking forward to your survey when it's ready!
I've been curious lately about Muslims' attitudes towards stunning and slaughter practices, and especially what kinds of stunning would be acceptable, in case we wanted to promote more stunning or even do more R&D for halal stunning.
Lots of EA orgs say they struggle to hire for ops, marketing and comms. But when they post listings for these roles, the salaries are often much lower than what they offer for research and engineering, which they generally find easier to fill. My guess is that orgs are just benchmarking against normal market rates for these roles. But the EA labour market is very different to the normal labour market. If these roles are undersupplied inside EA, I think orgs should be willing to pay more for them.
SMBC by Zach Weinersmith is doing a great job of conveying AI Safety memes more widely.
Relevant comics: https://www.smbc-comics.com/comic/speech https://www.smbc-comics.com/comic/safe https://www.smbc-comics.com/comic/ai-17 https://www.smbc-comics.com/comic/ai-15
I would love to see his take on an illustrated AI Safety book, like 'Open Borders' meets 'If Anyone Builds It, Everyone Dies'.