Bio

Three passions, simple but overwhelmingly strong, govern my life: the longing for love, the desire to make my time on earth count, and unbearable pity for the suffering of all sentient beings. (To paraphrase Bertrand Russell.)

I'm looking for grantmaking roles in AI safety, AI x animals, and grantmaking infrastructure.

I hold an MSc in computer science, worked as a senior quantitative software engineer (16 years professional experience, 26 years total), have been in the charity space for 16 years, effective altruism for 12 years, and animal rights and AI safety for 11 years.

My top three missions are:

  1. Increase the “surface area” of AI safety,
  2. Support promising ideas to improve international and inter-AI coordination in the multipolar takeoff, and
  3. Improve the strategic positioning of AI safety funders and other decision-makers during the takeoff. 

I would like to pursue these and more proactively through incubation, research grants, and retroactive funding.

Previously, I launched a crowdsourced, market-based charity evaluator that efficiently finds and prefilters large numbers of giving opportunities under $100k; ran two charities whose purpose it was to fundraise through events, music, and art, and to grantmake for charities in animal rights and international development; founded EA Berlin; and worked for what is now the Center on Long-Term Risk.

You can get up to speed on my thinking at Impartial Priorities.

Sequences
3

Welfare Biology and AI
Impact Markets
Researchers Answering Questions

Comments
612

My current practical ethics

The question often comes up how we should make decisions under epistemic uncertainty and normative diversity of opinion. Since I need to make such decisions every day, I had to develop a personal system, however inchoate, to assist me.

A concrete (or granite) pyramid

My personal system can be thought of like a pyramid.

  1. At the top sits some sort of measurement of success. It's highly abstract and impractical. Let's call it the axiology. This is really a collection of all axiologies I relate to, including the amount of frustrated preferences and suffering across our world history. This also deals with hairy questions such as how to weigh Everett branches morally and infinite ethics.
  2. Below that sits a kind of mission statement. Let's call it the ethical theory. It's just as abstract, but it is opinionated about the direction in which to push our world history. For example, it may desire a reduction in suffering, but for others this floor needn't be consequentialist in flavor.
  3. Both of these abstract floors of the pyramid are held up by a mess of principles and heuristics at the ground floor level to guide the actual implementation.

The ground floor

The ground floor of principles and heuristics is really the most interesting part for anyone who has to act in the world, so I won't further explain the top two floors. 

The principles and heuristics should be expected to be messy. That is, I think, because they are by necessity the result of an intersubjective process of negotiation and moral trade (positive-sum compromise) with all the other agents and their preferences. (This should probably include acausal moral trades like Evidential Cooperation in Large Worlds.)

It should also be expected to be messy because these principles and heuristics have to satisfy all sorts of awkward criteria:

  1. They have to inspire cooperation or at least not generate overwhelming opposition.
  2. They have to be easily communicable so people at least don't misunderstand what you're trying to achieve and call the police on you. Ideally, people will understand your goal well enough that they want to join you.
  3. They have to be rapidly actionable, sometimes for split-second decisions.
  4. They have to be viable under imperfect information.
  5. They have to be psychologically sustainable for a lifetime.
  6. They have to avoid violating laws.
  7. And many more.

Three types of freedom

But really that leaves us still a lot of freedom (for better or worse):

  1. There are countless things that we can do that are highly impactful and hardly violate anyone's preferences or expectations.
  2. There are also plenty of things that don't violate any preferences or expectations once we get to explain them.
  3. Finally, there are many opportunities for positive-sum moral trade.

These suggest a particular stance toward other activists:

  1. If someone is trying to achieve the same thing you're trying to achieve, maybe you can collaborate.
  2. If someone is trying to achieve something other than what you're trying to achieve, but you think their goals are valuable, don't stand in their way. In particular, it may sometimes feel like doing nothing (to further or hinder their cause) is a form of “not standing in their way.” But if your peers are actually collaborating with them to some extent, doing nothing (or collaborating less) can cause others to also reduce their collaboration and can prevent key threshold effects from taking hold. So the true neutral position is to try to understand how much you need to collaborate toward the valuable goal so it would not have been achieved sooner without you. This is usually very cheap to do and has a chance to get runaway threshold effects rolling.
  3. If someone is trying to achieve something that you consider neutral, the above may still apply to some extent because perhaps you can still be friends. And for reasons of Evidential Cooperation in Large Worlds. (Maybe you'll find that their (to you) neutral thing is easy to achieve here and that other agents like them will collaborate back elsewhere where your goal is easy to achieve.)
  4. Finally, if someone is trying to achieve something that you disapprove of… Well, that's not my metier, temperamentally, but this is where compromise can generate gains from moral trade.

Very few examples

In my experience, principles and heuristics are best identified by chatting with friends and generalizing from their various intuitions.

  1. Charitable donations are total anarchy. Mostly, you can just donate wherever the fluff you want, and (unless you're Open Phil) no one will throw stones through your windows in retaliation. You can just optimize directly for your goals – except, Evidential Cooperation in Large Worlds will still make strong recommendations here, but what they are is still a bit underexplored.
  2. Even if you're not an animal welfare activist yourself, you're still well-advised to cooperate with behavior change to avert animal suffering to the extent expected by your peers. (And certainly to avoid inventing phony reasons to excuse your violation of these expectations. These might be even more detrimental to moral progress and the rationality waterline.)
  3. If you want to spend time with someone but they behave outrageously unempathetically toward you or someone else, you can cut ties with them even though, strictly speaking, this does not imply that no positive-sum trade is possible with them.
  4. Trying to systematically put people in powerful positions can arouse suspicion and actually make it harder to put people in powerful positions. Trying to systematically put people into the sorts of positions they find fulfilling might put as many people in powerful positions and make their lives easier too. (Or training highly conscientious people in how to dare to accept responsibility so it's not just those who don't care who self-select into powerful positions.)
  5. And hundreds more…

Various non-consequentialist ethical theories can come in handy here to generate further useful principles and heuristics. That is probably because they are attempts at generalizing from the intuitions of certain authors, which puts them almost on par (to the extent to which these authors are relatable to you) with generalizations from the intuitions of your friends.

(If you find my writing style hard to read, you can ask Claude to rephrase the message into a style that works for you.)

I once tried to oust someone who's like a mini version of Sam Altman and lost too. Made me feel a lot of kinship with Helen Toner when this happened.

Interesting! It doesn't seem too costly to implement these requirements.

I had to google Zakat though:

Zakat is a mandatory Islamic duty of almsgiving. Every sane, adult Muslim whose wealth exceeds a minimum threshold (called the Nisab) must donate a specific portion of their accumulated assets—typically 2.5%—to help the poor and needy. It is the third of the Five Pillars of Islam.

Surplus sounds useful!

I think everything hinges on the funding unfortunately…

Most of the projects on my list require some $200–500k in the first year to get started, and then can scale to a few million per year over time. The large-scale retrofunding needs to start higher – $10m might work, $100m works for the XPrize, $1b could be the goal.

The natural starting point is the incubator itself, which falls into the $200–500k range, but more towards the upper end to provide seed funding for the incubated projects.

Why did Guesstimate/Squiggle as a for-profit not work out?

I'd love to have a call and catch up in any case! I'm curious whether you already have an opinion on whether places like DeepMind will be interested in paying for evals like the two types mentioned here (character and backdoors).

I'd like to throw my hat in the ring and indicate that I'd at least find it very interesting to take over for you to ensure that QURI's mission continues! I'm currently trying to get back into the AI safety grantmaking space, but that'll most likely fail, in which case I would welcome a plan B. 

I imagine that the grantmaking bottleneck is overblown – that a 100–1000x increase in grantmaking capacity is easily achievable through hiring obvious candidates (10x) and streamlining the processes through retroactive funding (10–100x). If the funds actually end up doing that, it'll be better again to contribute as a charity entrepreneur, and the plan B would become a plan A.

I'd prefer to expand QURI to projects that have less to do with quantification and more with ranking and clustering, and to adopt more of an incubator-like approach where successful projects turn into spin-offs with their own legal structure over time to introduce more resilience through redundancy. (And more Python.)

That'll probably require about $1m in funding over ~2 years. Is that realistic? Also I'm not sure if it clashes with your vision?

Thanks! Then I don't think I need to update my answers. I'm looking forward to your next batch of questions!

Thanks for surveying this! <3 

  1. I feel like people use “AI alignment” very differently. When I talk to the types who are interested in decision theory and agent foundations, they usually have something really sophisticated in mind with AIs that somehow (no known solution because I'm not happy with any of the implementations of UDT that I've seen) try to act in such a way as to actually produce evidence that what they want to maximize will be maximized. Other people usually just mean something like “The AI tries to act sort of like a well-intentioned person would.” The first seems good but very very hard; the second seems outright dangerous, depending on details such as the particular idealizations that are applied.
  2. Hence a question like “AI alignment to humans will in practice avoid moral catastrophes …” is a strong no for me because it might not only not prevent but actually produce those catastrophes in the first place.
  3. Idealizations to eliminate the scope insensitivity bias and idealizations to eliminate the speciesist and substratist biases are two different kinds of idealizations. My answer changes radically depending on whether they can be disentangled.
  4. Regarding tractability of digital minds work – I'm unsure whether I should count my worries about backfire risks as something that reduces scope or something that reduces tractability.
  5. Regarding the reflective equilibrium, it's critical to me whether we artificially study the TAI in isolation, which won't happen in practice, or whether we embed it with other, different agents. The first is probably meant; the second is more pragmatic.
  6. Control strikes me as safer, easier, and less reliable – a stopgap that can buy us a few years. I like that a lot more than an incomplete alignment solution that can backfire.
  7. Suffering risks – vastly more likely in the multipolar world we're steering towards – strike me as vastly worse than just competing away > 90% of net value, so my max. agree vote feels like an understatement. On the other hand, “will” is a higher probability than what I assign to s-risk (“might”).

I'm totally aligned in spirit and have a track record of starting project after project because no one else does. 

But I find that for most of the projects that I wish existed, you either need money or fame to get them started. I don't mean a bit of money for the rent but like $1m to incentivize downstream projects or the kind of fame that allows you to beat the cold start problem because it lets you motivate enough people to all try something new.

I'm putting my own projects on the backburner now to focus on applications that can put me in a position where I'm better able to fund them.

Ohhh! I love that post! Gotta link it to my excessively guilty friends! <3 

Over the past 12 years, I almost always avoided applying for any jobs in effective altruism – though they did often seem like dream jobs – because:

  1. I was afraid I might not be the best candidate, and if, by chance, I replace the best candidate, my work would not only be a waste but outright harmful. I'm the sort of person who's afraid to drive a car for fear of hurting someone, and the funding allocation can affect the lives of billions or trillions of beings, so any mistakes I could make would vastly outstrip any harm I could do with a car if I tried.
  2. That the best candidate might not make up for that harm in some other job that they do instead because they might be more socially motivated than me and not fall back on earning to give if they don't find a charity job but rather value drift and do some mainstream stuff – in the worst case AI capabilities.
  3. That I can survive many years of earning to give without value-drifting because I have managed to do that in the past (a USP I should capitalize on because the counterfactual of the money that I earn at a random company is very low impact, so I can generate great counterfactual impact rather than a bit of marginal impact that I'd get at a charity).
  4. That applying for a job, being considered the best candidate, and then not living up to the expectations would feel deeply humiliating – I'd feel like a fraud, feel guilty for the harm I've caused, feel ashamed of having betrayed all the people at the organization, feel like I can never live it down or risk running into any of them ever again at conferences and such.
  5. That sometimes they end up hiring a world-renowned researcher like Carl Shulman, and then I'd feel ashamed of even having considered applying because just the thought of it already feels hubristic.

The upshot for me was:

  1. To apply to places that have the funding and management capacity to hire everyone above some bar.
  2. To time the boom and bust cycles in EA and go into grantmaking at the start of the boom cycles and ETG during the bust cycles.
  3. To trade off funding vs. management capacity, and try to contribute to grantmaking in a way that doesn't come at a cost in management capacity, e.g., not as an employee, at times when management capacity is more limiting than funding. Or to create management capacity.
  4. To wait for someone to reach out to me unprompted to apply for a role, because then they've already decided that I might be a good fit and thus taken some of the terrible responsibility off my shoulders.
  5. A friend of mine does a lot of drunk driving and usually goes far above the speed limit. I sometimes meet with friends for optional pastime activities. The risk that I catch a cold and my performance is degraded after such a hangout is vastly more severe than the risk of my friend's drunk speeding because of all the hundreds of thousands of lives it might affect. Conversely, my friend hasn't killed a single person yet. So I'm in no position to judge her.

Meanwhile rejections were not a problem for me, so it's not really “rejection sensitivity.” I talked to my friends about how I'm expected to react to them, and their advice was helpful. If I had dared to apply for particularly responsible roles, a rejection would've been a relief. After all, rejection is a return to safety. It's just mixed with the shame over whatever mistakes I must've made in front of the interviewers. I considered not going to any conferences anymore where I might run into them, but my friends told me that's unnecessary. And it's true because when I decided not to hire people at my companies, I didn't want them to avoid me afterwards even if they've made mistakes in the interviews.

But more recently I've updated in the following ways:

  1. I heard from some hiring managers I'm friends with that even at their EA charities some applicants lie about their qualifications. It's like these people want to cause mayhem and destruction on a global scale by replacing better candidates. Naturally, these friends didn't hire the liars, but who knows how many slipped through and didn't get caught. I imagine these people are rare, so the probability that I'll replace someone better than me is higher but the severity of getting replaced by a liar is worse. So there's a tradeoff I didn't consider.
  2. It seems erroneous to think that someone can likely be a better candidate than me and yet value drift more easily than me. It's not impossible, but it's a strange convergence of mildly contradictory traits that I should thus have discounted. Besides, small-scale 1:1 compassion has a strong pull for me, so I run a risk of value-drifting away from highly scalable impact to something like therapy.
  3. If I implicitly compare myself to Carl Shulman's polished outputs from the past decades, I set an unrealistically high bar and will necessarily feel lacking. It stands to reason that the actual best candidate will also make mistakes and that even Carl Shulman has made mistakes. Plus, there are not enough Carl Shulmans for every position at every organization.
  4. I've practiced imagining that somehow I've ended up committing horrendous war crimes on a global scale and how I would process the guilt and try to make it up to all the families I've destroyed. It helps me make my fears concrete like that and think through, step by step, how I would manage such a situation responsibly.
  5. If you're smart enough, self-deceptions won't be obviously false, sometimes not false at all, but they'll be selective and suspiciously purposeful.
  6. I have some information that HR doesn't have but HR also has some information that I don't have. Ideally, we collaborate to make the optimal decision.

Finally, for anyone struggling with similar difficulties in the face of overwhelming responsibility, here's a small example of someone processing his responsibility for a terrible accident.
