Bio


Pause AI / Veganish

Let's do a bunch of good stuff and have fun, gang!

How others can help me

I am always looking for opportunities to contribute directly to big problems and to build my skills, especially skills related to research, science communication, and project management.

Also, I have a hard time coping with some of the implications of topics like existential risk, the strangeness of the near-term future, and the negative experiences of many non-human animals. So it might be nice to talk to more people about that sort of thing and how they cope.

How I can help others

I have taken BlueDot Impact's AI Alignment Fundamentals course. I have also lurked around EA for a few years now. I would be happy to share what I know about EA and AI Safety.

I also like brainstorming and discussing charity entrepreneurship opportunities.

Comments

I think the focus is generally placed on the cognitive capacities of AIs because those are expected to be a bigger deal overall.

There is at least one 80,000 Hours podcast episode on robotics. It tries to explain why robotics is hard to do ML on, but I didn't understand it.

Also, I think Max Tegmark wrote some stuff on slaughterbots in Life 3.0. Yikes!

You could try looking for other differential development stuff too if you want. I recently liked "AI Tools for Existential Security"; I think it's a good conceptual framework for emerging tech / applied ethics stuff. Of course, it still leaves you with a lot of questions :)
 

I love to see stuff like this!

It has been a pleasure reading this, listening to your podcast episode, and trying to really think it through.

This reminds me of a few other things I have seen lately like Superalignment, Joe Carlsmith's recent "AI for AI Safety", and the recent 80,000 Hours Podcast with Will MacAskill.

I really appreciate the "Tools for Existential Security" framing. Your example applications were on point and many of them brought up things I hadn't even considered. I enjoy the idea of rapidly solving lots of coordination failures. 

This sort of DAID approach feels like an interesting continuation of other ideas about differential acceleration and the vulnerable world hypothesis. Trying to get this right can feel like some combination of applied ethics and technology forecasting.

Probably one of the weirdest or most exciting applications you suggest is AI for philosophy. You put it under the "Epistemics" category. I usually think of epistemics as a sub-branch of philosophy, but I think I get what you mean. AI for this sort of thing remains exciting, but very abstract to me. 

What a heady thing to think about; really exciting stuff! There is something very cosmic about the idea of using AI research and cognition for ethics, philosophy, and automated wisdom. (I have been meaning to read "Winners of the Essay competition on the Automation of Wisdom and Philosophy"). I strongly agree that since AI comes with many new philosophically difficult and ethically complex questions, it would be amazing if we could use AI to face these.

The section on how to accelerate helpful AI tools was nice too. 

Appendix 4 was gold. The DPD framing is really complementary to the rest of the essay. I can totally appreciate the distinction you are making, but I also see DPD as bleeding into AI for Existential Safety a lot as well. Such mixed feelings. Like, for one thing, you certainly wouldn't want to be deploying whack AI in your "save the world" cutting-edge AI startup.

And it seems like there is a good case for thinking about doing better pre-training and finding better paradigms if you are going to be thinking about safer AI development and deployment a lot anyways. Maybe I am missing something about the sheer economics of not wanting to actually do pre-training ever. 

In any case, I thought your suggestions around aiming for interpretable, robust, safe paradigms were solid. Paradigm-shaping and application-shaping are both interesting.

***

I really appreciate that this proposal is talking about building stuff! And that it can be done ~unilaterally. I think that's just an important vibe and an important type of project to have going.

I also appreciate that you said in the podcast that this was only one possible framing / clustering. Although you also say "we guess that the highest priority applications will fall into the categories listed above", which seems like a potentially strong claim.

I have also spent some time thinking about which forms of ~research / cognitive labor would be broadly good to accelerate for similar existential security reasons, and I tried to retrospectively categorize some notes I had made using your framing. I had some ideas that were hard to categorize cleanly into epistemics, coordination, or direct risk targeting.

I included a few more ideas for areas where AI tools, marginal automated research, and cognitive abundance might be well applied. I was going for a similar vibe, so I'm sorry if I overlap a lot. I will try to only mention things you didn't explicitly suggest:

Epistemics:

  • you mention benchmarking as a strategy for accelerating specific AI applications, but it also deserves mention as an epistemic tool; METR-style elicitation tests too
    • I should say up front that I don't know whether, due to the acceleration and iteration effects you mention, e.g. FrontierMath and lastexam.ai are "race-dynamic-accelerating" in a way that overshadows their epistemic usefulness; even METR's task horizon metric could be accused of this
    • From a certain perspective, I would consider benchmarks and METR elicitation tests natural complements to mech interp and AI capabilities forecasting
    • this would include capabilities and threats assessment (hopefully we can actively iterate downwards on risk assessment scores)
  • broad economics and societal impact research
    • the effects of having AI more or less "do the economy" seem vast, and differentially accelerating understanding and strategy there seems like a non-trivial application relevant to the long-term future of humanity
    • wealth inequality and the looming threat of mass unemployment (at minimum, this seems important for instability and coordination reasons even if one were too utilitarian / longtermist to care for normal reasons)
  • I think it would be good to accelerate "Risk evaluation" in a sense that I think was defined really elegantly by Joe Carlsmith in "Paths and waystations in AI safety" [1]
    • building naturally from there, forecasting systems could be specifically applied to DAID and DPD; I know this is a little "ouroboros" to suggest but I think it works


Coordination-enabling:

  • movement building research and macro-strategy, AI-fueled activism, political coalition building, AI research into and tools for strengthening democracy
  • automated research into deliberative mini-publics, improved voting systems (e.g. ranked choice, liquid democracy, quadratic voting, anti-gerrymandering solutions), secure digital voting platforms, improved checks and balances (e.g. strong multi-stakeholder oversight, whistleblower protections, human rights), and non-censorship-oriented solutions to misinformation
     

Risk-targeting:

  • I know it is not the main thrust of "existential security", but I think it is worth considering the potential contribution of "abundant cognition" to welfare / sentience research (e.g. bio and AI). This seems really important from a lot of perspectives, for a lot of reasons:

    • AI Safety might be worse if the AIs are "discontent"
    • we could lock in a future where most people are suffering terribly, which would not count as existential security
    • it seems worthwhile to know if the AI workers are suffering ASAP for normal "avoid doing moral catastrophes" reasons
    • we could unlock huge amounts of welfare or learn to avoid huge amounts of pain (cf. "hedonium" or the Far Out Initiative)

    That said, I have not really considered the offense / defense balance here. We may discover how to simulate suffering much more cheaply than pleasure, or something horrendous like that. Or there might be info hazards. This space seems so high-stakes and hard to chart.

Some mix:

  • Certain forms of monitoring and openly researching other people's actions seem like a mix of epistemics and coordination. For example, I had listed some stuff about e.g. AI for broadly OSINT-based investigative journalism, AI lab watch, legislator scorecards, and similar. These are, in a sense, information for the sake of coordination.


I know I included some moonshots. This all depends on what AI systems we are talking about and what they are actually helpful with I guess. I would hate for EA to bet too hard on any of this stuff and accidentally flood the zone of key areas with LLM "slop" or whatever.

Also, to state the obvious, there may be some risk of correlated exposure if you pin too much of your existential security on the crucial aid of unreliable, untrustworthy AIs. Maybe HAL 9000 isn't always the entity to trust with your most critical security.

Lots to think about here! Thanks! 

 

  1. ^

    Joe Carlsmith: "Risk evaluation tracks the safety range and the capability frontier, and it forecasts where a given form of AI development/deployment will put them.

    • Paradigm examples include:
      • evals for dangerous capabilities and motivations;
      • forecasts about where a given sort of development/deployment will lead (e.g., via scaling laws, expert assessments, attempts to apply human and/or AI forecasting to relevant questions, etc);
      • general improvements to our scientific understanding of AI
      • structured safety cases and/or cost-benefit analyses that draw on this information."

I have a few questions about the space of EA communities.

You mention 

Projects that build communities focused on impartial, scope-sensitive and ambitious altruism.

as in scope. I am curious what existing examples you have of communities that place emphasis on these values aside from the core "EA" brand? 

I know that GWWC kind of exists as its own community independent of "EA" to ~some extent, but honestly I am unclear to what extent. Also, I guess LessWrong and the broader rationality-cinematic-universe might kind of fit here too, but realistically, whenever scope-sensitive altruism is the topic of discussion on LessWrong, an EA Forum cross-post is likely. Are there any big "impartial, scope-sensitive and ambitious altruism" communities I am missing? I know there are several non-profits independently working on charity evaluation and that sort of thing, but I am not very aware of distinct "communities" per se.

Some of my motivation for asking is that I actually think there is a lot of potential when it comes to EA-esque communities that aren't actually officially "EA" or "EA Groups". In particular, I am personally interested in the idea of local EA-esque community groups with a more proactive focus on fellowship, loving community, social kindness/fraternity, and providing people a context for profound/meaningful experiences. Still championing many EA values (scope-sensitivity, broad moral circles, proactive ethics) and EA tools (effective giving, research-oriented and ethics-driven careers), but in the context of a group which is a shade or two more like churches, humanist associations, and the Sunday Assembly and a shade or two less like Rotary Clubs or professional groups.

That's just one idea, but I'm really trying to ask about the broader status of EA-diaspora communities / non-canonically "EA" community groups under EAIF. I would like to more clearly understand what the canonical "stewards of the EA brand" at CEA and the EAIF have in mind for the future of EA groups and the movement as a whole. What does success look like here; what are these groups trying to be / blossom into? And to the extent that my personal vision for "the future of EA" is different, is a clean break / diaspora the way to go?

Thanks!
 

I've seen EA meditation, EA bouldering, EA clubbing, EA whatever. Orgs seem to want everyone and the janitor to be "aligned". Everyone's dating each other. It seems that we're even afraid of them.

 

I am not in the Bay Area or London, so I guess I'm maybe not personally familiar with the full extent of what you're describing, but there are elements of this that sound mostly positive to me. 

Like, of course, it is possible to overemphasize the importance of culture fit and mission alignment when making hiring decisions. It seems like a balance that depends on the circumstances, and I don't have much to say there.

As far as the extensive EA fraternizing goes, that actually seems mostly good. Like, to the extent that EA is a "community", it doesn't seem surprising or bad that people are drawn to hang out. Church groups do that sort of thing all the time for example. People often like hanging out with others with shared values, interests, experiences, outlook, and cultural touchstones. Granted, there are healthy and unhealthy forms of this. 

I'm sure there's potential for things to get inappropriate and for unhealthy power dynamics to arise where professional contexts, personal relationships, and shared social circles ambiguously overlap. At their best, though, social communities can provide people a lot of value and support.

Why is "EA clubbing" a bad thing?

I think the money goes a lot further when it comes to helping non-human animals than when it comes to helping humans.

I am generally pretty bought into the idea that non-human animals also experience pleasure/suffering, and I care about helping them.

I think it is probably good for the long term trajectory of society to have better norms around the casual cruelty and torture inflicted on non-human animals.

On the other hand, I do think there are really good arguments for human to human compassion and the elimination of extreme poverty. I am very in favor of that sort of thing too. GiveDirectly in particular is one of my favorite charities just because of the simplicity, compassion, and unpretentiousness of the approach.

Animal welfare wins my vote not because I disfavor human to human welfare, but just because I think that the same amount of resources can go a lot further in helping my non-human friends.

If 'how do you deal with it' means 'how do you convince yourself it is false, or that things some EA orgs are contributing to are still okay given it', I don't think this is a useful attitude to have towards troubling truths.

 

Well said and important :)

I don't really understand this stance, could you explain what you mean here?

Like Sammy points out with the Hitler example, it seems kind of obviously counterproductive/negative to "save a human who was then going to go torture and kill a lot of other humans".

Would you disagree with that? Or is the pluralism you are suggesting here specifically between viewpoints that suggest animal suffering matters and viewpoints that don't think it matters?

As I understand worldview diversification stances, the idea is something like: if you are uncertain about whether animal welfare matters, then you can take a portfolio approach where, with some fraction of resources, you try to increase human welfare at the cost of animals, and with a different fraction of resources you try to increase animal welfare. The hope is that this nets out to positive in "worlds where non-human animals matter" and "worlds where only humans matter".
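To make sure I am reading that portfolio logic right, here is a toy way of writing it down (the symbols and framing are mine, not something from your comment): with credence p that animal welfare matters, fraction f of resources going to animal welfare work, and per-dollar impacts a (animal work, which only counts if animals matter) and h (human work), the expected impact per dollar is roughly

f · p · a + (1 − f) · h

so in "only humans matter" worlds you still realize (1 − f) · h, and in "animals matter" worlds you get f · a + (1 − f) · h.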

Are you suggesting something like that, or are you pointing to a deeper rule of "not concluding that the effects of other people's lives are net negative" when considering the second-order effects of whether to save them?

Note that the cost-effectiveness of epidemic/pandemic preparedness I got of 0.00236 DALY/$ is still quite high.


Point well-taken. 

I appreciate you writing and sharing those posts trying to model and quantify the impact of x-risk work and question the common arguments given for astronomical EV.

I hope to take a look at those in more depth sometime and critically assess what I think of them. Honestly, I am very intrigued by engaging with well-informed disagreement around the astronomical EV of x-risk-focused approaches. I find your perspective here interesting, and I think engaging with it might sharpen my own understanding.

:)

 

Interesting! This is a very surprising result to me, because I am mostly used to hearing about how cost-effective pandemic prevention is, and this estimate seems to disagree with that.

Shouldn't this be a relatively major point against prioritizing biorisk as a cause area? (At least without taking into account strong longtermism and the moral catastrophe of extinction.)
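For my own sense of scale (my arithmetic, rounded): 0.00236 DALY/$ works out to 1 / 0.00236 ≈ 424 $ per DALY averted.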
