I have been on a mission to do as much good as possible since I was quite young, and I decided to prioritize X-risk and improving the long-term future at around age 13. Toward this end, growing up I studied philosophy, psychology, social entrepreneurship, business, economics, the history of information technology, and futurism.
A few years ago I wrote a book draft, tentatively titled "Ways to Save The World" or "Paths to Utopia," which imagined broad, innovative strategies for preventing existential risk and improving the long-term future.
I discovered Effective Altruism in January 2022, while preparing to start a Master's in Social Entrepreneurship at the University of Southern California. After a deep dive into EA and rationality, I decided to take a closer look at the possibility of AI-caused X-risk and lock-in, and moved to Berkeley to do longtermist research and community building work.
I am now researching "Deep Reflection," processes for determining how to get to our best achievable future, including interventions such as "The Long Reflection," "Coherent Extrapolated Volition," and "Good Reflective Governance."
Thank you for sharing, Arden! I have similarly been thinking that longtermism is an important crux for making AI go well. I think it's very possible that we could avoid x-risk and have really good outcomes in the short term, yet put ourselves on a path where we predictably miss out on nearly all value in the long term.
I really enjoyed this! It's a very important crux for how well the future goes. You may be interested to know that Nick Bostrom talks about this; he calls them super-beneficiaries.
I have been thinking that one solution to this could be people self-organizing and spending more of their off-time and casual hours working on these issues in crowd-sourced ways. I would be really curious to hear your thoughts on such an approach. I feel like there is enough funding that if people were able to collectively produce something promising, it could really go somewhere. I have thought a lot about what kind of organizational structures would allow this:
Something like a weekly group meeting where people bring their best ideas, discuss them, and iteratively develop project ideas, media, research, and anything else that could be high impact. It would be similar to the EA Fellowship or programs like Blue Dot and the Astra Fellowship, except more decentralized and project-focused, with a coordination mechanism between the different groups to funnel the best projects from all of them to the top.
I designed a pretty elaborate mechanism along these lines in the past for something else, but it seems like it could work well here too. I don't really have time to work on this much myself right now, unless perhaps I could get funding, which is ironically the bottleneck I am primarily focused on at the moment.
But again, would be very curious to hear your thoughts on this kind of approach.
Interesting! I think I didn’t fully distinguish between two possibilities:
I think both types of AW are worth pursuing, but the second may be even more valuable, and I think this is the type I had in mind at least in scenario 3.
While there are different value functions, I believe there is a best possible value function.
This may exist at the level of physics, perhaps something to do with qualia that we don't yet understand, and I think it would be useful to have an information theory of consciousness, which I have been thinking about.
But ultimately, even if it's not at the level of physics, I believe you can in theory postulate a meta-social choice theory: one that evaluates every possible social choice theory, under every possible circumstance, for every possible mind or value function, and finds some sort of game-theoretic equilibrium on which all value functions, the social choice theories for evaluating those functions, and the meta-social choice theories for deciding between choice theories converge as the universal best possible set of moral principles. I think this is fundamentally a question of axiology: which moral choice in any given situation creates the most value across the greatest number of minds, entities, value functions, and moral theories? I believe this question has an objective answer. There is actually a best thing to do, as well as good things and bad things to do, even if we don't know what they are. Moral progress is possible and real, not a meaningless concept.
I very much agree that we need less deference and more people thinking for themselves, especially on cause prioritization. I think this is particularly important for people with high talent or skill in this direction, as it can be quite hard to do well.
It's a huge problem that the current system is not great at valuing and incentivizing this type of work, as I think this pushes many of the people who could be highly competent at cause prioritization to go in other directions. I've been a strong advocate for this for a long time.
I think it is somewhat hard to address systematically, but I'm really glad you are pointing this out and inviting collaboration on your work. I do think concentration of power is extremely neglected and one of the things that most determines how well the future will go (not just in terms of extinction risk but in terms of upside/opportunity-cost risks as well).
Going to send you a DM now!
I very much agree with this and have been struggling with a similar problem regarding achieving high-value futures versus mediocre ones.
I think there may be some sort of "Fragile Future Value Hypothesis," somewhat related to Will MacAskill's "No Easy Eutopia" (and the essay that follows it in the series) and somewhat isomorphic to "The Vulnerable World Hypothesis." The idea is that there may be many path dependencies, potentially leading to many low- and medium-value future attractor states we could end up in, because we are, in expectation, somewhat clueless about which crucial considerations matter, and if we act wrongly on any of them, we could lose most or even nearly all future value.
I also agree that making the decision-makers working on AI highly aware of this could be an important solution. I've been thinking that the problem isn't so much that people at the labs don't care about future value; they are often quite explicitly utopian. It just seems to me that they don't have much awareness that near-best futures might be highly contingent and very difficult to achieve, and the illegibility of this fact means they are not really trying to be careful about which path they set us on.
I also agree that trying to get advanced AI working on these types of issues as soon as it is able to meaningfully assist could be an important solution, and I intend to start working on this as one of my main objectives. That said, I've been a bit more focused on macrostrategy than philosophy, because I think macrostrategy might be more feasible for current or near-future AI, and if we get into the right strategic position, that could then position us to figure out the philosophy, which I think is going to be a lot harder for AI.