Hi Fin,
I have a lot of questions, so I figured I would just share all of them and you could respond to the ones you want to.
- I think Forethought is a super cool institution. What advice would you have for someone who wanted to work there as a researcher? Do you think it's important to have a strong understanding of how LLMs work?
- I made this post where I categorized flourishing cause areas based on "How To Make The Future Better." I thought I'd share. I'm curious if this categorization generally aligns with how you think about the problem.
- Locking-in one’s values
- Ensuring the future is aligned with the correct values
- Working towards viatopia
- Promoting futures with more moral reflection
- Improving the ability for people with different views to get their desired futures
- Ensuring future people are able to create a good future
- Keeping humanity’s options open
- Improving global stability
- Improving future humans’ decision-making
- Empowering responsible actors
- Speeding up progress
- I made this post which is an overview of longtermism's ideas, writings, individuals, institutions, and history. I thought I'd share since you made the longtermism website.
- The Better Futures series assumes that the future will be net-positive by default. To me, the ideas presented in the series (strong self-modification, modification of descendants, selection of beliefs by evolutionary pressures) indicate that we should expect future humans to be very different from us, and that, as a result, we should expect the future to be neutral in expectation. Do you agree with this logic or do you think the future will be net-positive by default? Additionally, why?
- Currently, there is a wide range of ideas about how a post-AGI future will go and what features it will contain. To me, this strongly indicates that the post-AGI future could go in a very broad range of ways and that we should prepare for the many different ways it could go. At the same time, I get the sense that Forethought has a very specific vision of how a post-AGI future will go (there will be an intelligence explosion, tools for epistemics will be beneficial, we might begin acquiring resources in other solar systems, small sets of actors could use AGI in malicious ways). I'm wondering how you decide which ideas you think are likely, and whether you guys have any measures in place to ensure you're receiving criticism of your ideas so you don't create an epistemic bubble.
- I understand that you have done some work related to space governance. A criticism I have of working in this field is that (1) it seems to have been very intractable, given the lack of space treaties; (2) if any great power has a decisive advantage, global treaties won't matter; (3) even if you are able to get a law or treaty passed, corporate or state interests could easily override it later on; and (4) there's probably a low chance of even getting into a position where you could influence this stuff. As such, if you think it's valuable for additional people to work in the field, I'm wondering why.
- It seems like longtermism is an unhelpful idea since it requires people to believe that our actions could persist for millions of years. I personally am pretty skeptical of this, although I do think it is possible. It also seems like the idea has been somewhat harmful to EA as a movement since people can always point out that some of the founders of the movement are focused on helping people millions of years from now, which sounds pretty crazy. I'm wondering if you agree with this assessment.
- In "How To Make The Future Better," MacAskill argues that we should make AIs encourage humans to be good people and use them as a source of moral reflection. This seems like it could be deeply problematic if moral sense theory is true but AIs lack a moral sense. Do you agree with this?
Given this combination of views, I'm surprised that Will doesn't support what @Holly Elmore ⏸️ 🔸 calls "Pause NOW" and instead wants to see a pause later (after we have human-level AI). I'm curious whether your own views are similar to Will's or how they differ. (My own "expected value of the future, given survival" is, I would say, similarly pessimistic, but I'm reluctant to put it into numbers because I'm very unsure how to quantify it.)
Aside from what Holly said in the linked comment, which I agree with, another argument more relevant to the current discussion is that many opportunities for making the future better seem to exist during the AI transition, including the early parts of it, so by not pausing ASAP (and currently having few resources for such interventions), we're permanently giving up these opportunities. Conversely, by pausing NOW, we buy more time to think and strategize about how to better intervene on these opportunities, or otherwise lay the groundwork for them.
For example, during the pause, we could:
Such interventions could mean the difference between the first human-level AIs being competent and critical moral/philosophical advisors, or independent moral (and safe) agents, vs uncritically doing what humans seem to want and/or giving bad/incompetent/sycophantic "advice" (when humans think to ask for it), which seemingly can make a big difference to how well the future goes.
What do you think about this argument, and overall about pause now vs later?
Thanks for this.
In each of the examples you give, I'm thinking that the pause would be significantly more beneficial (plausibly by 10x) if we pause when AI is already capable enough that it can significantly help us solve the issue. In general, they seem like the kinds of issues where AI could massively accelerate progress.
So if I'm choosing between an international pause now vs an international pause in 2 years, I choose the latter. (I assume we're talking about international pauses here rather than just the U.S., but lmk if you also support a unilateral pause now!)
I do find Holly's point compelling, that it might be damaging to quibble about exactly when we pause if that reduces the chance of a pause happening at all. And today we are very far from a pause actually happening, and one may well be needed in two years' time, so I def support efforts to get us closer to a pause!
I'm hesitant about saying "pause now" because I actually think a different policy might be much more effective. But I think a world where we were about to do an international pause would be better than the actual world.
(I want to think more about this topic and all of this is v tentative.)
Why assume that there can only be one pause? Pausing now could make a later pause both more likely and more useful, by building the infrastructure and precedent for pausing, and by making subsequent AIs more aligned and differentially more productive in areas that we care about. If we end the first pause only after we've solved the problem of building aligned AIs that are philosophically and strategically competent, that would seemingly make subsequent pauses much easier.
I wonder if you're thinking that we won't be able to pause long enough to make significant progress on these problems? I can see that if we only have the "willpower" for a single short pause, then it becomes unclear when to best use it.
I have been warning for several years that AI could be differentially bad at philosophy and long-horizon strategy (due in part to AI training requiring massive amounts of training data and/or fast and cheap feedback loops, which are lacking for these fields, and in part to lack of understanding of e.g. metaphilosophy). So if we don't pause now (and use the time to fix this issue) then by the time we do pause, we'll likely have AIs that can accelerate other fields (such as math/coding/science/tech and manipulating humans) much more than the fields that are crucial for Better Futures.
Worse, we may end up with AIs that decelerate (in an absolute sense) hard-to-verify fields like philosophy and long-horizon strategy, because these AIs are better at coming up with plausible-sounding ideas and arguments and convincing humans of their truth, or at persuading humans that their own bad ideas are actually good (which is already being reported under "AI psychosis" and "sycophancy"), than at making real progress in these fields.
Sorry for the slow reply!
Thanks, this is a helpful perspective.
I've normally thought from a frame of "we've got limited chips to spend on pausing, when is it best to spend them". I think this frame is reasonable if you're worried about irresponsible developers catching up or tradeoffs with the current gen's desire to survive.
But it is true that a pause today might make a pause in the future more likely.
Otoh, it could also make it less likely if ppl perceive that nothing concretely useful comes out of it, which is my worry with pausing today. Like, i think ~nothing useful would have come from pausing shortly after GPT-4 was released.
Do you think this is possible with today's AI capabilities? I'd have thought you can't match human philosophy and strategy yet, but we are def getting closer.
Also, how do you think about whether to slow down vs pause, holding fixed the total delay relative to 'full speed ahead'? I'd have thought slow down is better re iterating on alignment as problems arise and re building philosophically competent AIs.
Interesting. I normally expect AI to accelerate philosophy and strategy less than math/coding but more than science/tech. Science/tech face experimental bottlenecks, whereas for philosophy the only input is cognitive labour. But you're right, if AI can't do philosophy/strategy properly, it won't speed it up at all! So far, AI systems have been pretty good at these skills though?