titotal

Computational Physicist
8845 karma · Joined

Bio

I'm a computational physicist, and I generally donate to global health. I am skeptical of AI x-risk and of big-R Rationalism, and I intend to explain why in great detail.

Comments
741

Drexler's previous predictions seem to have gone very poorly. This post evaluated the 30-year predictions that a group of seven futurists made in 1995, and Drexler came in last: he predicted that by 2026 we would have complete Drexlerian nanotech assemblers, be able to reanimate cryonic suspendees, have uploaded minds, and have a substantial portion of our economy outside the solar system.

Given this track record of extremely poor long-term prediction, why should I be interested in the predictions that Drexler makes today? I'm not trying to shit on Drexler as a person (and he has had a positive influence in inspiring scientists), but it seems like his epistemological record is not very good. 

I'm broadly supportive of this type of initiative, and it seems like it's definitely worth a try (the downsides seem low compared to the upsides). However, I suspect that, as with most apparently good ideas, scrutiny will turn up problems.

One issue I can think of: in this analysis, a lot of the company's competitive advantage comes from the good reputation of the charitable foundation running it. However, running a large company competitively sometimes involves making tough, unpopular decisions, like laying off portions of your workforce. So I don't think the assumption that the charity-owned company can act exactly like a regular company necessarily holds up: doing so risks eroding the reputational advantage that the competitive edge depends on.

I have many disagreements, but I'll focus on one: I think point 2 is in contradiction with points 3 and 4. To put it plainly: the "selection pressures" go away pretty quickly if we don't have reliable methods of knowing or controlling what the AI will do, or of preventing it from doing noticeably bad stuff. That applies to the obvious stuff, like an AI trying to prematurely go Skynet, but it also applies to more mundane stuff, like getting an AI to act reliably more than 99% of the time.

I believe that if we manage to control AI enough to make widespread rollout feasible, then it's pretty likely we've already solved alignment well enough to prevent extinction. 

I'm not super excited about revisiting the model, to be honest, but I'll probably take a look at some point. 

What I'd really like to see, and what I haven't noticed from a quick look through the update, is some attempt to validate the models against actual data. For example, I think METR comes off looking pretty good right now with their exponential model of horizon growth, which has held up for nearly a year post-publication. The AI2027 model's prediction of superexponential growth has not. So I think they have to make a pretty strong case for why I should trust the new model.
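(For concreteness, here is a minimal sketch of what fitting an exponential horizon-growth model looks like. The numbers are made-up placeholders, not METR's actual data: exponential growth means log(horizon) is linear in time, so the fit is a straight line whose slope gives a doubling time; a superexponential model would instead have that doubling time shrinking over time.)

```python
# Minimal sketch of fitting an exponential horizon-growth model.
# The data points below are illustrative placeholders, NOT METR's measurements.
import numpy as np

# (years since an arbitrary reference date, task horizon in minutes)
years = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
horizon_minutes = np.array([4.0, 8.5, 15.0, 33.0, 60.0])

# Exponential growth: horizon(t) = horizon(0) * 2**(t / doubling_time),
# so log2(horizon) is linear in t and the slope is 1 / doubling_time.
slope, intercept = np.polyfit(years, np.log2(horizon_minutes), 1)
doubling_time_years = 1.0 / slope

print(f"Fitted doubling time: {doubling_time_years:.2f} years")
# A superexponential model would predict log2(horizon) curving upward,
# i.e. the fitted doubling time shrinking as you use later data.
```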

I think the problem here is that novel approaches are substantially more likely to fail, precisely because they are untested and unproven. This isn't a big deal in areas where you can try lots of things and sift through the results, but in something like an election you only get feedback about once a year or so. Worse, the feedback is extremely murky, so you don't know whether it was your intervention or something else that produced the outcome you care about.

One other issue I thought of since my other comment: you list several valid critiques that the AI made which you'd already identified but which were not in the provided source materials. You state that this gives additional credence to the helpfulness of the models:

three we were already planning to look into but weren't in the source materials we provided (which gives us some additional confidence in AI’s ability to generate meaningful critiques of our work in the future—especially those we’ve looked at in less depth).

However, just because the critique is not in the provided source materials doesn't mean it's not in the wider training data of the LLM. So, for example, if GiveWell talked about the identified issue of "optimal chlorine doses" in a blog comment or something, and that blog got scraped into the LLM's training data, then the critique is not a sign of LLM usefulness: the model may just be parroting your own findings back to you.

Overall this seems like a sensible, and appropriately skeptical, way of using LLMs in this sort of work.

In regards to improving the actual AI output, it looks like there is insufficient sourcing of claims in what it puts out, which is going to slow you down when you actually try to check the output. I'm looking at the red team output here on water turbidity. This was highlighted as a real contribution by the AI, but the output has zero sourcing for its claims, which presumably made it much harder to check for validity. If you got this critique from a real, human red-teamer, they would make it significantly easier to check that the critique was valid and sourced.

One question I have to ask is whether you are measuring how much time and effort is being expended on managing the output of these LLMs and sifting out the actually useful recommendations. When assessing whether the techniques are a success, you have to consider the counterfactual case where that time was instead spent on human research time looking more closely at the literature, for example.

I would not describe the fine-tuning argument and the Fermi paradox as strong evidence in favour of the simulation hypothesis. I would instead say that they are open questions for which a lot of different explanations have been proposed, with the simulation hypothesis offering only one of many possible resolutions.

As to the "importance" argument, we shouldn't count speculative future events as evidence of the importance of the present. I would say the mid-20th century was more important than today, because that's the closest we ever got to nuclear annihilation (plus, like, WW2).

I'd like to see more outreach to intellectual experts outside of the typical EA community. I think there are lots of people with knowledge and expertise that could be relevant to EA causes, but who barely know that it exists, or have disagreements with fundamental aspects of the movement. Finding ways to engage with these people could be very valuable to get fresh perspectives and it could help grow the community. 

I don't know how exactly to do this, but maybe something like soliciting guest posts from professors or industry experts, or AMA style things or dialogues. 

Before I can or should try to write up that take, I need to fact-check one of my take-central beliefs about how the last couple of decades have gone down.  My belief is that the Open Philanthropy Project, EA generally, and Oxford EA particularly, had bad AI timelines and bad ASI ruin conditional probabilities; and that these invalidly arrived-at beliefs were in control of funding, and were explicitly publicly promoted at the expense of saner beliefs.


We don't know whether AGI timelines or ASI ruin conditional probabilities were "bad", because neither AGI nor ASI ruin has happened yet. If you want to know what Open Phil's probabilities are and whether they disagree with your own, you should just ask that directly. My impression is that there is a wide range of views on both questions among EA org leadership.
