
Aaron_Scher

666 karma · Claremont, CA, USA

Bio

I'm Aaron. I've done university group organizing at the Claremont Colleges for a bit. My current cause prioritization is AI Alignment.

Comments (104)

As someone reviewing applications, how do you evaluate independent research produced outside of a fellowship or academic context?

I'm not sure how to answer this. I try to evaluate all work based on its quality; whether a project was single-authored also matters a fair amount (and high-quality single-author work is an especially strong signal).

Is there a threshold of rigor or novelty below which it hurts more than it helps to include it?

Maybe for rigor, probably not for novelty. Applicants and other researchers should, of course, be up front about what their contributions are.

And do you have suggestions for how early-career people in this transition can get lightweight feedback on research directions before investing weeks into a project?

Weeks sounds like it might be a lot. I encourage people to do Apart Research Sprints or other hackathon-style events, which are shorter. I'm not really sure about getting lightweight feedback. In my experience, when junior people ask me for feedback on a project idea, the idea is usually too broad or vague for me to know if it's a good project, and they have usually put less than 30 minutes of effort into it. So maybe my advice is something like: "if you are going to ask a more established researcher for feedback on your project plan, you should have already put a couple of hours into the project, including surveying the relevant literature, coming up with a detailed project plan, and doing a little bit of de-risking". I'm not sure, maybe that's more intense than I endorse. Fortunately, even without the goal of getting feedback from somebody else, these are useful steps to begin a project with.

I will also note that prior work does not always have to be extremely relevant. Academia exists and is by far the most common place where people learn research skills.

A couple of my thoughts, written quickly:

  • Sorry, it's a bummer to be scooped or even just feel scooped.
  • Most academic fields have, I think, less cross-org coordination than AI governance. I would be hesitant to push for much more cross-org coordination in this space, given that it would be a departure from (what I view as) the norm in other fields.
  • As I read applications to hire AI governance researchers, one of my big questions is "has this person done relevant work before, successfully?". I don't think it would be much of a mark against that work if it was similar to other work released at the same time, as long as there didn't seem to be plagiarism and there did seem to be novel contributions.
  • Relatedly, multiple research teams taking independent stabs at the same question is often useful for reaching a higher quality of overall work, as they sometimes come up with different ideas/emphases/etc.
  • Some researchers have said (though I'm unsure where I land) that you almost never actually get scooped: usually projects differ in some important way that you can emphasize in your output. You can also bootstrap from that work to make your project even better (but again, be clear about your original contributions vs. others').

I found over 10% of fellows did another fellowship after their fellowship. This doesn't feel enormously efficient.

Seems plausibly fine to me. If you think about a fellowship as a form of "career transition funding + mentorship", it makes sense that this will take ~3 months (one fellowship) for some people and ~6 months (two fellowships) for others, and that some people either won't transition at all or will transition later.

I only skimmed the post, but I want to say that it seems good to write posts like this, and I am surprised and slightly disheartened by the limited engagement you have gotten here and in various comments. These seem like very important topics, thanks for working on them! 

Strong upvoted but I figured I should comment as well. I agree with Ryan that the effect on chip supply and AI timelines is one of the most important dynamics, perhaps the most important. It's a bit unclear which direction it points, but I think it probably swamps everything else in its magnitude, and I was sad to see that this post doesn't discuss it. 

I don't have the time right now to find exactly which comparison I am thinking of, but I believe my thought process was basically "the rate of new people getting AI PhDs is relatively slow"; this is of course only one measure of the number of researchers. Maybe I used data similar to that here: https://www.lesswrong.com/s/FaEBwhhe3otzYKGQt/p/AtfQFj8umeyBBkkxa

Alternatively, AI academics might be becoming more sociable – i.e. citing their friends' papers more, and collaborating more on papers. I don’t find either of the explanations particularly convincing. 

FWIW, I find this somewhat convincing. The increased collaboration on papers seems like it could be downstream of higher expectations for the number of papers produced. My sense is that grad students are expected to write more papers now than they used to, and one way to accomplish this is to collaborate more.

I expect that if you compared data on the total number of researchers in the AI field against the number of papers, you would see the second rising a little faster than the first (I think I've seen this trend, but don't have the numbers in front of me). If the two were rising at the same rate, that would basically indicate no change in the difficulty of finding ideas, because research hours would be scaling with the number of papers. Again, I expect the trend is actually papers rising faster than people, which would make it seem like ideas are getting easier to find.
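To make that comparison concrete, here's a minimal sketch of the ratio logic, using made-up illustrative numbers rather than real bibliometric data:

```python
# Toy comparison of researcher growth vs. paper growth; all numbers are
# hypothetical and chosen only to illustrate the ratio logic above.
years = [2014, 2016, 2018, 2020, 2022]
researchers = [20_000, 28_000, 40_000, 56_000, 80_000]  # assumed ~19%/yr growth
papers = [10_000, 16_000, 26_000, 42_000, 68_000]       # assumed ~27%/yr growth

for year, r, p in zip(years, researchers, papers):
    # If papers per researcher rises while hours per researcher stay flat,
    # that cuts against "ideas are getting harder to find".
    print(f"{year}: {p / r:.2f} papers per researcher")
```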

I think other explanations, like norms and culture around research output expectations, collaboration, and how many references you are expected to include, are more to blame.

Overall I don't find the methodology presented here, of just looking at the number of authors and the number of references, particularly useful for figuring out whether ideas are getting harder to find. It's definitely some evidence, but I think there are quite a few plausible alternative explanations.

Language models have been growing more capable even faster. But with them there is something very special about the human range of abilities, because that is the level of all the text they are trained on.

This sounds like a hypothesis that makes predictions we can go check. Did you have any particular evidence in mind? This and this come to mind, but there is plenty of other relevant stuff, and many experiments that could be quickly done for specific domains/settings. 
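For concreteness, here is a minimal sketch of one such quick check; the compute levels, model scores, and human baseline below are placeholder numbers, not real results:

```python
import numpy as np

# Hypothetical scaling data: does performance flatten near a human baseline?
human_score = 0.80                                  # assumed human baseline
compute = np.array([1e20, 1e21, 1e22, 1e23, 1e24])  # training FLOP (hypothetical)
scores = np.array([0.35, 0.55, 0.70, 0.78, 0.81])   # hypothetical model scores

# A sharp drop in gains per 10x of compute right around human_score would
# support the "plateau at human level" hypothesis; steady gains would not.
for i in range(1, len(compute)):
    side = "above" if scores[i] > human_score else "at/below"
    print(f"10x to {compute[i]:.0e} FLOP: +{scores[i] - scores[i - 1]:.2f} ({side} human level)")
```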

Note that you say "something very special" whereas my comment is actually about a stronger claim like "AI performance is likely to plateau around human level because that's where the data is". I don't dispute that there's something special here, but I think the empirical evidence about plateauing — that I'm aware of — does not strongly support that hypothesis. 

We estimate that

Point of clarification: it seems like FutureSearch is largely powered by calls to AI models. When you say "we", what do you mean? Has a human checked the entire reasoning process that led to the results you present here?

My understanding of your main claim: If AGI is not a magic problem-solving oracle and is instead limited by needing to be unhobbled and integrated with complex infrastructure, it will be relatively safe for model weights to be available to foreign adversaries. Or at least key national security decision makers will believe that's the case. 

Please correct me if I'm wrong. My thoughts on the above:

Where is this relative safety coming from? Is it from expecting that adversaries won't be able to figure out how to do unhobbling, or to steal the necessary secrets to do it? Is it from expecting that unhobbling AIs and building infrastructure around them will be a really hard endeavor?

The way I'm viewing this picture, AI that can integrate all across the economy, even if that takes substantial effort, is a major threat to global stability and US dominance. 

I guess you can think about the AI-for-productive-purposes supply chain as having two components: developing the powerful AI model (initial development), and unhobbling it / integrating it into workflows / etc. (unhobbling/integration). You're arguing that the second of these will be an acceptable place to focus restrictions. My intuition says we will want restrictions on both, but more on whichever part is most expensive or excludable (e.g., AI chips being concentrated is a point for initial development). It's not clear to me what the cost of each supply-chain step is. Currently, pre-training costs look higher than fine-tuning costs (a point for initial development); but actually integrating AIs across the economy seems very expensive to do, because the economy is really big (a point for unhobbling/integration). This also depends a lot on the systems at the time and how easy they are to work with.
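As a toy illustration of why the integration side could dominate on cost, here's a back-of-the-envelope sketch in which every number is an assumption made up for illustration:

```python
# Toy comparison of the two supply-chain steps; every number below is an
# assumption for illustration, not a real cost estimate.
pretraining_cost = 1e8           # assumed one-time frontier training run, $100M
integration_cost_per_firm = 5e4  # assumed per-firm integration cost, $50k
num_firms = 6e6                  # assumed number of firms in a large economy

total_integration = integration_cost_per_firm * num_firms
print(f"Initial development:      ${pretraining_cost:,.0f}")
print(f"Economy-wide integration: ${total_integration:,.0f}")
# Under these assumptions integration dwarfs training, even though any
# single integration is cheap, simply because the economy is really big.
```

The point is only that a small per-unit cost multiplied across a whole economy can swamp a large one-time cost.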
