In complete generality, you could write effective labor as
$$L_{\text{eff}} = F(L, C_{\text{inf}}, C_{\text{train}}).$$
That is, effective labor is some function of the number of human researchers we have, the effective inference compute we have (quantity of AIs we can run) and the effective training compute (quality of AIs we trained).
The perfect substitution claim is that once training compute is sufficiently high, then eventually we can spend the inference compute on running some AI that substitutes for human researchers. Mathematically, for some threshold $\bar{C}_{\text{train}}$, once $C_{\text{train}} \geq \bar{C}_{\text{train}}$,
$$F(L, C_{\text{inf}}, C_{\text{train}}) = L + \frac{C_{\text{inf}}}{c},$$
where $c$ is the compute cost to run the system.
So you could think of our analysis as saying, once we have an AI that perfectly substitutes for AI researchers, what happens next?
Now of course, you might expect substantial recursive self-improvement even with an AI system that doesn't perfectly substitute for human researchers. I think this is a super interesting and important question. I'm trying to think more about it, but it's hard to make progress because it's unclear what $F$ looks like. But let me try to gesture at a few things. Let's fix $C_{\text{train}}$ at some sub-human level.
If you assume, say, Cobb-Douglas, i.e.
$$F = L^{1-\alpha}\left(\frac{C_{\text{inf}}}{c}\right)^{\alpha},$$
where $\alpha$ denotes the share of labor tasks that AI can do, then you'll pick up another $\alpha$ in the explosion condition, i.e. $r > 1$ will become $\alpha r > 1$. This captures the intuition that as the fraction of tasks an AI can do increases, the explosion condition gets easier and easier to hit.
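For a quick sanity check on that claim, here is a minimal derivation under assumptions I'm adding (a law of motion $\dot{A} = F^{\lambda}A^{1-\beta}$ with returns to research $r = \lambda/\beta$, and an AI workforce that scales with the software level, $C_{\text{inf}}/c \propto A$). Holding human labor $L$ fixed,
$$\dot{A} = F^{\lambda}A^{1-\beta} \propto \left(L^{1-\alpha}A^{\alpha}\right)^{\lambda}A^{1-\beta} \propto A^{1+\alpha\lambda-\beta},$$
which blows up in finite time iff $\alpha\lambda - \beta > 0$, i.e. iff $\alpha r > 1$.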
Here is a fleshed-out version of Cheryl's response. Let's suppose actual research capital is $qK$, but we just used $K$ in our estimation equation.
Then the true estimation equation is
$$\log\left(\frac{qK}{L}\right) = \text{const} + \sigma\log\left(\frac{w}{r}\right) + \varepsilon;$$
re-arranging, we get
$$\log\left(\frac{K}{L}\right) = \text{const} - \log q + \sigma\log\left(\frac{w}{r}\right) + \varepsilon.$$
So if we regress $\log(K/L)$ on a constant and $\log(w/r)$, then the coefficient on $\log(w/r)$ is still $\sigma$ as long as $q$ is independent of $\log(w/r)$.
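As a quick numerical illustration of that independence condition, here is a hedged sketch on synthetic data (all parameter values made up): OLS recovers $\sigma$ when $\log q$ is independent of relative prices, and is biased when it isn't.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 10_000, 0.5                  # true elasticity (made-up value)

log_w_over_r = rng.normal(0.0, 1.0, n)  # relative prices
eps = rng.normal(0.0, 0.1, n)

# Mismeasurement factor q: independent of prices vs. correlated with them
for name, log_q in [
    ("q independent", rng.normal(0.0, 0.3, n)),
    ("q correlated ", 0.5 * log_w_over_r + rng.normal(0.0, 0.3, n)),
]:
    # Observed ratio: log(K/L) = const - log(q) + sigma * log(w/r) + eps
    log_K_over_L = -log_q + sigma * log_w_over_r + eps
    X = np.column_stack([np.ones(n), log_w_over_r])
    coef = np.linalg.lstsq(X, log_K_over_L, rcond=None)[0]
    print(f"{name}: sigma_hat = {coef[1]:.3f}")   # ~0.50 vs. ~0.00
```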
Nevertheless, I think this should increase your uncertainty in our estimates, because there is clearly a lot going on behind the scenes that we might not fully understand, like how research vs. training compute is measured.
Note that if you accept this, our estimation of $\sigma$ in the raw compute specification is wrong.
The cost-minimization problem becomes
$$\min_{L,\,C}\; wL + rC \quad \text{s.t.} \quad F(L, AC) = \bar{F}.$$
Taking FOCs and re-arranging,
$$\log\left(\frac{AC}{L}\right) = \text{const} + \sigma\log\left(\frac{Aw}{r}\right).$$
So our previous estimation equation was missing an $A$ on the relative prices. Intuitively, we understated the degree to which compute was getting cheaper. Now $A$ is hard to observe, but let's just assume it's growing exponentially with an 8-month doubling time, per this Epoch paper.
Imputing this guess of $A$, and estimating via OLS with firm fixed effects, gives us $\hat{\sigma} = \ldots$ with standard errors.
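For concreteness, here is a minimal sketch of that imputation-plus-fixed-effects step on synthetic data (the firm structure, trends, and every parameter value are made up; the real estimation uses the actual firm panel):

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 0.5                                   # true elasticity (made-up)
firms, months = 4, 60
firm = np.repeat(np.arange(firms), months)
t = np.tile(np.arange(months), firms).astype(float)

log_A = (t / 8.0) * np.log(2.0)               # A doubles every 8 months
log_w_over_r = 0.02 * t + rng.normal(0, 0.2, firms * months)
firm_fe = rng.normal(0, 1, firms)[firm]

# Estimation equation with the A on relative prices:
#   log(AC/L) = FE + sigma * log(A * w / r) + eps
x = log_A + log_w_over_r
y = firm_fe + sigma * x + rng.normal(0, 0.1, firms * months)

# Firm fixed effects via within-firm demeaning
def demean(v):
    return v - (np.bincount(firm, v) / months)[firm]

xd, yd = demean(x), demean(y)
print("sigma_hat =", (xd @ yd) / (xd @ xd))   # should be close to 0.5
```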
Note that this doesn't change the estimation results for the frontier-experiments specification, since the $A$ in $\frac{AC}{AC_{\text{train}}}$ just cancels out.
I spent a bit of time thinking about this today.
Let's adopt the notation in your comment and suppose that $\rho$ is the same across research sectors, with common $\lambda$. Let's also suppose common $\beta$.
Then we get blow-up in $F$ iff $\lambda/\beta > 1$.
The intuition for this result is that when $\rho < 0$, you are bottlenecked by your slower growing sector.
If the slower growing sector is cognitive labor, then asymptotically $F \approx \text{const} \cdot A_{\text{cog}}$, and we get $\dot{A}_{\text{cog}} \propto A_{\text{cog}}^{1+\lambda-\beta}$, so we have blow-up iff $\lambda/\beta > 1$.
If the slower growing sector is experimental compute, then there are two cases. If experimental compute is blowing up on its own, then so is cognitive labor, because by assumption cognitive labor is growing faster. If experimental compute is not blowing up on its own, then asymptotically $F \approx \text{const} \cdot A_{\text{exp}}$ and we get $\dot{A}_{\text{exp}} \propto A_{\text{exp}}^{1+\lambda-\beta}$. Here we get a blow-up iff $\lambda/\beta > 1$.[1]
In contrast, if $\rho > 0$ then $F$ is approximately the fastest growing sector. You get blow-up in both sectors if either sector blows up. Therefore, you get blow-up iff $\lambda/\beta > 1$.
So if you accept this framing, complements vs. substitutes only matters if some sectors are blowing up but not others. If all sectors have high enough returns to research, then we get an intelligence explosion no matter what. This is an update for me, thanks!
I'm only analyzing blow-up conditions here. You could get e.g. double exponential growth here by having $\beta = 0$ and $F$ growing exponentially.
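To make the bottleneck logic concrete, here is a hedged simulation sketch. I'm assuming sector laws of motion $\dot{A}_i = F^{\lambda}A_i^{1-\beta}$ and a symmetric CES aggregator $F = \left(\tfrac{1}{2}A_1^{\rho} + \tfrac{1}{2}A_2^{\rho}\right)^{1/\rho}$; the functional forms and all parameter values are made up. With $\lambda/\beta > 1$ the system blows up in finite time for both signs of $\rho$, and with $\lambda/\beta < 1$ it doesn't.

```python
import numpy as np

def blow_up_time(rho, lam, beta, A0=(1.0, 2.0), dt=1e-3, t_max=50.0, cap=1e9):
    """Euler-integrate dA_i/dt = F^lam * A_i^(1-beta) with a CES aggregate F.

    Returns the (approximate) finite blow-up time, or None if F stays
    below `cap` until t_max."""
    A = np.array(A0, dtype=float)
    t = 0.0
    while t < t_max:
        F = (0.5 * A[0] ** rho + 0.5 * A[1] ** rho) ** (1.0 / rho)
        if F > cap:                      # numerical stand-in for blow-up
            return t
        A = A + dt * F**lam * A ** (1.0 - beta)
        t += dt
    return None

for rho in (-2.0, 0.5):                  # complements vs. substitutes
    for lam, beta in ((0.6, 0.3), (0.2, 0.4)):   # r = 2 vs. r = 0.5 (made-up)
        t_star = blow_up_time(rho, lam, beta)
        verdict = f"blow-up at t ~ {t_star:.1f}" if t_star else "no blow-up by t = 50"
        print(f"rho = {rho:+.1f}, r = {lam / beta:.1f}: {verdict}")
```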
This is a good point, we agree, thanks! Note that you need to assume that the algorithmic progress that gives you more effective inference compute is the same as the algorithmic progress that gives you more effective research compute. This seems pretty reasonable, but it's worth a discussion.
Although note that this argument only works with the CES-in-compute formulation. For the CES in frontier experiments, you would have the ratio $\frac{AC}{AC_{\text{train}}}$, so the $A$ cancels out.[1]
You might be able to avoid this by adding the $A$'s in a less naive fashion. You don't have to train larger models if you don't want to. So perhaps you can freeze the frontier, and then you get $\frac{AC}{C_{\text{train}}}$? I need to think more about this point.
Thanks for the insightful comment.
I take your overall point to be that the static optimization problem may not be properly specified. For example, costs may not be linear in labor size because of adjustment costs to growing very quickly, or costs may not be linear in compute because of bulk discounting. Moreover, these non-linear costs may be changing over time (e.g., adjustment costs might only matter in 2021-2024, as OpenAI and Anthropic have been scaling labor aggressively). I agree that this would bias the estimate of $\sigma$. Given the data we have, there should be some way to at least partially deal with this (e.g., by adding lagged labor as a control). I'll have to think about it more.
On some of the smaller comments:
$\text{wages}/r_{\text{research}}$ is around 0.28 (maybe you have better data here)
The best data we have is The Information's article that OpenAI spent $700M on salaries and $1000M on research compute in 2024, so the ratio is around $0.7$ (assuming you meant $\frac{wL}{rK}$ instead of $\frac{w}{r}$).
The whole industry is much larger now and elasticity of substitution might not be constant; if so this is worrying because to predict whether there's a software-only singularity we'll need to extrapolate over more orders of magnitude of growth and the human labor -> AI labor transition.
I agree. $\sigma$ might not be constant over time, which is a problem both for estimation/extrapolation and for predicting what an intelligence explosion might look like. For example, if $\sigma$ falls over time, then we may have a foom for a bit, until $\sigma$ falls below 1 and the foom fizzles. I've been thinking about writing something up about this.
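For intuition, here is a hedged simulation sketch of that scenario; the CES form, the law of motion $\dot{A} = F^{\lambda}A^{1-\beta}$, the path for $\sigma(t)$, and every parameter value are my own made-up assumptions, not the paper's model. Growth holds up while $\sigma > 1$, then decays once $\sigma$ drops below 1 and fixed human labor becomes the bottleneck:

```python
import numpy as np

# Sketch: what a falling elasticity of substitution could do to growth.
# F = CES(L, A; sigma(t)) with human labor L fixed; dA/dt = F^lam * A^(1-beta).
lam, beta, L = 0.5, 0.35, 1.0
A, dt, T = 1.0, 1e-3, 20.0

for step in range(int(T / dt)):
    t = step * dt
    sigma = 2.0 * np.exp(-0.1 * t)       # drifts down, crossing 1 around t ~ 6.9
    rho = 1.0 - 1.0 / sigma
    if abs(rho) < 1e-9:                  # Cobb-Douglas limit at sigma = 1
        F = np.sqrt(L * A)
    else:
        F = (0.5 * L**rho + 0.5 * A**rho) ** (1.0 / rho)
    growth = F**lam * A ** (-beta)       # growth rate of A
    A += dt * A * growth
    if step % int(2.0 / dt) == 0:
        print(f"t={t:5.1f}  sigma={sigma:.2f}  growth rate={growth:.3f}")
```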
Are you planning follow-up work, or is there other economic data we could theoretically collect that could give us higher confidence estimates?
Yes, although we haven't decided yet what is most useful to follow up on. Very short-term, there is trying to accommodate non-linear pricing. Of course, data on what non-linear pricing looks like would be helpful, e.g., how does Nvidia bulk discount?
We also may try to estimate $\sigma$ under non-linear pricing with the data we have.
Yep. We are treating $L$ as homogeneous (no differentiation in skill, speed, etc.). I'm interested in thinking about quality differentiation a bit more.