Cross-posted from LessWrong
Summary: It seems at least possible that scaling AI systems (broadly construed) could create dangerously powerful agents. I consider methods to discourage groups from massively scaling AI systems with little regard for safety. After examining cultural, regulatory, and technological interventions, I conclude that cultural approaches seem best suited to this goal in the near term.
For the purposes of this post, I am going to lump several things together when I talk about "scale". Some of the highly-scalable inputs to an AI system include:
- Number of parameters
- Training time
- Total memory
- Total compute
- Dataset size
- Total cost
- Number of researchers
- Total research effort
I'm not particularly concerned with the specific way that these resources can be deployed to increase AI capability; rather, what matters is that there are inputs that can be increased arbitrarily in exchange for higher performance.
In other words, once the fundamentally limited inputs to an AI system have been maxed out, further gains will be determined by the infinitely scalable inputs. Things like "better prompting" or "clever architectures" can certainly increase capability, but at some point, the low-hanging fruit will be picked. In order to get higher performance, researchers will have to turn to scaling. What happens when we turn these dials up to 11?
The Scaling Hypothesis
The scaling hypothesis is becoming an increasingly important view in the AI safety community; some posit that "scale is all you need" to create AGI.
Personally, I'm uncertain whether scaling will allow us to create AGI, though the fact that transformer models demonstrate emergent capabilities when scaled is certainly suggestive.
Because it's at least possible that scaling existing transformer models can lead to AGI, we should do something to prepare for the worst-case scenario where the hypothesis is true.
Slowing down scaling
Assuming that the Scaling Hypothesis is true, what does it mean for AI safety? Since we don't have good solutions to the alignment problem yet, it's important to slow down scaling to provide more time for alignment research, outreach, and coordination.
Let's look at three broad interventions that might slow down scaling: cultural, regulatory, and technological.
The cultural approach is to "frown upon" groups that massively scale AI systems with little regard for safety. Cultural norms may seem like a weak method to enforce rules, but a tight-knit research community has significant power over its members. The community can punish bad actors by discouraging new researchers from joining the group, reducing collaborations, or halting the flow of tacit knowledge to the group. Researchers involved in risky work might suffer reputational damage, and unscrupulous companies might see lower investment. Groups with a track record for safety might see a relative increase in applicant quality and receive more support from the AI community.
Regulation could be used to limit the scale of models that companies are allowed to train, withhold funding from risky projects, or restrict the publication of details related to massive scaling. Countries could also enter into international agreements limiting the scale and deployment of large AI models.
It may be possible to steer AI development through targeted technological work. For example, it may be possible to develop training paradigms which work well for small models but do not scale to larger models. Alternatively, it may be possible to create satisfactory AI that makes the development of more sophisticated models superfluous, guiding research towards smaller, safer models. Publishing specific open source software may help shape AI development towards less risky paradigms.
Which approach is best?
While I think that all three approaches should receive attention, cultural approaches seem most viable in the short term. For one, influencing culture is relatively cheap compared to conducting research or lobbying governments. Additionally, culture in a small research field can change quickly, much faster than policy can be changed or new technologies can be developed and deployed.
But most importantly, culture is far more adaptive than the other approaches. For example, if regulators produced a law limiting the total parameter count, researchers might switch to higher-precision floating-point numbers to squeeze more performance out of the same number of parameters. It's extremely hard to craft loophole-free regulation, and legislation is produced too slowly to keep up with developments in AI.
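As a rough illustration of the parameter-count loophole, here is a minimal Python sketch. The cap value and the function names are hypothetical, made up for this example. It only shows that a rule written in terms of parameter count leaves the number of bits stored in the weights unconstrained; it says nothing about how much extra performance higher precision would actually buy.

```python
# Hypothetical illustration: a cap on parameter count does not cap the
# number of bits stored in the weights, because precision is a free variable.

PARAM_CAP = 10_000_000_000  # assumed regulatory cap, made up for this example


def weight_memory_bytes(num_params: int, bits_per_param: int) -> int:
    """Return the memory needed to store the weights alone, in bytes."""
    return num_params * bits_per_param // 8


def complies_with_cap(num_params: int) -> bool:
    """A rule that looks only at parameter count ignores precision entirely."""
    return num_params <= PARAM_CAP


if __name__ == "__main__":
    for bits in (16, 32, 64):
        gib = weight_memory_bytes(PARAM_CAP, bits) / 2**30
        print(
            f"{bits}-bit weights: compliant={complies_with_cap(PARAM_CAP)}, "
            f"weights occupy {gib:.1f} GiB"
        )
```

Every configuration in the loop passes the hypothetical rule, even though the storage devoted to the weights quadruples between 16-bit and 64-bit precision. A regulation would have to name every such free variable in advance to close the gap.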
On the technology side, let's say that you invented AI accelerator hardware that can cheaply train a 10 billion parameter model, but doesn't scale well to 1 trillion parameter models. It's possible that researchers will find a way to ensemble many 10 billion parameter models to get performance equivalent to a 1 trillion parameter model. In general, it can be hard to predict how a particular technology will be used or whether it will achieve certain safety goals.
But culture is much harder to thwart. Bad actors would have to deceive an entire community of savvy researchers (potentially including their own team) and stop potential whistleblowers. This isn't impossible, but the difficulty and the costs of a bad reputation may be prohibitive.
Does slowing scaling help the bad guys?
One counterargument is that slowing scaling might only work on groups that are already concerned about AI risk. This would give unscrupulous actors the upper hand, possibly increasing risk on net.
This is an important point which deserves further consideration. However, my initial guess is that efforts to slow scaling across the field will still slow unscrupulous actors. This is because research in different labs is complementary; slowing scaling at DeepMind would also impede other groups since they rely on each other for insights.
That being said, a uniformly applied scaling slowdown may still create a relative advantage for risky researchers. Cultural approaches are best suited to deal with this problem. If a research group presses on in spite of warnings about the risk, the community can respond by discouraging new researchers from joining the offending group, halting collaborations, and limiting flows of tacit knowledge. This should reduce any advantage that risky groups might enjoy.
Another problem is that these techniques might be used to selectively disadvantage specific groups for reasons unrelated to their safety profile. For example, there is a long history of large companies using government regulation to raise barriers to entry in order to reduce competition. Existing AI companies could lobby for additional safety regulations in order to block new entrants. This is another reason to be hesitant about using regulations to slow AI scaling. Fortunately, it seems less likely that the other approaches can be used to gain an unfair advantage.
It's unclear whether this possibility is enough to outweigh the benefits of slowing scaling, but the design of any of these methods should minimize their potential for abuse.
In addition to direct alignment work, the AI safety community should consider how to slow down AI scaling to buy more time. Of the approaches listed here, developing cultural norms against reckless scaling is the easiest, fastest, and most adaptive solution.
Future work should specify how to build consensus amongst the broader AI community via outreach to companies, scientists, and industry leaders. Widespread cultural norms against unsafe research practices can slow bad actors, foster coordination, and slow the development of AGI.
I also implore AI researchers to frown upon massive, reckless scaling of AI systems. Public discussion of safety concerns can help to punish bad actors and establish expectations for good practices in AI research.
This is not to say that scaling is as simple as changing the number of parameters in a Python script. Continual scaling requires new techniques and increasingly specialized researchers. Steady Moore's-law-like improvements may seem automatic from the outside, but constant growth typically requires exponentially increasing resources in order to combat the loss of low-hanging fruit.
For the rest of this post, I'm going to ignore the possibility of stopping scaling entirely since it seems unrealistic. If you like, you can think of stopping scaling entirely as a specific type of slowdown. In general, I am against such pivotal acts, but that's a discussion for another time.
Though it may seem impossible to prevent risky research from occurring in private, there is some evidence that requirements of secrecy halt flows of tacit knowledge and limit the development of dangerous technologies (more on this in a future post).
In the extreme, this policy would create an independent research group, ostracized from the community (though ideally major steps would be taken to avoid this outcome). At this point, cultural incentives are unlikely to have an effect. Nothing can stop completely independent actors, but the policies suggested here can still slow them.
Regardless, attempts to slow AI scaling will probably not make the situation worse. Companies likely already use similar techniques to gain an advantage and it is unclear that independent efforts to slow scaling would give them a larger advantage. Even if these approaches were partially abused in order to gain an unfair advantage, they would still accomplish the goal of slowing down AI research.