
Cross-posted from LessWrong

Epistemic effort: ~26 hours, multiple drafts

Epistemic status: Moderately confident in the framing and arguments. 

-

What if we just... put frontier AI systems on batteries? This elementary idea might be more powerful than it appears. 

Introduction – A Proactive Control Philosophy 

Coming into the AI safety field anew, after only briefly dipping my toes in it over a decade ago (I received my introduction through LessWrong blog posts and Max Tegmark's Life 3.0), one thing in particular strikes me: people are still focusing on reactive control solutions.

Like, here we are, still talking about kill switches and virtual tripwires instead of proactive, fundamental safety design. Proactive hardware solutions in particular seem to generate relatively little excitement. This seems odd, even granting that few AI safety people are hardware experts; validated, existing work such as Ross Anderson's Security Engineering and Charles Perrow's Normal Accidents would help mitigate this gap*.

*Disclosure: I have only very briefly engaged with this material myself. I discovered these seminal works because I already resonate with, and agree with, the core ideas the authors put forward. But it is also very much possible to read them first and get inspired afterwards. I recommend at least skimming them.

For the record, I have noticed my confusion and looked for answers. One explanation is fairly obvious: since developing capability is currently the industry's top priority, there is strong resistance to implementing inherently limiting measures that slow down progress.

For example, in 2023 Stuart Russell, a well-respected thought leader in the field, advocated in the US Senate for a complete reversal of current computer security paradigms. To paraphrase, he said that "this means switching from (A) machines that run anything unless it's known to be malicious to (B) machines that run nothing unless it's known to be safe". Such a foundational – and costly – shift in how the industry operates is not something industry leaders would seriously consider even today.

However, as danger becomes more obvious, sentiments may shift. Furthermore, and more importantly, some proactive measures would not necessarily impede capability research much at all, if managed well. I would argue that batteries could be one such measure. In fact, it may be a very good one for mitigating certain threat vectors (discussed in more detail later).

The idea also achieves, at least in some applications, several of the key principles I value when it comes to AI safety. They are:

  1. Human in the loop
  2. Dead man’s switch mechanism (automatic fallback safety, vs. reactive kill switches*)
  3. Inherent to architecture
  4. Hardware-based implementation

    For research and adoption purposes, also:

  5. Reversibility
  6. Low systemic damage

*This aligns with established fail-safe design principles from nuclear safety and aviation, where systems default to safe states rather than requiring active intervention.

With this in mind, the battery idea seems so obvious and elegant that I feel it deserves some attention. To be clear, this is defense in depth, not a comprehensive control solution. And please note, I will not discuss AI alignment here, only control ideas. 

That said, the battery idea seems to be relatively low-hanging fruit, with applications that potentially could be implemented rapidly. Therefore, I will follow Cunningham’s law and offer some of my thoughts on the matter. And if I am very wrong in important ways, I will be glad I did not attempt to draft something akin to a full-on research article right away.

Now, without further ado, here is… 

AI on Batteries

Imagine if we could have all frontier AI models run on batteries. The safety features seem immediate and obvious. Here are some of my favorites:

  • Hard runtime caps - even the largest battery packs provide finite energy, creating natural session limits
  • Performance degradation curves - as batteries drain, computational capacity automatically reduces
  • Predictable shutdown points - complete power loss provides guaranteed halt conditions
  • Physical charging bottlenecks - the AI can't simply "request more power"; it needs human intervention to recharge
  • Improving isolation/sandboxing opportunities - easier to physically secure battery-powered systems vs. grid-connected ones 

Depending on application, there are many others, even though specific applications also come with specific drawbacks. Either way, the benefits are: limited inherent capability, strategic predictability (the periodic charging requirements also serve as natural checkpoints for safety reviews), and automatic reset points, if batteries are left to expire.
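To make the first few bullets above concrete, here is a minimal sketch of what a battery-gated serving loop could look like. Everything in it is an assumption for illustration: the read_state_of_charge() placeholder standing in for a real battery management system query, the thresholds, and the crude throttling scheme.

```python
import time

# Illustrative thresholds -- assumptions, not recommendations.
DEGRADE_BELOW = 0.40   # start throttling when state of charge drops below 40%
SHUTDOWN_BELOW = 0.10  # hard stop below 10%

def read_state_of_charge() -> float:
    """Placeholder for a battery management system query (returns 0.0-1.0)."""
    raise NotImplementedError("wire this to the actual BMS")

def serve_requests(handle_request, next_request):
    """Serve inference requests only while the battery allows it.

    This reflects three of the bullets above: a hard runtime cap (the loop
    ends when charge is exhausted), a degradation curve (throttling as the
    battery drains), and a predictable shutdown point.
    """
    while True:
        soc = read_state_of_charge()
        if soc < SHUTDOWN_BELOW:
            print("Battery exhausted - halting until a human recharges it.")
            break
        if soc < DEGRADE_BELOW:
            # Crude degradation curve: wait longer between requests as charge drops.
            time.sleep((DEGRADE_BELOW - soc) * 10)
        handle_request(next_request())
```

In reality the hard cap is enforced by physics rather than by this software; the monitoring layer only makes the degradation and shutdown predictable and observable.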

This also has the psychological benefit of making AI feel more like a "device" rather than an omnipresent entity. 

The threat vectors it combats are several, but they do depend on the application. They are: 
a) Runaway capability & rogue behaviour during training or in new environments. This would mean that the system suddenly starts acting in unexpected ways, potentially even spiraling out of control. If the AI is running off-grid, it will eventually stop before it can cause too much damage. 

b) Resource exhaustion attacks are similarly limited. A resource exhaustion attack means that an AI tries to consume unbounded resources to scale itself, whether out of naivety or hostility. If the AI depends on batteries to scale, this is not very doable.

c) Insider negligence. Batteries reduce the reliance on operators and researchers as safety bottlenecks when they forget or neglect safety protocols, or lose control of their test runs.

d) Gradual capability creep is indirectly prevented via regular charging/restart cycles, which create natural evaluation checkpoints.

e) Strategic unpredictability. Grid-run AI models require constant vigilance and can “strike” at any moment. Defined recharge times create better strategic predictability.

Some vectors left weak/unimproved are: 

  • Most adversarial attacks
  • Misalignment & Deception
  • Superintelligence (SI) escape/proliferation scenarios
    • An AI with sufficient software access can copy itself elsewhere
    • An SI can likely manipulate an operator to let it escape anyway

Still, all in all the idea seems very promising. So, what’s the catch? 

The catch

Famously, current frontier models and related infrastructure require A LOT more power than what is available with current high-end batteries. GPT-3 175B (2020) was likely trained on something like 1,287 MWh (roughly 38 MWh per day, or a ~1.6 MW continuous draw). BLOOM-176B (2022) was trained on approximately 433 MWh. PaLM 540B (2022) can be estimated to have required well over 2,000 MWh.

In comparison, datacenter UPS systems store something like 0.5-1.5 MWh. A Tesla Megapack 2 XL provides circa 3.9 MWh (lasting about 4 hours at full output). A large marine ESS on a ferry provides around 10 MWh. Constantly buying enough batteries would be quite cumbersome and very expensive. It would also not be very environmentally friendly.
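To make the mismatch concrete, here is a quick back-of-the-envelope calculation using the figures quoted above (the numbers are rough public estimates, and the comparison ignores charging logistics entirely):

```python
# Rough arithmetic based on the figures quoted above.
gpt3_training_mwh = 1287   # estimated total training energy, GPT-3 175B
megapack_mwh = 3.9         # quoted capacity of one Tesla Megapack 2 XL

pack_discharges_needed = gpt3_training_mwh / megapack_mwh
print(f"Megapack-equivalents to cover GPT-3 training energy: {pack_discharges_needed:.0f}")
# -> roughly 330 full discharges, i.e. either a very large battery park
#    or hundreds of recharge cycles spread across the training run.
```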

Furthermore, even if we soon had batteries powerful enough for existing models, they may scale poorly for systems that need substantial compute. A battery paradigm might trigger the development of more efficient algorithms rather than actually constraining capability well. (Though that would at least stop exacerbating the unfolding global climate disaster.)

All of that said, we should not give up on the battery idea so lightly.

First of all, these limitations do not apply to small, experimental models or weak pocket-clones of frontier systems. Secondly, with a vastly limited number of users, already-trained AI models would consume much less power than the frontier models currently do.

But even for existing frontier models, and likely future ones, there are at least two modifications we can attempt. Before we do that, though, let's look closer at company incentives, to consider why anyone would consider these measures in the first place.

Industry incentives

So far I have briefly discussed both some strategic and technical aspects of the battery idea. But why would companies be willing to consider any of these ideas to begin with, if the core idea is limiting capability? Especially when there are real costs involved? 

Well, one of the key incentives of leading companies, even if they are not inherently safety minded, is to stay on top of regulation. If they achieve actual safety at the same time, without impeding progress too much, great. If the price is relatively low, perfect. 

With this in mind, let’s look at some of the key features of the battery idea:

  • Checks many key safety principles by design
  • Is possible with existing technology
  • Could be relatively fast to implement
  • Doesn’t inherently need to interfere with capability, as long as "everything works", meaning no downtime due to flawed design, or poor adherence to operational protocols

A solution with all these features may seem attractive to companies who want to look good to regulators. Put like this, why wouldn’t companies think about this, at least for a second? They may not show excitement outwards (standard non-commitment), but they may have a think internally. 

The biggest challenge right now is usually cost. But big companies have big budgets, and their teams do have allocated budgets that they need to use. And novel ideas can snowball. There is merit in remembering this. 

Competition dynamics

Right now, the major AI safety challenge is that frontier labs are locked in racing dynamics. But when it comes to safety, I am hopeful that this can change fast and in significant ways, with the help of safety thinkers, funders, and coordinators.

I also believe that stability, and by extension safety, is inherently a competitive edge. Long-term it definitely is, and staying on top of regulation certainly is. As regulation always follows innovation, leaders need to anticipate future demands. Speaking of the AI industry as a whole: if readily implementable solutions have any real value, they should be explored. There is always a price attached to missed opportunity.

Consider the first mover issue. Few companies want to sacrifice their edge by being the first to adopt a new safety measure. But if others do and validate the idea, they also don’t want to be the last ones to miss out. Some industry leaders may even anticipate these things by quietly acquiring IP pre-emptively. 

Sometimes ideas just need to be put out very clearly in front of people in the first place, in order to be worked out and eventually adopted. 

Speaking from personal experience at one of the world’s largest corporations, I can vouch for this. My work regularly requires me to collaborate across continents. I have seen how ideas can spread and take hold across real and perceived borders. Even if the current draft application is immature, and even if key players are against it for political reasons, it can succeed if it has long-term value. Persistent, patient lobbying and skilled coordination are key. 

In truth, ideas morph and merge, often in non-transparent ways, until suddenly whole new infrastructure is adopted. The expectation is that it will one day be profitable. Sometimes, though not always, that’s how technological adoption unfolds at the frontier.

Frontier model modifications

As promised, I will offer two suggestions for how the idea of batteries could be utilized in large, cutting-edge AI models. They are:

  1. Hybrid power source (chargeable power solutions)
  2. Enforce battery reliance on critical components only, not the whole system

Hybrid solutions

The first idea is focused on combining grid power with chargeable battery systems. It would need significant sophistication to work. UPS systems are already used for backup purposes, but they can only offer 5-14 minutes of battery backup during grid transitions.

To build a battery park capable of fully powering a lab like OpenAI would be insanely expensive. It would require sizeable infrastructure comparable with that used for entire cities. For example, Tesla's Hornsdale Power Reserve in Australia provides 150 MW for up to 1 hour, and costs hundreds of millions of dollars. But frontier labs may need 50+ megawatts available 24/7.

What we could do instead is continue to rely on grid power, but extend the reliance on complementary batteries. In other words, we deliberately create a strategic vulnerability. AI systems are sensitive to power fluctuations, so this is technically doable. The idea would be to use batteries for load balancing in a way that makes them indispensable: a critical point of failure. If the load-balancing batteries aren't regularly recharged, the entire system becomes unstable.
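As a rough illustration of that policy, and nothing more, here is how the supervising logic might be expressed. The BatteryBuffer fields and the threshold values are invented for the sketch:

```python
from dataclasses import dataclass

@dataclass
class BatteryBuffer:
    """Illustrative model of the load-balancing battery bank."""
    state_of_charge: float           # 0.0 - 1.0
    hours_since_last_recharge: float

# Illustrative policy values, not engineering recommendations.
MIN_SOC = 0.25
MAX_HOURS_WITHOUT_RECHARGE = 72

def site_may_draw_grid_power(buffer: BatteryBuffer) -> bool:
    """The cluster is only allowed to run while the battery bank that
    smooths its grid power is healthy and recently recharged. If operators
    stop recharging it, the whole site winds down rather than one model."""
    if buffer.state_of_charge < MIN_SOC:
        return False
    if buffer.hours_since_last_recharge > MAX_HOURS_WITHOUT_RECHARGE:
        return False
    return True
```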

Some major weaknesses of this idea are obvious: We may experience unintended failures and the whole infrastructure becomes more vulnerable to external attacks. On the other hand, you can shut down a whole site like this, not just an isolated model. Remove the batteries, and watch the empire crumble.

Batteries for critical components only

We could also consider making subsystem components dependent on batteries. If so, we may need custom hardware. We should focus on single subsystems and on reversibility, meaning we avoid having to insert batteries into many identical components (every tokenizer, every SRAM, every attention controller, etc.) and avoid permanent damage if a battery runs out.

Suggestions:

API-Level Battery Dependencies

Good for deploying models

Critical API Endpoints:

  • Authentication APIs: Battery-powered auth servers – no auth, no access
  • Model Query APIs: The core inference request handlers
  • Rate Limiting APIs: Request throttling and quota management

APIs use far less power than the engines behind them. A laptop-sized battery could do the job, and there would be no interference with the main AI model, which stays fully powered and functional. Overall, this approach creates clean failure modes that don't confuse users or create dangerous edge cases. It essentially makes AI access, rather than AI computation, battery-dependent.
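Here is a minimal sketch of what such a battery-gated gateway could look like, with placeholder functions standing in for the real auth and battery queries (all names and the cutoff value are hypothetical):

```python
def read_gateway_battery_soc() -> float:
    """Placeholder: query the small battery powering the gateway (0.0 - 1.0)."""
    raise NotImplementedError("wire this to the gateway's battery monitor")

def authenticate(request) -> bool:
    """Placeholder for the normal authentication check."""
    raise NotImplementedError

def handle_api_request(request, forward_to_model):
    """Access, not computation, is battery-dependent: if the small battery
    powering the auth/gateway layer is flat, no requests reach the model,
    even though the model itself stays on grid power."""
    if read_gateway_battery_soc() <= 0.05:   # illustrative cutoff
        return {"status": 503, "error": "gateway battery depleted; awaiting recharge"}
    if not authenticate(request):
        return {"status": 401, "error": "unauthorized"}
    return forward_to_model(request)
```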

The cost of such an implementation is probably relatively low, but far from trivial. The costs heavily depend on how often the batteries are recharged and where the API-control sits geographically. For smaller models or niche uses, the cost could probably be reduced to trivial levels, which would remove the resistance from labs to implement it, even if it only partially increases safety. If it is basically cost-free, why not?

Centralized Token Processing Battery Architecture

Good for development and testing scenarios 

Upfront disclosure: When all is said and done, I do not think that a Centralized Token Architecture is a very realistic target for batteries. But it is an illustrative one, both in terms of what makes a good strategic target, and what doesn’t.

How it works: A majority of tokenization/detokenization units (or all of them) would draw from one shared battery system. The battery could be substantial; think car battery or UPS-sized. But it would be separated from the main power. 

In a centrally located system, with dedicated hardware, this is a perfect, non-destructive dead man's switch.

Why Token Processing is a Good Target:

  • Absolutely critical for current systems:
    • Without tokenization, the model can't understand inputs
    • Without detokenization, the model can't produce coherent outputs
  • Clean failure mode: System becomes completely non-functional, not partially broken
  • Relatively low power: Token processing is computationally light-weight

Technical Implementation

To do this, one could use deep-cycle batteries (12V, 100-200Ah). They last days or weeks depending on usage. The system would then require regular charging, but not constant grid connection. This is much cheaper than powering entire inference clusters.
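A rough runtime estimate, assuming a dedicated token-processing unit that draws on the order of tens of watts (my assumption, not a measured figure):

```python
# Back-of-the-envelope runtime for the deep-cycle battery mentioned above.
voltage_v = 12
capacity_ah = 200          # upper end of the 100-200 Ah range
usable_fraction = 0.5      # deep-cycle batteries shouldn't be fully drained
assumed_draw_w = 20        # assumed draw of a dedicated token-processing unit

energy_wh = voltage_v * capacity_ah * usable_fraction
runtime_days = energy_wh / assumed_draw_w / 24
print(f"Roughly {runtime_days:.1f} days between recharges")
# -> about 2.5 days at 20 W; lighter loads stretch this towards weeks.
```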

An ambitious hardware design could utilize custom ASIC chips for token processing with dedicated power rails. A battery management system would include charge-level monitoring, offering graceful degradation: the system gives warnings as the battery depletes.
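A sketch of that monitoring logic, with thresholds chosen purely for illustration and a hypothetical disable_token_rail() call standing in for the actual power-rail control:

```python
def check_battery(soc: float, disable_token_rail) -> None:
    """Escalating warnings as the shared battery depletes, ending in a clean
    power-down of the tokenization rail (no half-working 'zombie' state)."""
    if soc <= 0.05:
        print("Battery critical: powering down the token-processing rail.")
        disable_token_rail()   # model stops cleanly; weights remain untouched
    elif soc <= 0.20:
        print("Battery low: recharge within 24 hours.")
    elif soc <= 0.50:
        print("Battery at half charge: schedule a recharge.")
```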

This is a clean shutdown. The model becomes completely non-functional, not merely confused. We avoid dangerous "zombie" states where model half-works.

This is also much simpler than distributed battery systems. The setup is easy to monitor and maintain centrally. It creates clear operational requirements. Most importantly: it prevents indefinite autonomous operation. 

Security and Control

Physical access will be required for battery maintenance. This can't be easily bypassed remotely. The setup creates natural "check-in" requirements for operators, and it is harder to run the system covertly when it needs regular maintenance.

Perfect Reversibility

No Permanent Damage. Token processing units would simply power down, not break. All model weights and parameters remain intact in main memory/storage. The system architecture remains unchanged; it is just a power rail that goes offline. By replacing or recharging the battery, you immediately restore full functionality.

The AI model itself suffers no degradation; there is no data corruption or parameter drift. Training doesn't need to be repeated and all software and configurations remain unchanged.

Cost-Effectiveness (the flip side)

Token processing is a small fraction of total compute power. The battery system would be orders of magnitude cheaper than the power systems running full inference (the main computation done during deployment), so maintenance costs are relatively manageable. But still expensive.

I am not an expert, but I assume additional costs of hundreds of thousands of dollars for frontier labs. Making full-scale hardware changes to very expensive systems is costly. And that is before designing any heavily customized hardware. Implementation costs of fully custom hardware (think custom chips) are likely prohibitive, even if this solution would guarantee safety across most threat vectors. Which it does not.

Why Tokenization is not a Great target:

The core issue is that tokenization is ultimately just computation, and computation tends to be fungible in modern systems. In principle, any hardware could run software tokenization. This makes the approach weak to adversarial attacks. Furthermore, modern AI systems also run across multiple datacenters, using cloud solutions. This makes the idea less implementable ad hoc.

One mitigation is to make tokenization cryptographically tied to battery-powered hardware (tokens signed by secure hardware) and/or to integrate battery dependency deeper into the model architecture itself. But this increases complexity and generates new weak points. Because of these challenges, this measure is not necessarily easily implementable in existing frontier labs’ setup. It may however be quite doable if a lab starts over and develops a model architecture de novo.
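One way to picture the cryptographic tie-in: the battery-powered tokenization unit holds a secret key and signs every token batch, and downstream hardware refuses unsigned batches. A minimal sketch using Python's standard library (in reality the key would live in secure hardware, not in a software variable):

```python
import hashlib
import hmac
import json

# Secret key held only inside the battery-powered tokenization unit.
SECRET_KEY = b"replace-with-hardware-protected-key"

def sign_token_batch(token_ids: list) -> dict:
    """Run inside the battery-powered unit: tokenize, then sign the batch."""
    payload = json.dumps(token_ids).encode()
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return {"tokens": token_ids, "tag": tag}

def accept_token_batch(batch: dict) -> bool:
    """Run downstream: accelerators only process batches carrying a valid
    signature, so software-only tokenization (which lacks the key) is rejected."""
    payload = json.dumps(batch["tokens"]).encode()
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, batch["tag"])
```

The obvious caveat, as noted above, is that this adds complexity and new attack surface of its own; it raises the cost of bypassing the battery-powered unit rather than making bypass impossible.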

Additional suggestions

Without spending too much time on these, here are some additional suggestions that now should make more sense.

Google's Tensor Processing Units

Google develops its own custom Tensor Processing Units (TPUs). Adjusting these to fit battery dependency, by targeting inter-chip communication, could be cheaper than targeting tokenization while having similar impact.

Although the costs are still high in real terms, as a percentage of the actual costs of developing and implementing TPUs (counted in hundreds of millions to billions), they are minor. If the safety advantage is substantial, and safety focus becomes critical, this could turn into a viable target.

Clocks

Modern computing hardware depends on high-fidelity clocks. Built-in clocks in processing units are absolutely necessary for their function. Due to this, there is high redundancy of clocks in computing infrastructure, making this a less attractive target. Without targeted engineering, there are also several bypass mechanisms available. However, clocks are also small, simple and cheap. This makes at least some clocks conceptually feasible targets. 

Building on the section above, my suggestion - based on some admittedly brief research - would be to focus on interconnect clocks (such as those found in NVIDIA's fourth-generation NVLink interconnect used in H100 systems). These are critical for multi-GPU coordination. They are the components that enable multiple GPUs to scale, by providing exact synchronization of data transfer.

Shutting down even a few of these could significantly reduce capacity, by effectively reducing a multi-GPU unit's capability to that of a single GPU. This will not stop the AI altogether, or even necessarily remove any of its operations, but it will reduce its actual thinking power. An industry actor like OpenAI may favour this "smooth" approach, for example when testing and releasing new models intended for long-term use. In dangerous edge cases the AI loses its edge and effectively gets dumber, until the battery is replaced. But it remains operational throughout.
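A rough illustration of the capacity drop, assuming near-linear multi-GPU scaling (an idealization; real scaling efficiency varies by workload, so the numbers are only indicative):

```python
# Rough illustration of the "smooth" degradation described above.
gpus_in_node = 8
scaling_efficiency = 0.9   # assumed fraction of ideal multi-GPU speedup

with_interconnect = 1 + (gpus_in_node - 1) * scaling_efficiency   # ~7.3x a single GPU
without_interconnect = 1.0   # GPUs can no longer coordinate; effectively one GPU

print(f"Capacity drops to ~{without_interconnect / with_interconnect:.0%} of normal "
      "when the interconnect clocks lose battery power")
# -> roughly 14% of normal throughput, while the system stays operational.
```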

Wrapping up

Putting batteries in AIs seems simple on the surface. It is somewhat more complicated once you start thinking about it technically and logistically. It is also a bit more expensive than it seems, just like absolutely everything else involving power-hungry, high-tech AI models. That said, it remains a defense-in-depth measure that can actually be implemented, and tested, today. If there is the will to do so.

The idea of a battery-run AI also fulfills several key safety principles. While it doesn't combat all threat vectors, it could be quite effective against some threats that exist today. It also highlights the idea of deploying dead man's switch solutions over mere kill switch solutions, something I am particularly in favour of.

It is also possible to combine and stack these ideas with additional solutions like kill switches, decentralized control, and more, in various tactical and strategic ways. Such ideas are worth exploring in separate texts.

The big idea is to view battery applications as defensive tools in a diverse toolkit, rather than a fix-all solution. As emphasized in cybersecurity (e.g., Bruce Schneier) and AI safety literature (Amodei et al.'s Concrete Problems in AI Safety), no single measure suffices - but this adds another layer to existing proposals.

To put it into metaphor: For the threats that currently exist in the real world, you don’t need to look for a silver bullet if you have enough regular firepower. The werewolf doesn’t actually exist yet. 

...But if it eventually will exist, there is also no harm done in forging the gun, while looking for the silver ore.

Beliefs shape decisions, actions and outcomes

The single greatest benefit of raising the idea of batteries may turn out to be psychological. The idea of putting batteries into AIs reminds us that we are ultimately in control – for now. We can build inherently safer AI today, if we want to.

We can also stop feeding AI models data and energy, once they are built. Consider once again how this relates to limited power sources: If we don’t power our little monsters, they won’t stay alive. Unlike Frankenstein’s monster, our AI beasts are currently totally dependent on us and our resources. A single lightning strike won’t be enough for them to wake up and walk away. 

Let’s try and keep it that way.

*

Disclaimer – Scope & suggested reading: 

This analysis focuses on practical implementation rather than comprehensive literature review. I wrote this piece relatively quickly, in isolation from other work. Therefore I am omitting suggested further reading beyond sources listed in-text. However, readers interested in related work on AI control theory, hardware security, and power systems management may find valuable connections to explore.

This was a serious attempt to introduce the idea, but this is not a research article. Please note that I am not a hardware engineer nor an expert on physical control solutions. I am, at this point, just a contributor of ideas. For those interested in exploring the concept further, much more rigorous analysis is needed in the following areas: threat vectors, implementation costs, and deployment pathways. 

On the hopeful side: I fully expect superior application ideas to arise when someone sits down and spends more than a few days seriously thinking about this.
