In a preprint from October 13, two researchers from the Ruhr University Bochum and the University of Bonn in Germany argue that while leading AI companies say they will design their most general-purpose AI, often called AGI, according to the most stringent safety principles—adapted from fields like nuclear engineering—the safety techniques they currently apply fall short of those principles.

In particular, the authors note that existing proposals fail to satisfy the principle known as defense in depth, which calls for the application of multiple, redundant, and independent safety mechanisms. The conventional safety methods that companies are known to apply are not independent; in certain problematic scenarios, which are relatively easy to foresee, they all tend to fail simultaneously. 

Many leading AI companies, including Anthropic, Microsoft, and OpenAI, have published safety documents that explicitly state their intention to implement defense in depth in the design of their most advanced AI systems.

In an interview with Foom, the first co-author of the study, Leonard Dung of the Ruhr University Bochum, said it was not surprising that many of the methods for making AI systems safe might fail. Research on making powerful AI systems safe is widely viewed as being at an early stage of maturity.

More surprising to Dung, and also concerning, was that it fell to him and his co-author, academic scholars in philosophy and machine learning, to make what is arguably a foundational contribution to the safety literature of a new branch of industrial engineering.

"There has not been much systematic thinking about what exactly does it mean to take a defense-in-depth approach to safety," said Dung. "The sort of basic way of thinking about risk that you would expect these companies—and policymakers who regulate these companies—to implement has not been implemented."

Continue reading at foommagazine.org, a new science journalism website and project supported by a grant from the EA Funds Long-Term Future Fund.
