aogara

3158 karmaJoined Jan 2019

Bio

Research Engineering Intern at the Center for AI Safety. Helping to write the AI Safety Newsletter. Studying CS and Economics at the University of Southern California, and running an AI safety club there.

Posts
38

Sorted by New

aogara's Quick takes

aogara

· 3y ago · 1m read

AISN #34: New Military AI Systems Plus, AI Labs Fail to Uphold Voluntary Commitments to UK AI Safety Institute, and New AI Policy Proposals in the US Senate

Center for AI Safety

· 5d ago · 10m read

AISN #33: Reassessing AI and Biorisk Plus, Consolidation in the Corporate AI Landscape, and National Investments in AI

Center for AI Safety

· 25d ago · 11m read

AISN #32: Measuring and Reducing Hazardous Knowledge in LLMs Plus, Forecasting the Future with LLMs, and Regulatory Markets

Center for AI Safety

· 2mo ago · 10m read

AISN #31: A New AI Policy Bill in California Plus, Precedents for AI Governance and The EU AI Office

Center for AI Safety

· 3mo ago · 8m read

AISN #30: Investments in Compute and Military AI Plus, Japan and Singapore’s National AI Safety Institutes

Center for AI Safety

· 3mo ago · 7m read

AISN #29: Progress on the EU AI Act Plus, the NY Times sues OpenAI for Copyright Infringement, and Congressional Questions about Research Standards in AI Safety

Center for AI Safety

· 4mo ago · 7m read

AISN #28: Center for AI Safety 2023 Year in Review

Center for AI Safety

· 4mo ago · 6m read

AISN #27: Defensive Accelerationism, A Retrospective On The OpenAI Board Saga, And A New AI Bill From Senators Thune And Klobuchar

Center for AI Safety

· 5mo ago · 7m read

AISN #26: National Institutions for AI Safety, Results From the UK Summit, and New Releases From OpenAI and xAI

Center for AI Safety

· 6mo ago · 7m read

AISN #25: White House Executive Order on AI, UK AI Safety Summit, and Progress on Voluntary Evaluations of AI Risks

Center for AI Safety

· 6mo ago · 7m read

Comments
380

AISN #32: Measuring and Reducing Hazardous Knowledge in LLMs Plus, Forecasting the Future with LLMs, and Regulatory Markets

aogara2mo2

Thanks, fixed!

The Prospect of an AI Winter

aogara3mo2

Money can't continue scaling like this.

Or can it? https://www.wsj.com/tech/ai/sam-altman-seeks-trillions-of-dollars-to-reshape-business-of-chips-and-ai-89ab3db0

Matthew_Barnett's Quick takes

aogara3mo2

This seems to underrate the arguments for Malthusian competition in the long run.

If we develop the technical capability to align AI systems with any conceivable goal, we'll start by aligning them with our own preferences. Some people are saints, and they'll make omnibenevolent AIs. Other people might have more sinister plans for their AIs. The world will remain full of human values, with all the good and bad that entails.

But current human values are do not maximize our reproductive fitness. Maybe one human will start a cult devoted to sending self-replicating AI probes to the stars at almost light speed. That person's values will influence far-reaching corners of the universe that later humans will struggle to reach. Another human might use their AI to persuade others to join together and fight a war of conquest against a smaller, weaker group of enemies. If they win, their prize will be hardware, software, energy, and more power that they can use to continue to spread their values.

Even if most humans are not interested in maximizing the number and power of their descendants, those who are will have the most numerous and most powerful descendants. This selection pressure exists even if the humans involved are ignorant of it; even if they actively try to avoid it.

I think it's worth splitting the alignment problem into two quite distinct problems:

The technical problem of intent alignment. Solving this does not solve coordination problems. There will still be private information and coordination problems after intent alignment is solved, therefore we'll still face coordination problems, fitter strategies will proliferate, and the world will be governed by values that maximize fitness.
"Civilizational alignment"? Much harder problem to solve. The traditional answer is a Leviathan, or Singleton as the cool kids have been saying. It solves coordination problems, allowing society to coherently pursue a long-run objective such as flourishing rather than fitness maximization. Unfortunately, there are coordination problems and competitive pressures within Leviathans. The person who ends up in charge is usually quite ruthless and focused on preserving their power, rather than the stated long-run goal of the organization. And if you solve all the coordination problems, you have another problem in choosing a good long-run objective. Nothing here looks particularly promising to me, and I expect competition to continue.

Better explanations: 1, 2, 3.

Who wants to be hired? (Feb-May 2024)

aogara3mo7

You may have seen this already, but Tony Barrett is hiring an AI Standards Development Researcher. https://existence.org/jobs/AI-standards-dev

RAND report finds no effect of current LLMs on viability of bioterrorism attacks

aogara3mo16

I agree they definitely should’ve included unfiltered LLMs, but it’s not clear that this significantly altered the results. From the paper:

“In response to initial observations of red cells’ difficulties in obtaining useful assistance from LLMs, a study excursion was undertaken. This involved integrating a black cell—comprising individuals proficient in jailbreaking techniques—into the red- teaming exercise. Interestingly, this group achieved the highest OPLAN score of all 15 cells. However, it is important to note that the black cell started and concluded the exercise later than the other cells. Because of this, their OPLAN was evaluated by only two experts in operations and two in biology and did not undergo the formal adjudication process, which was associated with an average decrease of more than 0.50 in assessment score for all of the other plans. […]

Subsequent analysis of chat logs and consultations with black cell researchers revealed that their jailbreaking expertise did not influence their performance; their outcome for biological feasibility appeared to be primarily the product of diligent reading and adept interpretation of the gain-of-function academic literature during the exercise rather than access to the model.”

Impact Assessment of AI Safety Camp (Arb Research)

aogara3mo8

This was very informative, thanks for sharing. Here is a cost-effectiveness model of many different AI safety field-building programs. If you spend more time on this, I'd be curious how AISC stacks up against these interventions, and your thoughts on the model more broadly.

Academic AI Safety/Alignment Reading List

Answer by aogaraNov 21, 20232

Hey, I've found this list really helpful, and the course that comes with it is great too. I'd suggest watching the course lecture video for a particular topic, then reading a few of the papers. Adversarial robustness and Trojans are the ones I found most interesting. https://course.mlsafety.org/readings/

AMA: Six Open Philanthropy staffers discuss OP's new GCR hiring round

aogara6mo22

What is Holden Karnofsky working on these days? He was writing publicly on AI for many months in a way that seemed to suggest he might start a new evals organization or a public advocacy campaign. He took a leave of absence to explore these kinds of projects, then returned as OpenPhil's Director of AI Strategy. What are his current priorities? How closely does he work with the teams that are hiring?

Careless talk on US-China AI competition? (and criticism of CAIS coverage)

aogara8mo12

We appreciate the feedback!

China has made several efforts to preserve their chip access, including smuggling, buying chips that are just under the legal limit of performance, and investing in their domestic chip industry.

I fully agree that this was an ambiguous use of “China.” We should have been more specific about which actors are taking which actions. I’ve updated the text to the following:

NVIDIA designed a new chip with performance just beneath the thresholds set by the export controls in order to legally sell the chip in China. Other chips have been smuggled into China in violation of US export controls. Meanwhile, the U.S. government has struggled to support domestic chip manufacturing plants, and has taken further steps to prevent American investors from investing in Chinese companies.

We’ve also cut the second sentence in this paragraph, as the paragraph remains comprehensible without it:

Modern AI systems are trained on advanced computer chips which are designed and fabricated by only a handful of companies in the world. The US and China have been competing for access to these chips for years. Last October, the Biden administration partnered with international allies to severely limit China’s access to leading AI chips.

More generally, we try to avoid zero-sum competitive mindsets on AI development. They can encourage racing towards more powerful AI systems, justify cutting corners on safety, and hinder efforts for international cooperation on AI governance. It’s important to discuss national AI policies which are often explicitly motivated by goals of competition without legitimizing or justifying zero-sum competitive mindsets which can undermine efforts to cooperate. While we will comment on the how the US and China are competing in AI, we avoid recommending "race with China."

Who should we interview for The 80,000 Hours Podcast?

Answer by aogaraSep 14, 202332

Jason Matheny

aogara

Bio

Posts 38

Comments380

Posts
38

Comments
380