I think this is an extremely good post laying out why the public discussion on this topic might seem confusing:


It might be somewhat hard to follow, but this little prediction market is interesting (wouldn't take the numbers too seriously):

In December of last year it seemed plausible to many people online that by now, August 2023, the world would be a very strange, near-apocalyptic place full of inscrutable alien intelligences. Obviously, this is totally wrong. So it could be worth comparing others' "vibes" here to your own thought process to see if you're overestimating the rate of progress.

Paying for GPT-4, if you have the budget, may also help you calibrate. It's magical, but you run into embarrassing failures pretty quickly, and most commentators rarely talk about those.

I thought this was a great point.

There is absolutely nothing hypocritical about an AI researcher who is pursuing either research that’s not on the path to AGI or alignment research to be sounding the alarm about the risks of AGI. Consider if we had one word for “energy researcher” which included all of: a) studying the energy released in chemical reactions, b) developing solar panels, and c) developing methods for fossil fuel extraction. In such a situation, it would not be hypocritical for someone from a) or b) to voice concerns about how c) was leading to climate change — even though they would be an “energy researcher” expressing concerns about “energy research.”

Probably the majority of "AI researchers" are in this position. It's an extremely broad field. Someone can come up with a new probabilistic programming language for Bayesian statistics, or prove some abstruse separation of two classes of MDPs, and wind up publishing at the same conference as the people trying to hook up a giant LLM to real-world actuators.

Right now it seems like AI safety is more of a scene (centered around the Bay Area) than a research community. If you want to attract great scientists and mathematicians (or even mediocre scientists and mathematicians), something even more basic than coming up with good "nerd-sniping" problems is changing this. There are many people who could do good work in technical AI safety, but would not fit in socially with EAs and rationalists.

I'm sympathetic to arguments that formal prepublication peer review is a waste of time. However, I think formalizing and writing up ideas to academic standards, such that they could be submitted for peer review, is definitely a very good use of time. It's not even that hard, and there should be more of it. This would be one step towards making a more bland, boring, professionalized research community where a wider variety of people might want to be involved.

I think a better analogy than "ICBM engineering" might be "all of aeronautical engineering and also some physicists studying fluid dynamics". If you were an anti-nuclear protester and you went and yelled at an engineer who runs wind tunnel simulations to design cars, they would see this as strange and unfair. This is true even though there might be some dual use where aerodynamics simulations are also important for designing nuclear missiles.

I think this would be counterproductive. Most ML researchers don't think the research they personally work on is dangerous, or could be dangerous, or contributes to a research direction that could be dangerous, and most of them actually are right about this. There is all kinds of stuff people work on that's not on the critical path to dangerous AI.

Some basic knowledge of (relatively) old-school probabilistic graphical models, along with basic understanding of variational inference. Not that graphical models are going to be used directly for any SOTA models any more, but the mathematical formalism is still very useful.

For example, understanding how inference on a graphical model works motivates the control-as-inference perspective on reinforcement learning. This is useful for understanding things like decision transformers, or this post on how to interpret RLHF on language models.

It would also be essential background to understand the causal incentives research agenda.

So the same tools come up in two very different places, which I think makes a case for their usefulness.

This material is math-heavy in some sense, and some of the concepts are pretty dense, but it doesn't need many mathematical prerequisites. You have to understand basic probability (how expected values and log likelihoods work, mental comfort going between  and  notation), basic calculus (like "set the derivative = 0 to maximize"), and be comfortable algebraically manipulating sums and products.
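To give a feel for the level of math involved, here is a minimal sketch of the "set the derivative = 0 to maximize" step applied to a log likelihood. The coin-flip (Bernoulli) setup, the function names, and the numbers are my own illustrative assumptions, not taken from any particular textbook or paper.

```python
import math

def bernoulli_log_likelihood(p, heads, tails):
    """Log likelihood of seeing `heads` and `tails` flips from a coin with bias p."""
    return heads * math.log(p) + tails * math.log(1.0 - p)

def closed_form_mle(heads, tails):
    # Setting d/dp [h*log(p) + t*log(1-p)] = h/p - t/(1-p) to zero
    # and solving for p gives p = h / (h + t).
    return heads / (heads + tails)

def grid_search_mle(heads, tails, steps=999):
    # Brute-force check: evaluate the log likelihood on a grid strictly
    # inside (0, 1) and keep the value of p that scores highest.
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return max(grid, key=lambda p: bernoulli_log_likelihood(p, heads, tails))

# Hypothetical data: 7 heads and 3 tails.
heads, tails = 7, 3
p_closed = closed_form_mle(heads, tails)
p_grid = grid_search_mle(heads, tails)
print(p_closed, p_grid)
```

If the algebra in the comment and the brute-force search agreeing with each other feels comfortable, that is roughly the amount of calculus and probability you need going in.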

Each chapter of Russell & Norvig's textbook "Artificial Intelligence: A Modern Approach" ends with historical notes. These are probably sparser than you want, but they are good and cover a very broad array of topics. The 4th edition of the book is decently up to date (for the time being!).

Trying to "do as the virtuous agent would do" (or maybe "do things for the sake of being a good person") seems to be a really common problem for people.

Ruthless consequentialist reasoning totally short-circuits this, which I think is a large part of its appeal. You can be sitting around in this paralyzed fog, agonizing over whether you're "really" good or merely trying to fake being good for subconscious selfish reasons, feeling guilty for not being eudaimonic enough -- and then somebody comes along and says "stop worrying and get up and buy some bednets", and you're free.

I'm not philosophically sophisticated enough to have views on metaethics, but it does seem sometimes that the main value of ethical theories is therapeutic, so different contradictory ethical theories could be best for different people and at different times of life.

I would be inclined to replace “not thinking carefully” with “not thinking formally”. In real life everything tends to have exceptions, and this is most people’s presumption, so they don’t feel a need to reserve language for truly universal claims, which are never meaningful in everyday contexts.

Some people have practice in thinking about formal systems, where truly universal statements are meaningful, and where using different language to draw fine distinctions is important (“always” vs “with probability 1” vs “with high probability” vs “likely”).

Trying to push the second group’s norms on the first group might be tough even if perhaps it would be good.

I think when most people say “unequivocally” and “all”, they almost always mean “still maybe some exceptions” and “almost all”. If you don’t need to make mathematical/logical statements, which most people don’t, then reserving these words to act as universal quantifiers is not very useful. I used to be annoyed by this but I’ve learned to accept it.
