
WillPearson

117 karma · Joined

Comments (91)

My expectation is that software without humans in the loop evaluating it will fall prey to Goodhart's law and overfit to the metrics/measures it is given.
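As a toy sketch of what I mean (all the functions and numbers here are made up for illustration): if the optimisation target is a proxy that only roughly tracks what we care about, unsupervised optimisation settles where the proxy peaks, not where quality does.

```python
# Toy Goodhart demo: hill-climb a proxy metric with no human in the
# loop and watch the true objective end up worse off. Illustrative only.

def true_quality(x):
    # What we actually care about: peaks at x = 1.
    return -(x - 1) ** 2

def proxy_metric(x):
    # The measured metric: tracks quality, plus an exploitable term.
    return true_quality(x) + 2 * x

x = 0.0
for _ in range(50):
    # Naive finite-difference hill-climbing on the proxy.
    x += 0.1 * (proxy_metric(x + 0.01) - proxy_metric(x)) / 0.01

print(f"x = {x:.2f}")                       # ~2.0, where the proxy peaks
print(f"proxy   = {proxy_metric(x):.2f}")   # ~3.0, looks great
print(f"quality = {true_quality(x):.2f}")   # ~-1.0, well below its best of 0
```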

My blog might be of interest to people.

Here is a blog post, also written with Claude's help, that I hope will engage home-scale experimenters.

I appreciate your views on space and AI; working with ML systems in that way might be useful.

But I am drawn to base reality a lot because of threats to it from things like gamma-ray bursts or aliens. These can only be represented probabilistically in simulations, because they are out of context; the branching tree of possibilities explodes.

I agree that we aren't ready for agents, but I would like to try to build non-static intelligence augmentation as slowly as possible: starting with building systems to control and shape such augmentation, tested out with static ML systems; then testing them with people; then testing them inside simulations, etc.

I find your view of things interesting. One question: how do you deal with democracy when people might be inhabiting worlds unlike the real one and have forgotten that the real one exists?

I think static AI models lack corrigibility: humans can't give them instructions on how to change how they act, so they might be a dead end in terms of day-to-day usefulness. They might be good as scientists, though, since they can be detached from human needs. So they're worth exploring.

There is a concept of utility, but I expect these systems to be mainly user-focused rather than agents in their own right, so utility is based on user feedback about the system. Ideally, the system would be an extension of the feedback systems within humans.

There is also karma, which is separate from utility: it is given by one ML system to another when that system has helped or hindered it in a non-economic fashion.
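A minimal sketch of the bookkeeping I have in mind (the names and numbers are just illustrative assumptions, not a spec): utility only ever comes from user feedback, while karma is a separate ledger that the ML systems credit or debit between themselves.

```python
from collections import defaultdict

utility = defaultdict(float)  # system -> accumulated user feedback
karma = defaultdict(float)    # system -> net karma from other systems

def user_feedback(system, score):
    """A human rates the system; the only source of utility."""
    utility[system] += score

def give_karma(giver, receiver, amount):
    """One ML system credits (or debits, if negative) another for
    helping or hindering it in a non-economic fashion."""
    karma[receiver] += amount

# Example: a user rewards 'planner'; 'planner' credits 'retriever'
# for help and debits 'spammer' for getting in the way.
user_feedback("planner", +1.0)
give_karma("planner", "retriever", +0.5)
give_karma("planner", "spammer", -0.5)

print(dict(utility))  # {'planner': 1.0}
print(dict(karma))    # {'retriever': 0.5, 'spammer': -0.5}
```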

I've been thinking that AGI will require a freely evolving multi-agent approach. So I want to try out multi-agent control patterns on ML models without the evolution, which should prove them out in a less dangerous setting. The control patterns I have in mind are things like karma- and market-based alignment patterns; there is more information on my blog.
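To make "market-based alignment pattern" slightly more concrete, here is one possible shape (entirely my own illustrative assumption about the mechanism): systems earn budget from user feedback and spend it bidding for shared resources, so the systems users actually find useful end up steering more of the compute.

```python
# Sketch of a market pattern: budgets are earned from user feedback,
# and systems bid them for shared compute slots. Numbers illustrative.

budgets = {"planner": 3.0, "retriever": 1.0, "spammer": 2.0}

def allocate(bids, slots):
    """Sealed-bid allocation: highest affordable bids win, and
    winners pay their bid out of their user-earned budget."""
    affordable = {s: b for s, b in bids.items() if b <= budgets[s]}
    winners = sorted(affordable, key=affordable.get, reverse=True)[:slots]
    for s in winners:
        budgets[s] -= affordable[s]
    return winners

print(allocate({"planner": 2.0, "retriever": 0.5, "spammer": 5.0}, 2))
# -> ['planner', 'retriever']: 'spammer' bid more than it had earned.
```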

Does anyone have recommendations for people I should be following for discussion of structural AI risk and the possible implications of post-current-deep-learning AGI systems?

I suppose I'm interested in questions around what counts as an existential threat. How bad would a nuclear winter have to be to cause the collapse of society, and how easily could society be rebuilt afterwards? Both questions require robust models of agriculture in extreme situations, and models of energy flows in economies where strategic elements might have been destroyed (to know how easy rebuilding would be). Since pandemics and climate change also have societal collapse as a threat, the models needed would apply to them too (they might trigger nuclear exchange, or at least loss of control over nuclear reactors, depending on what societal collapse looks like).

The National Risk Register is the closest thing I found in the public domain. As far as I could tell, it doesn't include things like large meteorites.

It's true that all data and algorithms are biased in some way. But I suppose the question is: is that bias smaller than what you get from human experts, who often have a pay cheque that might lead them to think in a certain way?

I'd imagine that any such system would not be trusted implicitly to start with, but would have to build up a reputation for providing useful predictions.

In terms of implementation, I'm imagining people building complex models of the world, as in decision making under deep uncertainty, with the AI mainly providing a user-friendly interface for asking questions about the model.
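As a sketch of the kind of query I mean (the model, options, and payoffs are all made up, and minimax regret is just one standard robustness criterion from the DMDU literature): the user asks "which option holds up best across futures we can't assign probabilities to?", and the AI's job is mainly to translate that into runs over a scenario ensemble.

```python
import random

random.seed(0)
options = ["stockpile", "diversify", "do_nothing"]

def payoff(option, future):
    # Made-up payoffs as a function of an uncertain 'shock' in [0, 1].
    if option == "stockpile":
        return 10 - 8 * future["shock"]   # cheap insurance, some drag
    if option == "diversify":
        return 9 - 2 * future["shock"]    # robust across shocks
    return 12 - 20 * future["shock"]      # great only if nothing goes wrong

# Deep uncertainty: sample an ensemble of futures instead of one forecast.
futures = [{"shock": random.uniform(0, 1)} for _ in range(1000)]

def max_regret(option):
    # Worst-case gap between this option and the best option, per future.
    return max(
        max(payoff(o, f) for o in options) - payoff(option, f)
        for f in futures
    )

# The AI interface would just turn the user's question into a query like:
print(min(options, key=max_regret))  # -> 'diversify' on these numbers
```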
