finm

Researcher @ Longview Philanthropy
2973 karma · Joined · Working (0-5 years) · Oxford, UK
www.finmoorhouse.com/writing

Bio

I do research at Longview Philanthropy. Previously I was a Research Scholar at FHI and assistant to Toby Ord, and before that I studied philosophy at Cambridge.

I also do a podcast about EA called Hear This Idea.

www.finmoorhouse.com/writing

www.hearthisidea.com

Comments
154

at the sharp end of the intelligence explosion it will be able to do subjective decades of R&D before the second mover gets off the ground, even if the second mover is only hours behind

Where are you getting those numbers from? If by “subjective decades” you mean “decades of work by one smart human researcher”, then I don't think that's enough to secure its position as a singleton.

If you mean “decades of global progress at the global tech frontier”, then that implies the first-mover can fit ~100 million human research-years into a few hours shortly after (presumably) pulling away from the second-mover in a software intelligence explosion, and I'm skeptical of that (for reasons I'm happy to elaborate on).


Do you think octopuses are conscious? I do — they seem smarter than chickens, for instance. But their most recent common ancestor with vertebrates was some kind of simple Precambrian worm with a very basic nervous system.

Either that most recent common ancestor was not phenomenally conscious in the sense we have in mind, in which case consciousness arose more than once in the tree of life. Or else it was conscious, in which case consciousness would seem easy to reproduce (wire together some ~1,000 neurons).


The main question of the debate week is: “On the margin, it is better to work on reducing the chance of our extinction than increasing the value of the future where we survive”.

Where “our” is defined in a footnote as “earth-originating intelligent life (i.e. we aren’t just talking about humans because most of the value in expected futures is probably in worlds where digital minds matter morally and are flourishing)”.

I'm interested to hear from the participants how likely they think extinction of “earth-originating intelligent life” really is this century. Note this is not the same as asking what your p(doom) is, or what likelihood you assign to existential catastrophe this century.

My own take is that literal extinction of intelligent life, as defined, is (much) less than 1% likely to happen this century, and this upper-bounds the overall scale of the “literal extinction” problem (in ITN terms). I think this partly because the definition counts AI survival as non-extinction, and I truly struggle to think of AI-induced catastrophes leaving only charred ruins, without even AI survivors. Other potential causes of extinction, like asteroid impacts, seem unlikely on their own terms. As such, I also suspect that most work directed at existential risk is just already not in practice targeting extinction as defined, though of course it is also not explicitly focusing on “better futures” instead — more like “avoiding potentially terrible global outcomes”.

(This became more a comment than a question… my question is: “thoughts?”)

Nice! Consolidating some comments I had on a draft of this piece, many of them fairly pedantic:

  • Why would value be distributed over some suitable measure of world-states in a way that can be described as a power law specifically (vs some other functional form where the most valuable states are rare)? In particular, shouldn't we think that there is a most valuable world-state (or states)? So at least we need to say it's a power-law distribution with a max value.
  • "Then there is a powerful argument that the expected value of the future is very low."
    • Very low as a fraction of the best futures, but not any lower relative to (e.g.) the world today, or the EV of the future. Indeed the future could be amazing by all existing measures. One framing on what you are saying is that it could be even better than we think, which is not a pessimistic result!
      • Another framing is more like “it's easier than we thought to make amazing-seeming worlds far less valuable than they seem, by making mistakes like e.g. ignoring animal farming”. That is indeed bad news.
    • And of course the decision-relevance of MPL depends on the feasibility of the best futures, not just how they're distributed.
  • One consideration that would support MPL is that value might scale superlinearly with the amount of "optimised matter" — e.g. with brain size. The task of a risk-neutral classical utilitarian can then effectively be boiled down to "maximising the chance of getting ~the most optimised state possible", as long as "~the most optimised state possible" is at all feasible.
  • "If you value pretty much anything (e.g. consciousness, desire satisfaction), there’s likely to be a sharp line in phase space where a tiny change to the property makes an all-or-nothing difference to value." — this is true, but that the [arrangements of matter] → [value] function has discontinuities doesn't imply that the very most valuable states are extremely rare and far more valuable than all the others. So I think it's weak evidence for MPL.
  • Some distinctions which occur to me below. Assuming some measure over states:
    • EV of the world at a state, vs value of a state itself (where EV cares about future states)
      • Note that the [state]→[EV] function should be discontinuous, because the evolution of states over time is discontinuous, because that's how the world works! E.g. changing a vote count by one after a close election can make a big difference to EV.
    • Fragility of value/EV, something like how much value tends to change with small changes in space of states
    • Rarity of value, something like what fraction of all states are >50% as valuable as the most valuable state(s)
    • Unity of value, something like whether all the most valuable states are clumped together in state space, or whether the 'peaks' of the landscape are far apart and separated by valleys of zero or negative value
  • I think it's a true and important point that people currently converge on states they agree are high-EV, because the option space is limited and most goods we value are still scarce instrumental goods — but when the option space grows, the latent disagreement becomes more important.
  • "I don’t know if my folk history is accurate, but my understanding is that early religions and cultures had a lot in common with each other"
    • I guess it depends on how you interpret ~indexical beliefs like "I want my group to win and the group over the hill to lose" — both sides can think that same thing, but might hate any compromise solution.
    • I think this is a reason for pessimism about different values agreeing on the same states, and ∴ supportive of MPL.
  • Re brains, there are some (weak) reasons to expect finite optimal brain sizes, like the speed of light limiting communication across a large brain. A 'Jupiter brain' is not very different from many smaller brains with high-bandwidth (but laggy) communication.
  • I doubt how rare near-best futures are among desired futures is a strong guide to the expected value of the future. At least, you need to know more about e.g. the feasibility of near-best futures; whether deliberative processes and scientific progress converge on an understanding of which futures are near-best, etc.
    • There is an analogous argument which says: "most goals in the space of goals are bad and lead to AI scheming; AI will ~randomly initialise on a goal; so AI will probably scheme". But obviously whenever we make things by design (like cars or whatever), we are creating things which are astronomically unlikely configurations in the "space of ways to organise matter". And the likelihood that humans build cars just doesn't have much to do with what fraction of matter state space they occupy. It's just not an illuminating frame. The more interesting stuff is "will humans choose to make them", and "how easy are they to make". (I think Ben Garfinkel has made roughly this point, as has Joe Carlsmith more recently.)
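A toy simulation can illustrate the power-law worry in the first bullet: under a heavy-tailed (Pareto) distribution of value over world-states, a tiny sliver of states holds a large share of total value, unlike under a bounded distribution. The exponent, sample size, and the choice of uniform as the contrast case are all purely illustrative assumptions, not anything from the original post.

```python
import random

random.seed(0)
n = 200_000
alpha = 1.5  # assumed Pareto tail exponent, purely illustrative

# Draw hypothetical "values of world-states" from a Pareto(alpha) distribution
# (minimum value 1) via the inverse CDF, and from a uniform distribution for contrast.
pareto = sorted((1 - random.random()) ** (-1 / alpha) for _ in range(n))
uniform = sorted(random.random() for _ in range(n))

k = n // 1000  # the top 0.1% of states
pareto_share = sum(pareto[-k:]) / sum(pareto)
uniform_share = sum(uniform[-k:]) / sum(uniform)

print(f"share of total value in top 0.1% of states: "
      f"power law {pareto_share:.1%} vs uniform {uniform_share:.2%}")
```

Under the uniform distribution the top 0.1% of states hold roughly 0.1% of total value; under the power law they hold a far larger share — which is exactly why the functional form (and whether it's capped) matters so much for MPL.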
64% disagree

Partly this is because I think “extinction” as defined here is very unlikely (<<1%) to happen this century, which upper bounds the scale of the area. I think most “existential risk” work is not squarely targeted at avoiding literal extinction of all Earth-originating life.

I just googled “Phil Trammell new product varieties from AI could introduce ambiguities in accounting for GDP” because I wanted something to link to, and saw you'd posted this. Thanks for writing it up!


It's worth noting that the average answers to “How much financial compensation would you expect to need to receive to make you indifferent about that role not being filled?” were $272,222 (junior) and $1,450,000 (senior).

And so I think that just quoting the willingness-to-pay dollar amounts to hire the top over the second-preferred candidate can be a bit misleading here, because it's not obvious to everyone that, in this context, WTP amounts are typically much higher than salaries. If the salary is $70k, for instance, and the org's WTP to hire you over the second-preferred candidate is $50k, it would be a mistake to infer that you are perceived as 3.5 times more impactful.

Another way of reading this is that the top junior and senior hires are perceived as about 23% and 46% more 'impactful', respectively, than the second-preferred hire in WTP terms on average. I think this is a more useful framing.

And then eyeballing the graphs, there is also a fair amount of variance in both sets of answers, where perceptions of top junior candidates' 'impactfulness' appear to range from ~5–10% higher to ~100% higher than the second-best candidate. That suggests it is worth at least asking about replaceability, if there is a sensitive way to bring it up!
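To make the arithmetic concrete, here is a small sketch using the figures above. The $70k salary and $50k premium are the comment's hypothetical numbers, and $272,222 is the junior-role indifference average quoted earlier; combining them this way roughly recovers the ~23% framing, though it is my own illustrative calculation, not the survey's.

```python
# Sketch of the replaceability arithmetic, using the comment's illustrative figures.
salary = 70_000               # hypothetical salary
wtp_premium = 50_000          # hypothetical WTP to hire top over second-preferred
indifference_value = 272_222  # avg. compensation to be indifferent about the
                              # junior role going unfilled (from the survey)

# Mistaken inference: treating salary as the role's full value, so the
# second-preferred candidate is "worth" only salary - premium.
naive_ratio = salary / (salary - wtp_premium)  # -> 3.5x "more impactful"

# More useful framing: compare the premium against the role's perceived
# counterfactual value, which is much larger than the salary.
sensible_ratio = indifference_value / (indifference_value - wtp_premium)

print(f"naive: {naive_ratio:.1f}x; sensible: +{sensible_ratio - 1:.1%}")
```

The same $50k premium reads as "3.5x more impactful" against the salary but only ~23% more impactful against the role's counterfactual value — the gap the comment is warning about.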

I agree that people worry too much about replaceability overall, though.

Thanks! I'm not trying to resolve concerns around cluelessness in general, and I agree there are situations (many or even most of the really tough ‘cluelessness’ cases) where the whole ‘is this constructive?’ test isn't useful, since that can be part of what you're clueless about, or other factors might dominate.

Why do you think we ought to privilege the particular reason that you point to?

Well, I'm saying the ‘is this constructive’ test is a way to latch on to a certain kind of confidence, viz. the confidence that you are moving towards a better world. If others also take constructive actions towards similar outcomes, and/or in the fullness of time, you can be relatively confident you helped get to that better world.

This is not the same thing as saying your action was right, since there are locally harmful ways to move toward a better world. And so I don't have as much to say about when or how much to privilege this rule!

Who can we thank for the design? It's seriously impressive!

I just want to register the worry that the way you've operationalised “EA priority” might not line up with a natural reading of the question.

The footnote on “EA priority” says:

By “EA priority” I mean that 5% of (unrestricted, i.e. open to EA-style cause prioritisation) talent and 5% of (unrestricted, i.e. open to EA-style cause prioritisation) funding should be allocated to this cause.

This is a bit ambiguous (in particular, over what timescale), but if it means something like “over the next year” then that would mean finding ways to spend ≈$10 million on AI welfare by the end of 2025, which you might think is just practically very hard to do even if you thought that more work on current margins is highly valuable. Similar things could have been said for e.g. pandemic prevention or AI governance in the early days!
