Researcher at the Center on Long-Term Risk. All opinions my own.
This is linked to my discussion with Jim about determinate credences (since I didn’t initially understand this concept well, ChatGPT gave me a useful explanation).
FYI, I don't think ChatGPT's answer here is accurate. I'd recommend this post if you're interested in (in)determinate credences.
To be clear, "preferential gap" in the linked article just means incomplete preferences. The property in question is insensitivity to mild sweetening.
If one was exactly indifferent between 2 outcomes, I believe any improvement/worsening of one of them must make one prefer one of the outcomes over the other
But that's exactly the point: incompleteness is not equivalent to indifference, because when you have an incomplete preference between two outcomes, a mild improvement or worsening of one of them doesn't give you a strict preference. I don't understand what you think doesn't "make sense in principle" about insensitivity to mild sweetening.
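To make the distinction concrete, here's a toy sketch in Python (my own illustration with made-up interval values, not how the linked article formalizes anything). It models each outcome's value as an interval, counts A as strictly preferred to B only when A's whole interval beats B's, and shows that mildly sweetening an incomparable option doesn't produce a strict preference, whereas sweetening an option you're exactly indifferent to does:

```python
# Toy model of incompleteness via interval-valued utilities (illustrative only).
def strictly_preferred(a, b):
    """a, b are (low, high) value intervals; strict preference requires
    a to beat b under every admissible valuation."""
    return a[0] > b[1]

A = (0.0, 10.0)        # outcome A: wide, hard-to-compare value
B = (4.0, 6.0)         # outcome B
A_plus = (0.1, 10.1)   # A mildly sweetened

print(strictly_preferred(A, B), strictly_preferred(B, A))            # False False: incomparable
print(strictly_preferred(A_plus, B), strictly_preferred(B, A_plus))  # False False: still incomparable

C, D = (5.0, 5.0), (5.0, 5.0)   # exact indifference (point values)
C_plus = (5.1, 5.1)
print(strictly_preferred(C_plus, D))  # True: sweetening an indifferent pair does flip to strict preference
```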
I fully endorse expectational total hedonistic utilitarianism (ETHU) in principle
As in you're 100% certain, and wouldn't put weight on other considerations even as a tiebreaker? That seems extreme. (If, say, you became convinced all your options were incomparable from an ETHU perspective because of cluelessness, you would presumably still all-things-considered-prefer not to do something that injures yourself for no reason.)
Thanks! I'll just respond re: completeness for now.
Why do you consider completeness self-evident? (Or continuity, although I'm more sympathetic to that one.)
Also, it's important not to conflate "given these axioms, your preferences can be represented as maximizing expected utility w.r.t. some utility function" with "given these axioms [and a precise probability distribution representing your beliefs], you ought to make decisions by maximizing expected value, where 'value' is given by the axiology you actually endorse." I'd recommend this paper on the topic (especially Sec. 4), and Sec. 2.2 here.
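For concreteness, here's a rough statement of the kind of representation theorem at issue (this gloss is mine, not from the linked paper): the von Neumann–Morgenstern theorem says that a preference relation $\succeq$ over lotteries satisfies completeness, transitivity, continuity, and independence if and only if there is some function $u$ such that

$$p \succeq q \iff \mathbb{E}_p[u] \ge \mathbb{E}_q[u],$$

with $u$ unique only up to positive affine transformation. The theorem guarantees that *some* $u$ represents your preferences; it doesn't say that $u$ is the value function of the axiology you actually endorse, which is exactly the conflation to avoid.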
I mean, it seems to me like a striking "throw a ball in the air and have it land and balance perfectly on a needle" kind of coincidence to end at exactly — or indistinguishably close to — 50/50 (or at any other position of complete agnosticism, e.g. even if one rejects precise credences).
I don't see how this critique applies to imprecise credences. Imprecise credences by definition don't say "exactly 50/50"; they're represented by a range (or set) of probability assignments rather than a single number, so there's no perfectly balanced point to land on.
Up until the last paragraph, I very much found myself nodding along with this. It's a nice summary of the kinds of reasons I'm puzzled by the theory of change of most digital sentience advocacy.
But in your conclusion, I worry there's a bit of conflation between (1) pausing creation of artificial minds, full stop, and (2) pausing creation of more advanced AI systems. My understanding is that Pause AI is only realistically aiming for (2) — is that right? I'm happy to grant for the sake of argument that it's feasible to get labs and governments to coordinate on not advancing the AI frontier. It seems much, much harder to get coordination on reducing the rate of production of artificial minds. For all we know, if weaker AIs suffer to a nontrivial degree, the pause could backfire because people would just use many more instances of these AIs to do the same tasks they would've otherwise done with a larger model. (An artificial sentience "small animal replacement problem"?)
I can accept the idea of X as an agent making decisions, and ask what those decisions are and what drives them, without implicitly accepting the idea that X has beliefs. Then "X has beliefs" is kind of a useful model for predicting their behaviour in the decision situations.
I think this is answering a different question, though. When talking about rationality and cause prioritization, what we want to know is what we ought to do, not how to describe our patterns of behavior after the fact. And when asking what we ought to do under uncertainty, I don’t see how we escape the question of what beliefs we’re justified in. E.g. betting on short AI timelines by opting out of your pension is only rational insofar as it’s rational to (read: you have good reasons to) believe in short timelines.
from my perspective the question of whether credences are ultimately indeterminate is ... not so interesting? It's enough that in practice a lot of credences will be indeterminate, and that in many cases it may be useful to invest time thinking to shrink our uncertainty, but in many other cases it won't be
I’m not sure what you’re getting at here. My substantive claim is that in some cases, our credences about features of the far future might be sufficiently indeterminate that overall we won’t be able to determinately say “X is net-good for the far future in expectation.” If you agree with that, that seems to have serious implications that the EA community isn’t pricing in yet. If you don’t agree with that, I’m not sure if it’s because of (1) thorny empirical disagreements over the details of what our credences should be, or (2) something more fundamental about epistemology (which is the level at which I thought we were having this discussion, so far). I think getting into (1) in this thread would be a bit of a rabbit hole (which is better left to some forthcoming posts I’m coauthoring), though I’d be happy to give some quick intuition pumps. Greaves here (the "Suppose that's my personal uber-analysis..." paragraph) is a pretty good starting point.
I'll just reply (for now) to a couple of parts
No worries! Relatedly, I’m hoping to get out a post explaining (part of) the case for indeterminacy in the not-too-distant future, so to some extent I’ll punt to that for more details.
without having such an account it's sort of hard to assess how much of our caring for non-hedonist goods is grounded in themselves, vs in some sense being debunked by the explanation that they are instrumentally good to care about on hedonist grounds
Cool, that makes sense. I'm all for debunking explanations in principle. Extremely briefly, here's why I think there's something qualitative that determinate credences fail to capture: if evidence, trustworthy intuitions, and appealing norms like the principle of indifference or Occam's razor don't uniquely pin down an answer to "how likely should I consider outcome X?", then I think I shouldn't pin down an answer. Instead I should suspend judgment, and say that there aren't enough constraints to give an answer that isn't arbitrary. (This runs deeper than "wait to learn / think more": I find suspending judgment appropriate even in cases where my uncertainty is resilient. Contra Greg Lewis here.)
Is it some analogue of betting odds? Or what?
No, I see credences as representing the degree to which I anticipate some (hypothetical) experiences, or the weight I put on a hypothesis / how reasonable I find it. IMO the betting odds framing gets things backwards: bets are decisions, and they're rational only insofar as the beliefs that justify them are rational. I'm not sure what would justify the betting odds otherwise.
how you'd be inclined to think about indeterminate credences in an example like the digits of pi case
Ah, I should have made clear: I wouldn't say indeterminate credences are necessary in the pi case as written, because I think it's plausible I should apply the principle of indifference here. I know nothing about digits of pi beyond the first 10, except that pi is irrational and irrational numbers' digits are wacky. I have no particular reason to think one digit is more or less likely than another, so, since there's a unique way of splitting my credence impartially across the possibilities, I end up with 50:50.[1]
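In case a concrete rendering helps, here's a tiny Python sketch of that reasoning (my own illustration, assuming the question is a binary one about an unknown digit, e.g. whether it's even):

```python
# Principle of indifference over the ten possible digit values (illustrative).
credence = {d: 1 / 10 for d in range(10)}   # no reason to favour any digit: spread evenly
p_even = sum(p for d, p in credence.items() if d % 2 == 0)
print(p_even)   # 0.5 -- the unique impartial split lands on 50:50
```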
Instead, here’s a really contrived variant of the pi case I had too much fun writing, analogous to a situation of complex cluelessness, where I’d think indeterminate credences are appropriate:
(I think forming beliefs about the long-term future is analogous in many ways to the above.)
Not sure how much that answers your question? Basically I ask myself what constraints the considerations ought to put on my degree of belief, and try not to needlessly get more precise than those constraints warrant.
I don’t think this is clearly the appropriate response. I think it’s kinda defensible to say, “This doesn’t seem like qualitatively the same kind of epistemic situation as guessing a coin flip. I have at least a rough mechanistic picture of how coin flips work physically, which seems symmetric in a way that warrants a determinate prediction of 50:50. But with digits of pi, there’s not so much a ‘symmetry’ as an absence of a determinate asymmetry.” But I don’t think you need to die on that hill to think indeterminacy is warranted in realistic cause prio situations.
Instead I'm saying that in many decision-situations people find themselves in, although they could (somewhat) narrow their credence range by investing more thought, in practice the returns from doing that thinking aren't enough to justify it, so they shouldn't do the thinking.
(I don't think this is particularly important; feel free to prioritize my other comment.) Right, sorry, I understood that part. I was asking about an implication of this view. Suppose you have an intervention whose sign varies over the range of your indeterminate credences. Per the standard decision theory for indeterminate credences, you currently don't have a reason to do the intervention: it's not determinately better than inaction. (I'll say more about this below, re: your digits of pi example.) So if by "the returns from doing that thinking aren't enough to justify it" you mean you should just do the intervention in such a case, that doesn't make sense to me.
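To illustrate what I mean by "not determinately better than inaction", here's a minimal Python sketch with hypothetical numbers (my own illustration of the kind of decision rule I have in mind, where an indeterminate credence is represented by a set of admissible probability assignments):

```python
# Minimal sketch: the intervention beats inaction only if it has higher
# expected value under *every* admissible credence (hypothetical numbers).
representor = [0.2, 0.4, 0.6, 0.8]   # admissible credences that the intervention's good outcome obtains
value_if_good, value_if_bad, inaction = 10.0, -5.0, 0.0

evs = [p * value_if_good + (1 - p) * value_if_bad for p in representor]
print(evs)                               # [-2.0, 1.0, 4.0, 7.0] -- the sign varies
print(all(ev > inaction for ev in evs))  # False: not determinately better than inaction
```

When the sign varies across the representor like this, doing the intervention isn't determinately better than not doing it, which is the situation I'm describing.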
I agree that the particular guesses we make about aliens will be very speculative/arbitrary. But "we shouldn't take the action recommended by our precise 'best guess' about XYZ" does not imply "we can set the expected contribution of XYZ to the value of our interventions to 0". I think if you buy cluelessness — in particular, the indeterminate beliefs framing on cluelessness — the lesson you should take from Maxime's post is that we simply aren't justified in saying any intervention with effects on x-risk is net-positive or net-negative (w.r.t. total welfare of sentient beings).