I do independent research on EA topics. I write about whatever seems important, tractable, and interesting (to me).
I have a website (https://mdickens.me/). Much of the content on my website gets cross-posted to the EA Forum, but I also write about some non-EA stuff like [investing](https://mdickens.me/category/finance/) and [fitness](https://mdickens.me/category/fitness/).
My favorite things that I've written: https://mdickens.me/favorite-posts/
I used to work as a software developer at Affirm.
Not OP but I would say that if we end up with an ASI that can misunderstand values in that kind of way, then it will almost certainly wipe out humanity anyway.
That is the same category of mistake as "please maximize the profit of this paperclip factory" getting interpreted as "convert all available matter into paperclip machines".
I don't cite LLMs for objective facts.
In casual situations I think it's basically okay to cite an LLM if you have a good sense of what sorts of facts LLMs are unlikely to hallucinate, namely, well-established facts that are easy to find online (because they appear a lot in the training data). But for those sorts of facts, you can turn on LLM web search and have it find a reliable source for you, and then cite that source instead.
I think it's okay to cite LLMs for things along the lines of "I asked Claude for a list of fun things to do in Toronto and here's what it came up with".
Yes, it could well be that an LLM isn't conscious on a single pass but becomes conscious across multiple passes.
This is analogous to the Chinese room argument, but I don't take the Chinese room argument as a reductio ad absurdum—unless you're a substance dualist or a panpsychist, I think you have to believe that a conscious being is made up of parts that are not themselves conscious.
(And even under panpsychism I think you still have to believe that the composed being is conscious in a way that the individual parts aren't? Not sure.)
I don't find the Turing test evidence as convincing as you present it here.
Fair enough, I did not actually read the paper! I have talked to LLMs about consciousness and to me they seem pretty good at talking about it.
I agree that if each token you read is generated by a single forward pass through a network of fixed weights, then it seems hard to imagine how there could be any 'inner life' behind the words. There is no introspection. But this is not how the new generation of reasoning models work. They create a 'chain of thought' before producing an answer, which looks a lot like introspection if you read it!
The chain of thought is still generated via feed-forward next token prediction, right?
A commenter on my blog suggested that LLMs could still be doing enough internally that they are conscious even while generating only one token at a time, which sounds reasonable to me.
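To make the generation process we're discussing concrete, here is a minimal sketch of the autoregressive loop, assuming the Hugging Face transformers library with GPT-2 as a small stand-in model (real reasoning models are far larger, but the decoding loop has the same shape). Every token, whether it ends up being read as "chain of thought" or as the final answer, comes from one forward pass through fixed weights and is then appended to the context for the next pass.

```python
# Minimal sketch of greedy autoregressive decoding, assuming the Hugging Face
# transformers library and GPT-2 as a small stand-in model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# An illustrative prompt that elicits something chain-of-thought-like.
input_ids = tokenizer("Let's think step by step:", return_tensors="pt").input_ids

for _ in range(30):
    with torch.no_grad():
        logits = model(input_ids).logits  # one forward pass through fixed weights
    next_token = logits[0, -1].argmax()   # greedy choice of the next token
    input_ids = torch.cat([input_ids, next_token.view(1, 1)], dim=1)  # append and feed back in

print(tokenizer.decode(input_ids[0]))
```

The "reasoning" text is produced by the same next-token loop as everything else; whether that loop amounts to introspection is exactly the open question here.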
That paper is long and kind of confusing, but from skimming for the relevant passages, here is how I understood its arguments against computational functionalism:
I didn't understand these arguments very well but I didn't find them compelling. I think the China brain argument is much stronger although I don't find it persuasive either. If you're talking to a black box that contains either a human or a China-brain, then there is no test you can perform to distinguish the two. If the human can say things to you that convince you it's conscious, then you should also be convinced that the China-brain is conscious.
“What probability do you put on future AI advances causing human extinction or similarly permanent and severe disempowerment of the human species?” (median: 5%, mean: 16.2%)
“What probability do you put on human inability to control future advanced AI systems causing human extinction or similarly permanent and severe disempowerment of the human species?” (median: 10%, mean: 19.4%)
Am I missing something, or are these answers nonsensical? On my reading, the second outcome is a strict subset of the first, so its probability can't be higher. But the median answer to the second question is twice as high.
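To spell out the step I'm relying on: writing $A$ for "future AI advances cause extinction or similarly permanent and severe disempowerment" and $B$ for "human inability to control advanced AI causes extinction or similarly permanent and severe disempowerment", the second event entails the first, so by monotonicity of probability

$$
B \subseteq A \implies P(B) \le P(A).
$$

Any individual respondent answering both questions coherently should give the second question a probability no higher than the first, and that ordering carries over to the median and the mean.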
I've seen a lot of writing that takes it as a premise that you shouldn't concede to a Pascal's mugging, but very little about why not.
(I can think of arguments for not conceding in the actual Pascal's mugging thought experiment: (1) ignoring threats as a game-theoretic strategy and (2) threats of unlikely outcomes constituting evidence against the outcome. Neither of these apply to caring about soil nematodes.)