I don't think the 'next-token' aspect has any bearing at all. That models emit one token at a time is just a property of the interface we give them; it doesn't limit the model's internal computation to considering only the next token. Indeed, the remarkable coherence and quality of LLM responses (including rarely, if ever, getting stuck where a sentence can't be meaningfully completed) is evidence that the model IS considering more than just the next token. And there is now direct evidence that LLMs think far ahead: https://www.anthropic.com/research/tracing-thoughts-language-model. One example: when asked to write rhyming verse, the model, while still writing out the first line, has already internally considered which words could form the rhyme in the second line.
Our use and training of LLMs is focused on next-token prediction, and for a simple model with few parameters the resulting behavior will indeed be very simple: just looking up the frequency distribution of words given the previous word, and so on. But when you search for the best model with billions of parameters, things change radically. There, the best way for the model to predict the next token is to develop ACTUAL intelligence, which includes thinking further ahead, even though our interface to the model remains the simple one.
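To make the "simple model" end of that spectrum concrete, here is a minimal sketch (my toy example, not anything from the research linked above) of a bigram predictor: it chooses the next word purely from the frequency distribution of words that followed the previous word in its training text, with no lookahead or planning of any kind.

```python
from collections import Counter, defaultdict

# Toy training text (hypothetical example corpus).
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count, for each word, how often each word follows it.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict_next(word):
    # Return the most frequent successor of `word` -- pure frequency lookup.
    return counts[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" (follows "the" twice; "mat"/"fish" once each)
```

This is the whole model: a lookup table. The argument above is that a billion-parameter model trained on the same next-token objective ends up doing something qualitatively different internally, even though its external interface is the same as this table's.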
But the next-token aspect aside: if any Turing-machine-like system is accepted as conscious, it leads down the path of panpsychism. Think of how many realizations of Turing machines exist. If you accept one as conscious, you have to accept them all. Why? Because you can transform the initial conscious program to run on any Turing machine, and since its input/output will be exactly the same in all situations, including in discussions about consciousness, it stands to reason it will be conscious in all realizations. Anything else amounts to saying that discussions about consciousness are completely unrelated to actually experiencing consciousness, and that it is in effect a coincidence that we talk about consciousness as if we are conscious, because the two are not causally related.
If we accept that all realizations of the computation are conscious (including those built from cog wheels, organ pipes, or pen-and-paper calculations), then we end up asking: OK, consciousness can run on a computer, but what counts as a computer? It is of course possible to argue that only very specific computational patterns generate consciousness, but is it really believable that this holds no matter how radically we transform the Turing machine, to the point where it becomes a matter of interpretation whether there is a Turing machine at all?
Further, we would need to accept that all those radical transformations of the Turing machine don't cause even the slightest change in the subject's experience (they can't, because input/output is identical under all the transformations).
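The transformation argument above can be illustrated with a deliberately trivial computation (my example, not the author's): two radically different realizations of the same function whose input/output behavior is provably identical on every input, so no external observation can distinguish them.

```python
from itertools import product

def parity_arithmetic(bits):
    # Realization 1: plain arithmetic.
    return sum(bits) % 2

def parity_state_machine(bits):
    # Realization 2: an explicit two-state machine, the kind one could
    # build from cog wheels or run with pen and paper.
    table = {("even", 0): "even", ("even", 1): "odd",
             ("odd", 0): "odd", ("odd", 1): "even"}
    state = "even"
    for b in bits:
        state = table[(state, b)]
    return 1 if state == "odd" else 0

# Exhaustively identical behavior on all inputs up to length 4.
for n in range(5):
    for bs in product([0, 1], repeat=n):
        assert parity_arithmetic(list(bs)) == parity_state_machine(list(bs))
```

If one realization of a program were conscious and the other not, that difference could never show up in behavior, which is exactly the point being pressed here.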
If one is not ready to accept that, then by reductio ad absurdum we need to reject the premise that Turing machines can be conscious in the first place.
Or we need to accept a panpsychist view: everything is, in some sense, conscious under some interpretation.