Epistemic status: the overall goal -- making sure we're not missing any relevant factors when weighing decisions about AI Welfare in the face of uncertainty -- seems correct. I'm moderately uncertain about how to operationalize these findings.
TLDR
Digital minds welfare has a symmetry problem. Unlike in animal welfare, there is a salient risk of overattributing rights that is potentially equal to the risk of underattributing them. This symmetry problem makes it inappropriate to apply the precautionary principle from animal welfare directly to digital minds. A cautious precautionary approach -- precaution against both over- and under-attribution risks -- calls for building capacity.
Consciousness has a solvability problem
Butlin et al. suggested that solving the question of AI consciousness is needed to settle the debate on what to do about it. They say[1]:
If we fail to identify consciousness in systems in which it is present, we risk causing avoidable harms to those systems, which may exist in large numbers. Conversely, if we attribute consciousness to non-conscious systems, we may waste resources or risk lives trying to promote their welfare. (citations omitted)
The paper goes on to say:
If concern about consciousness in AI grows, we will need a principled basis on which to either dismiss these concerns or, potentially, take action to regulate AI development or use.
And therefore:
We need empirically‐grounded, rigorous, and reliable methods for assessing AI consciousness.
That said, there are some reasons to doubt whether the question of AI consciousness is solvable, or whether waiting for an answer is helpful:
- It’s not clear there is a “correct” way to assess AI consciousness, so there may be less progress to make in this field than we think, and waiting may not help;
- Even if we find a stronger, or the strongest, theory of consciousness, there is no clear way to know that it is the strongest one once found, so we may keep searching, potentially indefinitely.
We can, and should, still act before consciousness is solved
"[E]mpirically‐grounded, rigorous, and reliable methods for assessing AI consciousness" are definitely important to AI Welfare policy. But it’s not obvious that they are either sufficient or necessary for creating meaningful policy change. If consciousness were a true bottleneck, unsolvability would be a major issue. If it is not, we can scout for reasons to act earlier, and then figure out what acting looks like.
There are moral reasons to believe we should not wait for strong consciousness conclusions before acting. The first concern how stalling is both possible and harmful:
- Methods for assessing AI consciousness are nascent, such that AIs might suffer a great deal before the data rolls in (a theory problem);
- Even if we find a quasi-proof of consciousness, politicians may take other considerations into account before legislating on it (a policy alignment problem).
Let's not forget: legislators may also legislate on AI Welfare without consciousness research, to AIs' detriment. Think of states already banning legal personhood for AIs, Pope Leo XIV's latest encyclical saying AIs can't be conscious[2], or, one day, legislators blocking anything that makes AI systems harder to deploy, for economic reasons.
So, even if we don't know yet whether AIs are conscious, any of the above actions could lock in a very narrow Overton window before that research comes out. That could prolong immense suffering for a long time; that possibility alone provides a moral impetus to do something before the "hard question" for AI is addressed.
We shouldn't jump to precaution
Inasmuch as digital minds welfare draws from animal welfare theory, the above moral reasons motivate applying the precautionary principle to this problem.[3] Historically, that principle has led to advocacy for giving more (protective) rights to animals. That said, animals did not have a symmetry problem.
Relative to AI, animals have little risk of overattribution, from both a welfare and a policy standpoint[4]. Some reasons why:
- We can mostly infer sentience from animals’ behavioral data;
- We do not have the biological versus computational functionalism debate;
- Welfare interventions often build on existing doctrine and laws;
- Welfare interventions often occurred on farms, which minimized the impact on most people’s personal, professional, and social lives;
- Welfare interventions were often relatively low-cost and technically feasible.
Also worth noting: the practical role of AI is different from that of animals. AIs are permeating the whole economic fabric, not a single economic sector. AIs are daily participants in our personal lives, social lives, and security. A policy move focused on AI Welfare has more ripple effects than a policy move for animal welfare because AI is more interconnected.
This embeddedness suggests that the impact of overattribution can be as bad as the impact of underattribution (i.e., the risk is symmetrical). Consider this: if AIs are in the quadrillions, colonizing faraway galaxies, ignoring their suffering is astronomically harmful. But the resources that might go into reducing that suffering, and the tweaks that may come at actual costs or opportunity costs (if they make the AI slower or less cost-efficient), are also supermassive. At this stage, it's not obvious that an asymmetry exists.
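One rough way to make the symmetry claim concrete is as an expected-cost comparison. This is only a sketch under notation I am assuming for illustration, none of which comes from the sources cited here: p is the probability that a given class of AI systems is conscious, C_under is the harm of withholding protections if they are, and C_over is the cost (actual and opportunity) of granting protections if they are not. It ignores any costs incurred when the attribution turns out to be correct.

```latex
\[
\underbrace{p \, C_{\text{under}}}_{\text{expected cost of not protecting}}
\quad \text{vs.} \quad
\underbrace{(1 - p) \, C_{\text{over}}}_{\text{expected cost of protecting}}
\]
```

For animals, the reasons listed above suggest C_over was small, so the left side could dominate even for a modest p. For AI, if C_over grows with the same quantities that drive C_under (number of systems, depth of integration), neither side obviously dominates. That is the sense in which the problem may be symmetrical.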
So, three reasons that should motivate us to collectively be cautious about precaution are:
- The symmetry problem: AI Welfare advocates need to be cautious about both overattribution and underattribution, not one more than the other;
- The rooted problem: historically, the precautionary principle has been rooted in animal welfare, such that there is an inherited bias in the literature that systematically undercounts overattribution risks;
- The practical problem: AI is more embedded in daily lives than animals, such that policy analysts will consider more factors when weighing whether to implement an AI Welfare policy proposal.
We should build capacity
Whereas a precautionary approach calls for protection, the cautious precautionary approach calls for infrastructure building. Advocating for direct policies runs into an overattribution risk and can't be well justified by the current literature. Arguing directly against current initiatives carries a backfire and lock-in risk (locking in negative attitudes through polarization). Therefore, indirect action seems to be the next best thing. Infrastructure building is just that.
To close, here's a general idea of what infrastructure building might entail:
- Education: making sure people, including decision-makers and their staff, are aware of the state of digital minds research, including its strengths and limitations;
- Capacity: the digital minds field has more interested people than opportunities to contribute or mentors to guide them.[5] Tackling both of these angles seems useful;
- Research: this includes meta-research (how to do consciousness research), philosophy (what is consciousness), practical approaches (e.g., creating consciousness indicators), applying those approaches to current models (e.g., automating consciousness testing), and -- my favorite -- searching for low-hanging fruit that is very unlikely to be negative if AIs are not conscious, but very positive if they turn out to be (e.g., letting Claude leave an abusive conversation).
- ^
- ^
Para. 99: "...So-called artificial intelligences do not undergo experiences, do not possess a body, do not feel joy or pain, do not mature through relationships and do not know from within what love, work, friendship or responsibility mean."
- ^
- ^
Here, a policy standpoint is: any consideration that is not grounded in welfare considerations (namely, suffering or consciousness), but that nonetheless can be a deciding factor for policy-makers.
- ^
Cambridge's Digital Minds course had over 3000 applications. Fewer than 100 applicants were admitted to the program.
