I joined the psychology department at UCLA as an Assistant Professor in July 2023. Prior to that, I was an independent research group leader at the MPI for Intelligent Systems in Tübingen. I completed my Ph.D. in the Computational Cognitive Science Lab at UC Berkeley in 2013, obtained a master’s degree in Neural Systems and Computation from ETH Zurich, and completed two simultaneous bachelor's degrees in Cognitive Science and Mathematics/Computer Science at the University of Osnabrück.
The Gemini summary is inaccurate. Instead, the key idea is to ask people to rate each experience on an unconstrained scale, anchored to a reference point that is the same for everyone. For instance, one could ask people to place their palm on a desk, then put a jug filled with three gallons of water on top of it, and then ask, for different events X: "If the intensity of the pain you are feeling now is 20, what number best represents the intensity of the suffering you felt when X happened?"
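To make the rescaling concrete, here is a minimal sketch of how such free-magnitude ratings could be expressed in units of the shared anchor. The event names, the numbers, and the to_anchor_units helper are illustrative assumptions of mine, not part of the proposal itself.

```python
# Minimal sketch of the shared-anchor rating procedure described above.
# ANCHOR_RATING = 20 is the value assigned to the shared physical reference
# (the water jug pressing on the palm); all other numbers are hypothetical.

ANCHOR_RATING = 20.0

def to_anchor_units(rating: float) -> float:
    """Express a free-magnitude rating as a multiple of the anchor pain."""
    return rating / ANCHOR_RATING

# Hypothetical responses from one participant for different events X.
ratings = {"stubbed toe": 10.0, "migraine": 60.0, "broken arm": 240.0}

for event, rating in ratings.items():
    print(f"{event}: {to_anchor_units(rating):.1f}x the anchor pain")
```

Because the anchor is a physical stimulus everyone actually experiences, ratings from different people land on a comparable scale even though the response scale itself is unconstrained.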
I also find it problematic that they end the paragraph with "QED." "QED" is a technical term used to mark the end of a mathematical proof. The quoted verbal argument clearly does not meet the rigorous standards of mathematical proof. This looks like an attempt to exploit superficial, intuitive heuristics to persuade readers to accept the conclusion with more confidence than the information in the quoted paragraph warrants.
Contrary to their claim that "it would have been very hard to predict that humans would like ice cream, sucralose, or sex with contraception," I think it was predictable that such preferences would likely result from natural selection under constraints. In each of these examples, a mechanism that evolved to detect the achievement of an instrumentally important subgoal is triggered by a stimulus that i) is very similar to the stimuli an animal would experience when the subgoal is achieved, and ii) did not exist in the evolutionary environment. We should expect any (partially or fully) optimized bounded agent to have detectors for the achievement of instrumentally important subgoals. We should expect these detectors to analyze only a limited number of features with limited precision. And we should expect the few comparisons they do perform to be optimized for distinctions that were important for success on the training data.
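As a toy illustration of this argument (my own construction, not anything from the quoted exchange), consider a bounded "sugar detector" that inspects only one cheap feature, sweetness. In the training environment sweetness perfectly tracks calories, so the bounded detector is optimal there; on a novel stimulus like sucralose, it fires anyway.

```python
# Toy model of a bounded subgoal detector that checks one cheap feature.
# Everything here is illustrative; the names are my own.

from dataclasses import dataclass

@dataclass
class Stimulus:
    name: str
    sweetness: float  # the cheap, observable proxy feature
    calories: float   # the instrumentally important quantity (not observed)

def sugar_detector(s: Stimulus, threshold: float = 0.5) -> bool:
    """Bounded detector: inspects only sweetness, never calories."""
    return s.sweetness > threshold

# Training environment: sweetness and calories co-occur perfectly.
training = [Stimulus("ripe fruit", 0.8, 60.0), Stimulus("leaf", 0.1, 5.0)]

# Novel stimulus that did not exist in the "evolutionary environment":
sucralose = Stimulus("sucralose", 0.9, 0.0)

print(sugar_detector(sucralose))  # True: fires despite zero calories
```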
Given that these failures were predictable, it should be possible to systematically predict many analogous failures that might result from training AI systems on specific data sets or (simulated) environments. If we can predict such failures of generalization beyond the training data, then we might be able to prevent them, mitigate them, or regulate real-world applications so that AI systems won't be applied to inputs where misclassification is likely and problematic. The last approach is analogous to outlawing highly addictive drugs that mimic neurotransmitters signaling the achievement of instrumentally important subgoals.
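As a hedged sketch of that last option, an application could simply abstain on inputs that are far from anything in the training data. The novelty score, the threshold, and the guarded_predict wrapper below are placeholder choices of mine, not an established method.

```python
# Sketch: refuse to act on inputs far from the training distribution,
# using distance to the nearest training example as a crude novelty score.

import numpy as np

def novelty(x: np.ndarray, train: np.ndarray) -> float:
    """Euclidean distance from x to its nearest training example."""
    return float(np.min(np.linalg.norm(train - x, axis=1)))

def guarded_predict(model, x, train, max_novelty=1.0):
    """Apply the model only where misclassification seems unlikely."""
    if novelty(x, train) > max_novelty:
        return None  # abstain: the input is unlike the training data
    return model(x)

train = np.array([[0.0, 0.0], [1.0, 1.0]])
print(guarded_predict(lambda x: "familiar", np.array([0.1, 0.2]), train))  # familiar
print(guarded_predict(lambda x: "familiar", np.array([5.0, 5.0]), train))  # None
```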
Morality is Objective
I believe that the purpose of morality is to promote everyone's well-being. We can use the scientific method to determine how each action, rule, trait, and motive affects overall well-being. Science is objective. Therefore, it is possible to make objective statements about the morality of actions, motives, and traits.
The method would probably work less well with an imagined reference scenario unless people have experienced something similar. I also agree that one could pair this method with the time-tradeoff method, which might work better because people are better at making decisions than at making numerical judgments. On the other hand, the subjective decision utility of pain is probably not linear in its duration. By that, I don't mean that the amount of suffering fails to increase linearly with the duration of pain; rather, I am claiming that people's decision-making weights the duration of pain in ways that may be irrational.
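To make that distinction concrete, here is a toy model; the linear-experience assumption and the sublinear decision weighting (the exponent ALPHA) are purely illustrative assumptions of mine.

```python
# Toy contrast between experienced suffering (linear in duration, by
# assumption) and decision utility that weights duration sublinearly
# ("duration neglect"). ALPHA is a hypothetical exponent.

ALPHA = 0.6

def experienced_suffering(intensity: float, minutes: float) -> float:
    return intensity * minutes  # linear in duration

def decision_utility(intensity: float, minutes: float) -> float:
    return intensity * minutes ** ALPHA  # sublinear in duration

short, long = 10.0, 100.0  # durations in minutes
print(experienced_suffering(5, long) / experienced_suffering(5, short))  # 10.0
print(decision_utility(5, long) / decision_utility(5, short))  # ~4.0
```

On this toy model, a time-tradeoff question would make a pain that lasts ten times longer look only about four times worse, which is exactly the kind of irrational weighting of duration I have in mind.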