Summary
Impact markets that encourage retrospective funding (especially those that allow resale of impact) have a severe downside risk: they can incentivize risky projects that are likely to be net-negative, because they allow people to profit if they cause positive impact while inflicting no cost on them if they cause negative impact. This risk is hard to mitigate.
Establishing impact markets is therefore itself such a risky project. To avoid the conflict-of-interest issues that arise, work to establish impact markets should only ever be funded prospectively (never retrospectively).
The risk
Suppose the certificates of a risky project are traded on an impact market. If the project ends up being beneficial, the market allows the people who own the certificates to profit. But if the project ends up being harmful, the market does not inflict a cost on them. The certificates of a project that ended up being extremely harmful are worth as much as the certificates of a project that ended up being neutral, namely nothing. Therefore, even if everyone believes that a certain project is net-negative, its certificates may be traded for a high price due to the chance that the project will end up being beneficial.[1]
Impact markets can thus incentivize people to create or fund net-negative projects. Denis Drescher used the term "distribution mismatch" to describe this risk, referring to the mismatch between the probability distribution of investor profit and the probability distribution of the project's value.
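To make the mismatch concrete with made-up numbers (purely for illustration): suppose a project has a 10% chance of producing impact that retro funders would buy for $1M, and a 90% chance of causing harm of comparable magnitude. Everyone may agree that its ex-ante EV is deeply negative, yet an investor's expected payoff from its certificates is 0.1 × $1M + 0.9 × $0 = $100k, because the harmful branch costs the certificate holders nothing.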
It seems especially important to prevent the risk from materializing in the domains of anthropogenic x-risks and meta-EA. Many projects in these domains can cause a great deal of accidental harm: for example, they can draw attention to info hazards, run harmful outreach campaigns, conduct dangerous experiments (e.g. in machine learning or virology), shorten AI timelines, or intensify competition dynamics among AI labs.
Mitigating the risk is hard
The Toward Impact Markets post describes an approach that attempts to mitigate this risk. The core idea is that retro funders should consider the ex-ante EV rather than the ex-post EV if the former is smaller. (The details are more complicated; a naive implementation of this idea would incentivize people to launch a safe project and later expand it to include high-risk high-reward interventions.)
We think that this approach cannot be relied upon to sufficiently mitigate the risk, for the following reasons:
- For that approach to succeed, retro funders must be familiar with it and be sufficiently willing and able to adhere to it. However, some potential retro funders are more likely to use a much simpler approach, such as "you should buy impact that you like".
- Other things being equal, simpler approaches are easier to communicate, more appealing to potential retro funders, more prone to become a meme and a norm, and more likely to be advocated for by teams who work on impact markets and want to get more traction.
- If there is no way to prevent anyone from becoming a retro funder, being careful about choosing/training the initial set of retro funders may not help much, especially if the market allows people to profit from outreach interventions that attract new, less careful retro funders.
- The price of a certificate tracks the maximum amount of money that any future retro funder will be willing to pay for it. Prudent retro funders therefore do not (significantly) offset the influence of imprudent retro funders on the prices of certificates of net-negative projects (see the toy sketch after this list).
- Traditional (prospective) charitable funding can have a similar dynamic; it takes only one funder to support a project even if everyone else thinks it's bad. Impact markets make the problem much worse, though, because they add variance from uncertainty about project outcomes on top of variance in funder views.
- Suppose that a risky project that is ex-ante net-negative ends up being beneficial. If retro funders attempt to evaluate it after it already ended up being beneficial, hindsight bias can easily cause them to overestimate its ex-ante EV. This phenomenon can make the certificates of net-negative projects more appealing to investors, even at an early stage of the project (before it is known whether the project will end up being beneficial or harmful).
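As a toy sketch of the "maximum, not average" dynamic mentioned above (our own illustration, with made-up numbers), consider four retro funders evaluating the certificate of an ex-ante net-negative project that happened to end up beneficial:

```python
# Toy model (made-up numbers): in a resale market, a certificate's price
# tracks the maximum willingness to pay among retro funders, so many
# prudent funders cannot offset one imprudent funder.

# Hypothetical willingness to pay (in $) for the certificate of an
# ex-ante net-negative project that happened to end up beneficial.
wtp = {
    "prudent funder A": 0,        # prices the (negative) ex-ante EV at zero
    "prudent funder B": 0,
    "prudent funder C": 0,
    "imprudent funder": 500_000,  # pays for the observed ex-post impact
}

average_wtp = sum(wtp.values()) / len(wtp)  # 125_000: what offsetting would require
market_price = max(wtp.values())            # 500_000: what resale actually delivers

print(f"average WTP: ${average_wtp:,.0f}; market price: ${market_price:,.0f}")
```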
The conflict of interest problem for establishing impact markets
Can we just trust that people interested in establishing impact markets will do so only if it's a good idea? Unfortunately, the incentivization of risky projects applies at this level too. If someone establishes an impact market and it has large benefits, they might expect to sell their impact in establishing it for large amounts of money. If, on the other hand, the market causes large harms, they won't lose large amounts of money.
Establishing impact markets would probably involve many high-stakes decisions under great uncertainty. (For example: should an impact market be launched? Should it be decentralized? Should a certain person be invited to serve as a retro funder? Should certain certificates be deleted? What instructions should be communicated to potential market participants?) We should protect the integrity of these decisions by insulating them from conflicts of interest.
This point seems important even conditional on the people involved being the most careful and EA-aligned people in the world: they are still human, and human judgment is likely to be affected by biases and self-deception when a huge financial profit is at stake.
Suggestions
- Currently, launching impact markets seems to us (non-robustly) net-negative. The following types of impact markets seem especially concerning:
- Decentralized impact markets (in which there are no accountable decision makers who can control or shut down the market).
- Impact markets that allow certificates for risky interventions, and especially interventions that are related to the impact market itself (e.g. recruiting new retro funders).
- On the other hand, we’re excited about work to further understand the benefits and costs of different funding structures. If there were a robust mechanism to allow the markets to avoid the risks discussed in this post (& ideally handle moral trade as well), we think impact markets could have very high potential. We just don’t think we’re there yet.
- In any case, launching an impact market should not be done without (weak) consensus among the EA community, in order to avoid the unilateralist's curse.
- To avoid tricky conflicts of interest, work to establish impact markets should only ever be funded prospectively. Retro funders should commit to not buying the impact of work that led to impact markets (at least for work done before the incentivization of net-negative projects has been robustly resolved, if it ever is). The EA community should socially disapprove of anyone who worked on impact markets trying to sell the impact of that work.
- All of this relates to markets which encourage retrospective funding (especially but not exclusively if they also allow for the resale of impact).
- In particular, this is not intended to apply to introducing market-like mechanisms like explicit allocation of credit between contributors to projects. While such mechanisms may be useful for supporting impact markets, they are also useful in their own right (for propagating price information without distorting incentives), and we’re in favour of experiments with such credit allocation.
[1] The risk was probably first pointed out by Ryan Carey.
Ofer (and Owen), I want to understand and summarize your cruxes one by one, in order to pass your Ideological Turing Test well enough that I can regenerate the core of your perspective. Consider me your point person for communications.
Crux: Distribution Mismatch of Impact Markets & Anthropogenic X-Risk
If I understand one of the biggest planks of your perspective correctly, you believe that the utility of x-risk projects follows a high-variance normal distribution centered around 0, such that x-risk projects can often increase x-risk rather than decrease it. I have been concerned for a while that the x-risk movement may itself be increasing x-risk, so I am quite sympathetic to this claim, though I do believe some significant fraction of potential x-risk projects approach being robustly good. That said, I think we basically agree that a large subset of mathematically realisable x-risk projects would actually increase x-risk; it's harder to be sure about that share in practice with real x-risk projects, given that people generally (if not totally) avoid the obviously bad stuff.
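A minimal sketch of why this crux matters for certificate prices (my own toy model, with assumed parameters, not anything from your post): even if x-risk project outcomes were drawn from a zero-mean normal distribution, flooring the harmful tail at zero for certificate holders yields a positive expected investor payoff.

```python
import random

# Toy model (assumed parameters): project impact drawn from a
# high-variance normal distribution centered at 0.
random.seed(0)
outcomes = [random.gauss(0, 1) for _ in range(100_000)]

social_ev = sum(outcomes) / len(outcomes)
# Certificate holders keep the upside but bear none of the downside:
investor_ev = sum(max(0.0, x) for x in outcomes) / len(outcomes)

print(f"social EV ≈ {social_ev:.3f}")      # ≈ 0
print(f"investor EV ≈ {investor_ev:.3f}")  # ≈ 0.40, i.e. E[max(0, X)] = 1/sqrt(2*pi)
```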
The examples you are most concerned by in particular are biosecurity and AI safety (as mentioned in a previous comment of yours), due to the potential info hazards of posts on the EA Forum, as well as the meta-EA concerns mentioned above. You have therefore suggested that impact markets should not deal with these causes, either early on (such as during our contest) or, presumably, indefinitely.
Let me offer at least one example set of particular submissions that may fall under these topics; let me know what you think of them.
I was thinking it would be quite cool if Yudkowsky and Christiano submitted certificates for their respective posts, 'List of Lethalities' and 'Where I agree and disagree with Eliezer'. These are valuable posts in my opinion, and they would help grow an impact marketplace.
My model of you would say either that:
1) funding those particular posts is net bad, or
2) funding those two posts in particular may be net good, but it sets a precedent that will cause there to be further counterfactual AI safety posts on EA Forum due to retroactive funding, which is net bad, or
3) posts on the EA Forum/LW/Alignment Forum being further incentivized would be net good (minus stuff such as infohazards, etc.), but a more mature impact market at scale risks funding the next OpenAI or other such capabilities project, so it's not worth retroactively funding forum posts if doing so risks causing that.
I am tentatively guessing your view is something at least subtly different from those rough disjunctions, though not too different.
Looking at our current submissions empirically, my sense is that the potentially riskiest certificate we have received is 'The future of nuclear war' by Alexei Turchin. The speculation in it could potentially provide new ideas to bad actors. I don't know; I haven't read or thought about this one in detail yet. For instance, core degassing could be a new x-risk, but it also seems highly unlikely. This certificate could also be the most valuable. My model of you says this certificate is net-negative. I would agree that it may be an example of the sort of situation where some people believe a project is a positive externality and some believe it's a negative externality, but the distribution mismatch means it's valued positively by a marketplace that can observe the presence of information but not its absence. Or maybe the market thinks riskier stuff may win the confidence game: 'variance is sexy'. This is a very provisional thought and not anything I would clearly endorse; I respect Alexei's work quite highly!
After your commentary saying it would be good to ban these topics, I was considering conceding that condition, because it doesn't seem too problematic to do so for the contest. By and large I still think that, though, again, I would specifically quite like to see those two AI posts submitted if the authors want that.
I'm curious to know your evaluation of the following possible courses of action, particularly by what percentage each would reduce your concern relative to other issues:
That list is just a rough mapping of potential actions. I have probably not characterized your position well enough to offer a full menu of actions you may like to see taken on this issue.
tl;dr: I'm basically curious 1) how much you think the risk is dominated by the distribution mismatch applying specifically to x-risk vs., say, global poverty; 2) on which timeframes it is most important to shape the cause scope of the market in light of that (now? at full scale? both?); and 3) whether banning x-risk topics from early impact markets (in ~2022) is a significant risk reducer by your lights.
(Meta note: I will drop in more links and quotes some time after publishing this.)
I think that most interventions that have a substantial chance to prevent an existential catastrophe also have a substantial chance to cause an existential catastrophe, such that it's very hard to judge whether they are net-positive or net-negative (due to complex cluelessness dynamics that are caused by many known and unknown crucial considerations).