I'm new to EA and still getting oriented. Will update more later.
Sorry in advance for the long comment. I'm trying to explain some statistics concepts in a helpful, accessible way for a broad audience, because I think it's really important to handle uncertainty well when we make important decisions.
While I love the idea of a BOTEC model for comparing the impact of community building interventions, this formula seems susceptible to the flaw of averages. The flaw of averages is the idea that systematic errors occur when we base calculations on the average (expected value) of uncertain inputs, rather than on the entire distribution of possible inputs.
So when your formula uses i (the average impact potential of participants), that seems like a potentially major oversimplification, because in reality you might see a wide range of impact potentials within a single intervention. Relying on averages in this way is known to mislead decision making. Fields like risk management have to pay attention to this (flood damage modeling is one specific example), but it crops up all over the place.
For a simple example: say you are running a fundraiser targeting students and parents (I'm using $ because it's easy to quantify and understand). You expect to reach 90 students and only 10 parents. You expect the likelihood increase for making a donation to be 0.8 for students and 0.4 for parents.
The vast majority of participants are students, so you decide to classify your "average giver" as a student (since student vs. parent are discrete categories). Having decided that the "average" target of the fundraiser is a student, you predict that the average participant who makes a donation will give $10. So if you reach 100 people, by the formula, your expected value is $10 * 100 * 0.8 = $800. Your predicted average donation is $8 per person reached.
However, the reality is a bit different, because the 10 parents your fundraiser reaches are not "average" participants. Parents in this scenario have much greater disposable income, such that the parents who donate will each give $100. So now, your expected value is $10 * 90 * 0.8 + $100 * 10 * 0.4 = $1,120. Your actual average donation is now $11.20 per person reached, which is a 40% increase over your predicted average donation.
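To make the arithmetic easy to check, here is a minimal Python sketch of the two calculations. The numbers come from the fundraiser example; the variable names and the dictionary layout are just my own illustration, not anything from the original formula.

```python
# Figures from the fundraiser example; names and layout are illustrative only.
groups = {
    "students": {"reached": 90, "p_donate": 0.8, "gift": 10.0},
    "parents": {"reached": 10, "p_donate": 0.4, "gift": 100.0},
}

total_reached = sum(g["reached"] for g in groups.values())

# Shortcut: treat every participant as the "average giver" (a student).
typical = groups["students"]
naive_total = total_reached * typical["p_donate"] * typical["gift"]

# Per-group calculation: add up expected donations one group at a time.
grouped_total = sum(
    g["reached"] * g["p_donate"] * g["gift"] for g in groups.values()
)

print(f"naive:   ${naive_total:,.2f} total, ${naive_total / total_reached:.2f} per person")
print(f"grouped: ${grouped_total:,.2f} total, ${grouped_total / total_reached:.2f} per person")
print(f"gap: {grouped_total / naive_total - 1:.0%}")
```

The only difference between the two totals is whether the 10 parents are folded into the "average" participant or counted separately.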
This is a simplified/exaggerated example of how heavy-tailed distributions (roughly speaking, the kind of distribution that produces black swan events) can distort statistics. Quantities like income and wealth are heavy-tailed in reality. I think you could make the case that actual impact as an outcome (vs. impact potential) is heavy-tailed as well. Open Philanthropy seems to agree with this, based on their Hits-based Giving approach. And you actually mention that the "Open Phil Survey uses an impact scale of four orders of magnitude between people working on longtermist causes".
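For intuition about what a heavy tail does to an average, here is a small simulation sketch. It assumes a lognormal distribution with sigma = 2 as a stand-in for a heavy-tailed quantity. That choice is mine, purely for illustration, and is not a claim about how impact is actually distributed.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Assumption: lognormal(mu=0, sigma=2) as an illustrative heavy-tailed quantity.
samples = sorted(random.lognormvariate(0.0, 2.0) for _ in range(100_000))

mean = statistics.fmean(samples)
median = statistics.median(samples)

# Share of the grand total contributed by the largest 1% of draws.
top_count = len(samples) // 100
top_share = sum(samples[-top_count:]) / sum(samples)

print(f"mean = {mean:.2f}, median = {median:.2f}")
print(f"share of total from the top 1%: {top_share:.0%}")
```

In a run like this the mean sits far above the median, and a small fraction of the draws accounts for a large share of the total. That is why a "typical" participant can be a poor guide to the overall result.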
"The problem with observations or events from a heavy tail is that even though they are frequent enough to make a difference, they are rare enough to make us underestimate them in our experience and intuitions." This article "Heavy Tails and Altruism: When Your Intuition Fails You" has more detail on the topic.
Maybe your formula could allow for summation over different projected classes of participants? This could help it account for lower-likelihood/lower-frequency, but higher-impact-potential, participants. Since interventions could vary significantly in how relevant these "black swans" are, accounting for them seems important to me. You may be hinting at this approach in your explanation of example 2.
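As a rough sketch of what that summation could look like (the symbols are my own illustrative notation, not taken from your formula): let n_k be the number of participants reached in class k, p_k the likelihood increase for that class, and i_k the impact per participant in that class.

```latex
\[
  \text{Expected impact} \;=\; \sum_{k=1}^{K} n_k \, p_k \, i_k
\]
```

With the fundraiser numbers this gives 90 * 0.8 * 10 + 10 * 0.4 * 100 = 1,120. Note that collapsing the sum into N * (average p) * (average i) only gives the same answer when p_k and i_k are uncorrelated across classes. In the fundraiser example the true averages are 0.76 and $19, and 100 * 0.76 * 19 = 1,444, so even using the correct averages misses the mark.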
Given that we care more about actual impact than impact potential, I personally feel pretty cautious about promoting a movement-wide approach to community building that might limit creativity/innovation/experimentation, and that potentially favors narrowing down over broadening outward.
P.S. - I want to acknowledge that it's much easier to critique other people's work than to attempt something from scratch. So I also wanted to thank you for writing this up and sharing it! I like the overall idea, enjoyed learning about the existing approaches thanks to your research, and appreciate you working on community building so thoughtfully. Thank you!
With regard to BOTEC tools, if Excel interoperability seems important (and I agree that it does), I'd encourage anyone interested in working on this to check out SIPmath (an Excel plugin for Monte Carlo simulation). Creating a SIPmath model requires the plugin, but once the model is created, it will run for anyone who has Excel (even if they don't have the plugin). SIPs (Stochastic Information Packets) are also platform-agnostic, which seems like it could be useful.
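For anyone unfamiliar with Monte Carlo simulation, here is a minimal sketch of the idea in plain Python. This is not SIPmath and does not use its file format; it just reuses the fundraiser numbers from my earlier comment (90 students at 0.8 and $10, 10 parents at 0.4 and $100) and treats each person's donation as an independent yes/no draw, which is my own simplifying assumption.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable


def simulate_total():
    """One simulated fundraiser: each person independently donates or not."""
    students = sum(10 for _ in range(90) if random.random() < 0.8)
    parents = sum(100 for _ in range(10) if random.random() < 0.4)
    return students + parents


# Run many trials and look at the whole distribution, not just the average.
totals = sorted(simulate_total() for _ in range(10_000))
n = len(totals)

mean_total = statistics.fmean(totals)
low = totals[n // 20]            # roughly the 5th percentile
high = totals[n - n // 20 - 1]   # roughly the 95th percentile

print(f"mean total: ${mean_total:,.0f}")
print(f"about 90% of runs fall between ${low:,} and ${high:,}")
```

The point of the output is the range, not only the mean: a single expected value hides how much the result can vary from one run of the intervention to the next.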