The improving institutional decision-making (IIDM) working group (now the Effective Institutions Project) ran a survey asking members of the EA community which topics they thought were in scope of “improving institutional decision-making” (IIDM) as a cause area. 74 individuals participated.
I have the impression that you asked people something like: is discussing dogs or cats within the scope of improving decisions in animal welfare? I would be very surprised if anybody disagreed.
It is a pity that you stop at the presentation of the results. I believe the interesting part of the story lies in the reasons why some people disagreed. Would those reasons make the positive respondents change their minds? Are those negative answers a cause for concern for IIDM?
The survey results will help target the Effective Institutions Project’s priorities and work products going forward. For example, the list of in-scope topics will form the guardrails for developing a directory of introductory resources to IIDM.
I believe this is a more interesting question, and it certainly feels like IIDM is working toward a transparent and fair decision-making approach.
However, the results are presented but not discussed. I would be curious to hear an analysis of whether the population of respondents could have had any impact on the results.
For example, the lowest-priority item, "Compare IIDM to other cause areas", could rank low because respondents are already well aware of the topic, or because they imagine the findings might not tell a favourable story.
Idealizing subjectivism: X is intrinsically valuable, relative to an agent A, if and only if, and because, A would have some set of evaluative attitudes towards X, if A had undergone some sort of idealization procedure.
I feel you've been discussing how confusing the consequences of the definition above are. So why not just drop this definition and propose a revised one?
I would propose: X is intrinsically valuable, relative to an agent A belonging to a close-influence set of agents S, if and only if, and because, A and all the agents in S would have some set of evaluative attitudes toward X, if A and all agents in S had undergone some sort of idealization procedure.
And by close-influence set, I mean a set of agents that cannot be influenced by anything outside the set.
I think that most of the concerns you are describing come from assuming that the idealisation process is personal and that there are multiple idealised evaluative attitudes toward something.
When you assume one unique evaluation, you can view the agents as all trying to discover it. At the end of the process there are no further questions, everybody agrees, and the subjective is the same as the objective. During the process you have differences, personal evaluations, changes of heart, and all the chaos you describe.
Resources and time are probably too limited to carry out the idealised process on every possible object X, but hopefully, as a human race, we can reach unanimous agreement on one or two big important questions within the next few million years.
Let me build a story to make the case for unanimous convergence.
Imagine you are a troglodyte, and you are trying to assess how far the hunting ground is. You need to estimate where you are and what time of day it is, because being at the wrong place at the wrong time means either meeting a stronger predator or missing your target. Now, how do you evaluate the way you measure time? Do you prefer to look at the sun (but perhaps it is a cloudy day, so is that a good idea)? Do you prefer to listen to your body's rhythm (when you are hungry)? Do you follow somebody's example? Do you look at the rain? Do you watch the behaviour of the animals around you? Do you draw symbols on the ground to recall your way? Do you break tree branches? Do you leave a trail of stones? Do you dig a road?
The troglodyte is probably going to face a dilemma and debate the issue strongly with his clan-mates (as happens to the agents in many of the scenarios you discuss). Nowadays, how we measure time and map locations in our everyday life is something we mostly all agree on.
That is a very nice piece of bibliography exploration software.
May I ask what the two dimensions of the graph represent?
Are they dimensions of maximal variance obtained from principal component analysis or are they two specific properties?
Do you think it could be helpful to publish the weights of the dimensions alongside the graph?
What about the number of articles: what dictates what is included and what is excluded?
Is there any way to include or exclude more articles?
What data are you using for the categorisation? Is it all objective data (such as dates and numbers of citations)? Or is there some subjective data (like journal rankings or some other hand-crafted heuristic formula)?
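To clarify what I mean by the weights of the dimensions, here is a minimal sketch, assuming (purely hypothetically) that the 2D layout comes from something like PCA over a few numeric article features; the feature names and values below are made up for illustration, not taken from your tool:

```python
# Minimal sketch, assuming a PCA-based layout (I do not know what the tool actually does;
# the hypothetical features here are year, citation count, and reference count).
import numpy as np
from sklearn.decomposition import PCA

features = np.array([
    [2015, 120, 35],
    [2018,  40, 50],
    [2021,  10, 42],
    [2012, 300, 28],
], dtype=float)

# Standardise so no single feature dominates the projection.
features = (features - features.mean(axis=0)) / features.std(axis=0)

pca = PCA(n_components=2)
coords = pca.fit_transform(features)   # 2D positions for the graph

# These are the "weights" I would find useful to see alongside the graph:
print(pca.explained_variance_ratio_)   # how much variance each axis captures
print(pca.components_)                 # how each original feature loads on each axis
```

If the two axes are instead two hand-picked properties, then simply labelling them on the graph would already answer my question.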
Your definition of moral catastrophe is based on historically measurable effects. It does not take into account internal human experiences, and it does not fully capture those subtle changes in human thinking and behaviour that could be considered immoral. I would argue that the moral catastrophe already lies in the small everyday immoral choices that slowly creep into people's minds and become normal patterns of thinking.
There are moral catastrophes that lead to multiple catastrophic events, like the idea of racial superiority, which eventually led to slavery and the Holocaust.
There are catastrophic events that are the consequence of perfectly moral habits, like the spread of a pandemic due to people taking care of their sick family members.
I would suggest revising your definition of moral catastrophe to "a pattern of thinking and behaviour that is the subtle cause of repeated, situation-independent suffering for a large group of people".
Under this definition, the idea that men and women have distinct roles in society can be regarded as a moral catastrophe, as it has caused many women to suffer unhappy lives throughout human history.
In a way, this is similar to ineffective philanthropy in general - perhaps "ineffective grantmaking" would be an appropriate heading?
That sounds like a better heading indeed. Although grantmakers define the value of a research outcome, they might not be able to promote their vision properly due to their limited resources.
However, as the grantmaking process is what defines the value of research, your heading might be misinterpreted as an inability to define valuable outcomes (which would contradict your working hypothesis).
What about "inefficient grant-giving"? "Inefficient" because sometimes resources are lost pursuing secondary goals; "grant-giving" because it specifically involves the process of selecting motivated and effective research teams.
I/we would love to get input on this mapping [...] ii. any of the problems described here is overstated.
Point 2.3 "founding priorities of grantmakers " does not sound a problem to me in the context of your analysis. In the opening of your post, you show concern in the production of a valuable results:
Instead, I will just assume that when we dedicate resources to research we are expecting some form of valuable outcome or impact.
Who is supposed to define the valuable outcome if not a grantmaker? Are you perhaps saying that specific grantmakers are not suited to defining the valuable outcome?
Nonetheless, you can see that this is a circular problem: you cannot defend the idea of efficient research toward a favourable outcome without accepting that somebody has the authority to define that outcome and thereby becomes a grantmaker.
Perhaps 'lack of alignment between grantmakers' values and researchers' values' would be a better description of the issue?
I/we would love to get input on this mapping [...] there are significant issues that jeopardize the value of research results which are not included in this post
I would say that boldness itself is a problem as much as lack of boldness is.
As there is competition to do more with fewer resources, researchers are incentivised to underestimate what they need when writing grant proposals. Research is an uncertain activity and likely benefits from a safety budget, so when project problems arise, researchers need to cut some planned activities (for example, they might avoid collecting the quality data needed for reproducibility).
Researchers are also incentivised to keep their options open and distribute their energy across several efforts. Thus, an approved short-term project might not receive full attention (as also happens to long-term projects toward their end) or might not be perfectly aligned with the researchers' interests.
On the other hand, making evaluations public is more informative for readers, who may acquire better models of reality if the evaluations are correct,
I am in agreement. Please let me note that people can still get a good model of reality even if they do not know the names of the people involved.
If evaluations did not contain the names of the subjects, do you think it would still be easy for readers to connect the evaluation to the organisations being evaluated? Perhaps you could frame the evaluation so that the links are not clear.
or be able to point out flaws if the evaluation has some errors.
Although this is the reviewer's responsibility, it would indeed be nice to have extra help. (Is this your goal?) However, the quality of the feedback you receive is linked to the amount of information you share, and specific organisation details might be important here. Perhaps you could share the detailed information with a limited set of interested people while asking them to sign a confidentiality agreement.
I'd also be curious about whether evaluators generally should or shouldn't give the people and organizations being evaluated the chance to respond before publication.
Would that make the reviewers change their minds?
If there is a specific issue the reviewer is worried about, I believe the reviewer can query the organisation directly.
If it is a more general issue, it is likely to be something the reviewer needs to research further. The reviewer probably does not have enough time to carry out the needed research, and a rushed evaluation does not help.
Nonetheless, it is important to give the organisations an opportunity to provide post-evaluation feedback, so that the reviewer has a chance to address the general issue before the next round of reviews.
Furthermore, let's not forget that one of the evaluation criteria is the ability of the applicants to introduce the problem, describe the plans clearly, and address risks and contingencies. If something big is missing, it is generally a sign that the applicant needs a bit more time to complete the idea, and the reviewer should probably advise waiting for the next round.
If that's not what organizations like the FLI are for, what are they for?
They do their best to gather data, predict events on the basis of the data, and give recommendations. However, data are not perfect, models are not a perfect representation of reality, and recommendations are not necessarily unanimous. To err is human, and mistakes are possible, especially when the foundations of the applied processes contain errors.
Sometimes people just do not have enough information, and certainly nobody can gather information if the data does not exist. Still, a decision needs to be made, at least between action and inaction, and a data-supported expert guess is better than a random guess.
Given the choice, would you prefer that nobody carried out the analysis, with no possibility of improvement? Or would you still let the experts do their job, with a reasonable expectation that most of the time the problems are solved and the human condition improves?
What if their decision had only a 10% chance of being better than a decision taken without carrying out any analysis? Would you seek expert advice to improve the odds of success, if that were your only option?
Why didn't the big EA organizations listen more?
I realise the article excerpt you showed is not an accurate estimation. Marc and Thomas also say:
The record of laboratory incidents and accidental infections in biosafety level 3 (BSL3) laboratories provides a starting point for quantifying risk. Concentrating on the generation of transmissible variants of avian influenza, we provide an illustrative calculation of the sort that would be performed in greater detail in a fuller risk analysis. Previous publications have suggested similar approaches to this problem
...
These numbers should be discussed, challenged, and modified to fit the particularities of specific types of PPP experiments.
So it looks like the calculation above was just an illustrative example, and EA did not have sufficient data to come to conclusions. Is there any other part of the article that leads you to believe the authors had strong faith in their numbers?
Given that they got this so wrong, why should we believe that their other analysis of Global Catastrophic Risk isn't also extremely flawed?
What did EA get wrong exactly? I guess they made rational decisions in a situation of extreme uncertainty.
Statistical estimation with little historical data is likely to be inaccurate. A virus leak had never turned into a pandemic before.
Furthermore, even accurate estimations will sometimes be followed by a bad outcome. If you throw 100 dice enough times, you will eventually get all 1s.
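As a minimal sketch of that point (the per-trial probability below is a number I made up for illustration, not one taken from the article): an event that is correctly judged very unlikely in any single trial still becomes likely across many independent trials.

```python
# Illustration only: the per-trial probability is invented, not taken from the article.
# An event judged unlikely in any single trial becomes likely over many independent trials.

p = 0.002  # hypothetical probability of the bad event in one trial (e.g. one lab-year)

for n in (10, 100, 1000):
    at_least_once = 1 - (1 - p) ** n
    print(f"n = {n:4d} trials: P(at least one bad event) = {at_least_once:.3f}")
```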
Your question:
Answer from the post:
I think the emphasis is on the relationship with the EA community. You do not need to be an EA-dedicated consultancy team, but you should have some group dedicated to serving EA interests.
I believe this is what all consultancy firms do. They take care of their customer organisations by becoming familiar with their expectations, aspirations, and goals. (And it is easier if the people carrying out the work share the same aspirations as the organisations they serve, simply because they are likely to be more receptive.)
Here, the post is only asking new or existing consultancies to give some attention to the EA community.