David Thorstad

Assistant Professor of Philosophy @ Vanderbilt University
1813 karma · www.dthorstad.com

Bio

I primarily write academic papers and do outreach through my blog. I do try to post here when possible (and I always appreciate cross-posts!), but please do check dthorstad.com for my academic papers and reflectivealtruism.com for outreach.

Comments
120

Glad to be working with Kuhan!

There's not very much evidence for existential risk from biological causes. I've had a very hard time getting anyone to even tell me what they are concerned about, and when they do it is not very plausible.

Good question! I'm assuming just the .pdf, which is what Oxford University Press does for most academic books. But I will check and let you know if they're doing an epub.

Since the book is open access, I'm not sure if they secured the pdfs. If they didn't you could try a converter?

Ilan and Shakked Noy have a nice piece on one aspect of this, "The Short-Termism of ‘Hard’ Economics." It's forthcoming in Essays on Longtermism, hopefully within a month or two. They argue that preferences for methodological hardness in economics make it difficult to publish longtermist research within economics.

To be fair, economics as a field is widely regarded as one of the strongest and most successful in academia right now. There are definitely challenges, of which Bob and the Noys identify some, but I don't want to give the impression that academics are too down on economics as a field. They've been killing it since the mid-20th century.

I also don't think economists are crazy to think that their top journals are sometimes more rigorous than leading interdisciplinary journals, which can be a bit more headline-chasing and often don't give authors enough words to be fully rigorous.

Honestly I agree. Titotal’s work is fabulous (and I’m happy to use that adjective) but it’s usually not a good idea to put someone’s hackles up when you want to tell them they are wrong.

I downvoted this because of the skepticism of academic rigor and the weakness of the quantum physics analogy. 

Quantum physics is a paradigmatic example of a major theoretical advance developed by academics and accepted by academics after rigorous testing. The reason why these standards were applied is that there were (and still are) any number of theories in fundamental physics which turn out to be false, and it is important to use reasons and evidence to determine whether they are true. 

I am happy to see academics within the nascent AI safety space working towards more traditional and rigorous academic standards. These standards exist for a good reason and I have every expectation that they will continue to serve us well.

Please come back. I can't say that we agree on very much, but you are often a voice of reason and your voice will be missed.

Thanks Jonas! I really appreciate your constructive engagement. 

I’m not very sympathetic to pure time preference – in fact, most philosophers hate pure time preference and are leading the academic charge against it. I’m also not very sympathetic to person-affecting views, though many philosophers whom I respect are more sympathetic. I don’t rely on either of these views in my arguments, because I think they are false.


In general, my impression on the topic of advancing progress is that longtermists have not yet identified plausible interventions which could do this well enough to be worth funding. There has certainly been a lot written about the value of advancing progress, but not many projects have been carried out to actually do this. That tends to suggest to me that I should hold off on commenting about the value of advancing progress until we have more concrete proposals on the table. It also tends to suggest that advancing progress may be hard. 

I think that covers most of the content of Toby’s posts (advancing progress and temporal discounting). Perhaps one more thing that would be important to stress is that I would really like to see Oxford philosophers following the GPI model of putting out smaller numbers of rigorous academic papers rather than larger numbers of forum posts and online reports. It’s not very common for philosophers to engage with one another on internet forums – such writing is usually regarded as a form of public outreach. My first reaction to EA Forum posts by Toby and others would be that I’d like to see them written up as research papers. When ideas are written up online before they are published in academic venues, we tend to be skeptical that the research has actually been done or that it would hold up to scrutiny if it were.

On power-seeking, one of the most important features of academic papers is that they aim to tackle a single topic in depth, rather than a broad range of topics in less depth. So for example, my power-seeking paper does not engage with evolutionary arguments by Hendrycks and others because they are different arguments and would need to be addressed in a different paper.

To be honest, my selection of arguments to engage with is largely driven by the ability of those making the arguments to place them in top journals and conferences. Academics tend to be skeptical of arguments that have not cleared this bar, and there is little career value in addressing them. The reason why I addressed the singularity hypothesis and power-seeking is that both have had high-profile backing by quality authors (Bostrom, Chalmers) or in high-profile conferences (NeurIPS). I’d be happy to engage with evolutionary arguments when they clear this bar, but they’re a good ways off yet and I’m a bit skeptical that they will clear it. 

I don’t want to legislate a particular definition of power, because I don’t think there’s a good way to determine what that should be. My concern is with the argument from power-seeking. This argument needs to use a notion of power such that AI power-seeking would lead to existentially catastrophic human disempowerment. We can define power in ways that make power-seeking easier to establish, but this mostly passes the buck, since we will then need to argue why power-seeking in this sense would lead to existentially catastrophic human disempowerment.

One example of this kind of buck-passing is in the Turner et al. papers. For Turner and colleagues, power is (roughly) the ability to achieve valuable outcomes or to promote an agent’s goals. It’s not so hard to show that agents will tend to be power-seeking in this sense, but this also doesn’t imply that the results will be existentially catastrophic without further argument, and that argument will probably take us substantially beyond anything like the traditional argument from power-seeking. 

Likewise, we might, as you suggest, think about power-seeking as free-energy minimization and adopt a view on which most or all systems seek to minimize free energy. But precisely because this view takes free energy minimization to be common, it won’t put us very far on the way towards arguing that the results will be existentially catastrophic, nor even that they will involve human disempowerment (in the sense of failure to minimize free energy). 

I agree with you that it is unfair to conclude much from the failure of MIRI arguments. The discussion of the MIRI paper is cut from the version of the power-seeking paper that I submitted to a journal. Academic readers are likely to have had high credence that the MIRI argument was bad before reading it, so they will not update much on the failure of this argument. I included this argument in the extended version of the paper because I know that some effective altruists were moved by it, and also because I didn’t know of any other formal work that broke substantially from the Turner et al. framework. I think that the MIRI argument is not anywhere near as good as the Turner et al. argument. While I have many disagreements with Turner et al., I want to stress that their results are mathematically sophisticated and academically rigorous. The MIRI paper is neither.

I hope that helps! 

Thanks! 

My discussion of Alexander is based on a number of sources. One of the primary sources is the email that you mention, though I do not think it is accurate to say that this source forms almost the entire basis of my case.

I do not publish documents whose authenticity I have significant reason to doubt, and in this case I excluded several documents because I could not confirm their authenticity to my satisfaction. In the case of the email that you discuss, my reasons for posting the email are as follows.

First, to my knowledge Alexander has never denied authorship of the email. This would, presumably, be a natural first step for someone who did not write it. Second, I asked publicly for credible denials by anyone, not just Alexander, of the authenticity of the email and found none. Third, I wrote personally to Alexander and offered him the opportunity to comment on the authenticity of the email. Alexander declined.

I am surprised by your comments about disclaimers. My intention was to include enough disclaimers to make a tax lawyer blush. For example:

First, the quote which leads the article does include a disclaimer: it is cited as stemming from an "(Alleged) email to Topher Brennan".

Second, the preliminary notes to the article directly express my attitudes towards authenticity, and offer to publicly apologize if I am wrong in attributing views to Alexander or anyone else. (So far, no one has offered any evidence that I was wrong, or even attempted to offer any sort of denial, evidenced or not.) I wrote: "A note about authenticity. I have done my best to verify that the cited views in this post are the work of the authors they are attributed to. This is difficult because Alexander has alleged that some of the most troubling comments attributed to him are not his. My policy has been to stick with the sources whose authenticity I am most confident in, and to the best of my knowledge the authenticity of all passages quoted in this post has not been publicly disputed by their alleged authors. (Alexander declined a request for comment on the authenticity of passages quoted in this post). However, I am conscious of the possibility of error. While I have closed comments on all posts in this series, I am highly open to correction on any matters of fact, and if I am wrong in attributing any views to an individual I will correct them and publicly apologize. You can reach me at reflectivealtruismeditor@gmail.com."

Third, immediately before presenting Alexander's email I write: "I am not able to confirm the veracity of the email, so readers are asked to use their own judgment, but I am also not aware of credible attempts by anyone, including Alexander, to deny the veracity of this email (Alexander declined to comment on the veracity of the email. I did check for other credible attempts to deny its veracity.)."

These are not the only disclaimers included in the article, but I think that they are more than adequate and included in the proper places. 

You seem skeptical of some of the implications that I draw from this email. I quote the email in full, then draw six implications, giving evidence for each. If you are concerned about some of these implications, you are of course welcome to say which implications you are concerned about and give reasons to think that they do not follow. 

I do acknowledge that Alexander has pushed back against some of the worst excesses of race science, including neoreaction. For example, I wrote: "I would be remiss if I did not remind readers that not all of Alexander’s contributions to this space are bad. Alexander has penned one of the most successful and best-known criticisms of the excesses of the reactionary movement, in the form of his 2013 “Anti-reactionary FAQ”, which has inspired useful follow-ups from others. Alexander does write some posts supportive of the cause of social justice, such as his “Social justice for the highly demanding of rigor,” though the implication that rigor is not already present in discussions of social justice may not be appreciated by all readers. Alexander has drawn attention to the real and important phenomenon of sexual harassment perpetrated by women, or against men. And we will see below that Alexander calls out Richard Hanania for his strange obsession with office romance."

All of these behaviors are consistent with the conclusions that I draw in the article. I do not, and will not, say that Alexander has never done anything to push back against racism or sexism. That would be a bald-faced lie. It is possible to push back against some bad behaviors and beliefs on these fronts while engaging in other troubling behaviors and holding other troubling beliefs.

Thanks Toby!

Maybe biorisk or the time of perils could be good topics for a debate week?
