
Crossposted to LessWrong

While there have been many previous surveys asking about the chance of existential catastrophe from AI and/or about AI timelines, none that I'm aware of has asked how the level of AI risk varies with timelines. Yet this seems like an extremely important parameter for understanding the nature of AI risk and for prioritizing between interventions.

Contribute your forecasts below. I'll write up my forecast rationales in an answer and encourage others to do the same.

Conditional on AGI arriving by Jan 1 2025, will there be an AI-caused existential catastrophe? (forecasts on a 1%–99% scale)

Forecasts: nil (3%), Quintin Pope (7%), Conor Sullivan (20%), sbowman (25%), matthew.vandermerwe (28%), Kakili (50%), Richard Korzekwa (59%), Benjamin Rachbach (60%), elifland (65%), Ryan Greenblatt (73%), tamera (75%), iamthouthouarti (75%), Ash Jafari (75%), Jaime Sevilla (78%), Thomas Kwa (78%), Lukas Finnveden (80%), aog (82%), Vivek (84%), Quadratic Reciprocity (85%), peterbarnett (87%), Mckiev 🔸 (87%), Miranda_Zhang (90%), Will Aldred (90%), Morpheus (91%), HjalmarWijk (92%), joshuatanderson (92%), Habryka [Deactivated] (92%), Jeffrey Ladish (94%), kokotajlod (95%), VermillionStuka (95%), Jide (95%), MinusGix (95%), Daniel Kokotajlo (95%), Primer (95%), Self_Optimization (99%), simeon_c (99%), Sharmake Farah (99%), Sharmake (99%), Raphaël Lévy (99%), KyleGracey (99%)

Conditional on AGI arriving between Jan 1 2025 and Jan 1 2030, will there be an AI-caused existential catastrophe? (forecasts on a 1%–99% scale)

Forecasts: nil (3%), Quintin Pope (6%), Joel Becker (10%), weeatquince (12%), Katja_Grace (12%), Conor Sullivan (15%), sbowman (18%), matthew.vandermerwe (36%), Richard Korzekwa (39%), Lukas Finnveden (50%), Ryan Greenblatt (53%), Benjamin Rachbach (55%), iamthouthouarti (55%), Kakili (60%), elifland (60%), Thomas Kwa (63%), Ash Jafari (63%), tamera (67%), aog (68%), Quadratic Reciprocity (70%), Mckiev 🔸 (74%), Vivek (75%), Jaime Sevilla (75%), Ian McKenzie (79%), Miranda_Zhang (82%), Will Aldred (83%), MinusGix (85%), Morpheus (85%), HjalmarWijk (88%), Daniel Kokotajlo (90%), kokotajlod (90%), joshuatanderson (90%), Jide (90%), Habryka [Deactivated] (92%), Jeffrey Ladish (93%), KyleGracey (95%), peterbarnett (95%), Primer (96%), Self_Optimization (98%), Raphaël Lévy (99%), Sharmake (99%), Sharmake Farah (99%)

Conditional on AGI arriving between Jan 1 2030 and Jan 1 2040, will there be an AI-caused existential catastrophe? (forecasts on a 1%–99% scale)

Forecasts: nil (3%), Quintin Pope (5%), weeatquince (8%), Joel Becker (10%), Conor Sullivan (10%), Katja_Grace (11%), sbowman (15%), Kakili (26%), Richard Korzekwa (27%), iamthouthouarti (40%), Ryan Greenblatt (41%), aog (47%), Ash Jafari (50%), tamera (50%), Benjamin Rachbach (52%), Thomas Kwa (53%), elifland (55%), Quadratic Reciprocity (60%), Jaime Sevilla (60%), Morpheus (63%), Ian McKenzie (65%), Vivek (65%), MinusGix (65%), HjalmarWijk (70%), Miranda_Zhang (72%), Mckiev 🔸 (74%), peterbarnett (75%), Will Aldred (75%), Raphaël Lévy (75%), Sharmake Farah (75%), Jide (75%), Daniel Kokotajlo (75%), kokotajlod (75%), simeon_c (78%), KyleGracey (80%), joshuatanderson (81%), Sharmake (82%), VermillionStuka (85%), Jeffrey Ladish (87%), Habryka [Deactivated] (89%), Self_Optimization (90%), Primer (92%), Benjy Forstadt (96%)

Conditional on AGI arriving between Jan 1 2040 and Jan 1 2060, will there be an AI-caused existential catastrophe? (forecasts on a 1%–99% scale)

Forecasts: nil (2%), weeatquince (4%), Quintin Pope (4%), Katja_Grace (8%), Conor Sullivan (8%), Joel Becker (9%), Kakili (15%), Richard Korzekwa (18%), tamera (20%), Lukas Finnveden (25%), iamthouthouarti (25%), aog (25%), Peter Wildeford (26%), Ryan Greenblatt (30%), MinusGix (35%), sbowman (35%), peterbarnett (43%), Thomas Kwa (43%), joshuatanderson (44%), HjalmarWijk (45%), Morpheus (46%), Vivek (47%), Benjamin Rachbach (47%), kokotajlod (50%), Raphaël Lévy (50%), Miranda_Zhang (50%), elifland (50%), Daniel Kokotajlo (50%), Quadratic Reciprocity (50%), Jide (50%), Jaime Sevilla (53%), Sharmake Farah (58%), Ian McKenzie (59%), KyleGracey (60%), Sharmake (60%), VermillionStuka (65%), Will Aldred (67%), Self_Optimization (70%), Mckiev 🔸 (74%), Primer (84%), Habryka [Deactivated] (87%)

Conditional on AGI arriving after Jan 1 2060, will there be an AI-caused existential catastrophe? (forecasts on a 1%–99% scale)

Forecasts: nil (2%), Katja_Grace (4%), Quintin Pope (4%), weeatquince (5%), Conor Sullivan (5%), Kakili (9%), Joel Becker (9%), tamera (10%), iamthouthouarti (10%), Richard Korzekwa (14%), aog (15%), Lukas Finnveden (20%), MinusGix (20%), Sharmake Farah (21%), dgr (24%), HjalmarWijk (25%), Thomas Kwa (25%), Peter Wildeford (25%), Self_Optimization (25%), Ryan Greenblatt (28%), peterbarnett (34%), Vivek (34%), Morpheus (35%), Miranda_Zhang (36%), joshuatanderson (40%), Jaime Sevilla (40%), Sharmake (40%), Primer (41%), elifland (45%), Jide (45%), Benjamin Rachbach (45%), Quadratic Reciprocity (45%), sbowman (45%), Daniel Kokotajlo (50%), kokotajlod (50%), KyleGracey (50%), Raphaël Lévy (50%), Ian McKenzie (52%), Mckiev 🔸 (53%), Will Aldred (55%), VermillionStuka (60%), matthew.vandermerwe (63%)

Answers

Epistemic status: Exploratory

My overall chance of existential catastrophe from AI is ~50%.

My split of the worlds in which we succeed is something like:

  1. 10%: Technical alignment ends up not being that hard, i.e. common-sense safety efforts are enough to end up fine.
  2. 20%: We solve alignment mostly through hard technical work, without that much aid from governance/coordination/etc., and likely with a lot of aid from weaker AIs in aligning stronger AIs.
  3. 20%: We solve alignment through lots of hard technical work, but very strongly aided by governance/coordination/etc. to slow down and allow lots of time with systems that are useful to study and apply for aiding alignment, yet not so scary as to cause an existential catastrophe.

Timelines probably don't matter that much for (1); maybe shorter timelines hurt a little. Longer timelines probably help to some extent for (2) by buying time for technical work, though I'm not sure how much, since under certain assumptions longer timelines might mean less time with strong systems. One reason to think they matter for (2) is that they buy more time for AI safety field-building, but it's unclear to me exactly how this will play out. I'm unsure about the sign of extending timelines for the prospects of (3), given that we could end up in a more hostile regime for coordination if the actors leading the race weren't at all concerned about alignment. I guess I think it's slightly positive, given that longer timelines are probably associated with more warning shots.

So overall, I think timelines matter a fair bit but not an overwhelming amount, and I'd guess they matter most for (2). I'll now very roughly translate these intuitions into forecasts of the chance of an AI-caused existential catastrophe conditional on the arrival date of AGI (in parentheses, a rough forecast of the probability that AGI arrives during each period):

  1. Before 2025: 65% (1%)
  2. Between 2025 and 2030: 60% (8%)
  3. Between 2030 and 2040: 55% (28%)
  4. Between 2040 and 2060: 50% (25%)
  5. After 2060: 45% (38%)

Multiplying out and adding gives me 50.45% overall risk, consistent with my original guess of ~50% total risk.
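
For anyone who wants to reproduce the multiplication, here is a minimal Python sketch. It is my own restatement of the numbers given in the answer above, not code from the original post.

```python
# Overall risk = sum over timeline windows of
#   P(catastrophe | AGI arrives in window) * P(AGI arrives in window).

conditional_risk = {   # P(existential catastrophe | AGI arrives in this window)
    "before 2025": 0.65,
    "2025-2030": 0.60,
    "2030-2040": 0.55,
    "2040-2060": 0.50,
    "after 2060": 0.45,
}
arrival_prob = {       # rough P(AGI arrives in this window); sums to 1
    "before 2025": 0.01,
    "2025-2030": 0.08,
    "2030-2040": 0.28,
    "2040-2060": 0.25,
    "after 2060": 0.38,
}

# Sanity check: the success-world split above (10% + 20% + 20%) should be
# roughly the complement of the ~50% overall risk estimate.
assert abs((0.10 + 0.20 + 0.20) - (1 - 0.50)) < 1e-9

overall_risk = sum(conditional_risk[w] * arrival_prob[w] for w in conditional_risk)
print(round(overall_risk, 4))  # 0.5045, matching the 50.45% figure
```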

This seems to imply that misaligned AGI couldn't arrive without causing an existential catastrophe. Is that correct?

elifland
I didn't mean to imply that. I think we very likely need to solve alignment at some point to avoid existential catastrophe (since we need aligned powerful AIs to help us achieve our potential), but I'm not confident that the first misaligned AGI would be enough to cause this level of catastrophe (especially for relatively weak definitions of "AGI").

One of my friends and collaborators built this app, which aims to estimate the likelihood that we go extinct: https://xriskcalculator.vercel.app/

It might be useful!

Comments

FYI: You can view community median forecasts for each question at this link.

Oops, accidentally voted twice on this. Didn't occur to me that the LW and EAF versions were the same underlying poll.

What definition of AGI are you using?
