Rhetorical Consolidation of the X-Risk Debate
I am working on a research project as part of a course organised by BlueDot Impact. The submission deadline is the 19th of November, but I intend to continue working on the project after that.
I am trying to figure out whether there is a rational/argumentative asymmetry between the proponents and detractors of AI X-Risk (my working hypothesis is that there is one, in favor of the proponents). At the same time, I hope to build a collaboration hub and a reference for analysing the arguments on each side.
I have been putting together:
- A “model” of X-Risk built from four hypotheses. The X-risk conclusion requires all four to hold, so each added hypothesis makes the model stronger in terms of predictive force, but easier to falsify (see the sketch after this list)
- H1: ASI will be created soon (I will refine the definitions of “ASI” and “soon”)
- H2: ASI will be able to wipe out humanity
- H3: ASI will have incentives to wipe out humanity
- H4: We won’t find a solution to this by the time ASI is created
- Taxonomies of arguments that would falsify each hypothesis
- Taxonomies of arguments in support of each hypothesis
- Epistemic consolidations at each node of the taxonomies (an analysis of the arguments in favor, the rebuttals, and a simple verdict: debunked, unlikely, or likely)
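
To make the intended structure concrete, here is a minimal sketch of one way the model and its taxonomies could be represented. This is illustrative only: the names (`Verdict`, `Argument`, `Hypothesis`) are my working assumptions, not a final design.

```python
# A minimal sketch of the model's structure; all names are illustrative.
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Verdict(Enum):
    """Epistemic consolidation at a taxonomy node."""
    DEBUNKED = "debunked"
    UNLIKELY = "unlikely"
    LIKELY = "likely"


@dataclass
class Argument:
    """A node in a taxonomy: a claim, its rebuttals, and a verdict."""
    claim: str
    rebuttals: list["Argument"] = field(default_factory=list)
    verdict: Optional[Verdict] = None


@dataclass
class Hypothesis:
    """One of H1-H4, with taxonomies of supporting and falsifying arguments."""
    name: str
    statement: str
    supporting: list[Argument] = field(default_factory=list)
    falsifying: list[Argument] = field(default_factory=list)


# The model is the conjunction of the four hypotheses: the X-risk
# conclusion holds only if all of H1-H4 hold, so a decisive falsifying
# argument against any single hypothesis falsifies the whole model.
model = [
    Hypothesis("H1", "ASI will be created soon"),
    Hypothesis("H2", "ASI will be able to wipe out humanity"),
    Hypothesis("H3", "ASI will have incentives to wipe out humanity"),
    Hypothesis("H4", "We won't find a solution by the time ASI is created"),
]
```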
