[ Question ]

Share AI Safety Ideas: Both Crazy and Not. №2

by ank

Mar 31 · 1 min read · 2 answers

1

Tags: AI safety, Building effective altruism, Existential risk, Forecasting, Policy, Criticism of effective altruist organizations, Threads

A quick request: Let’s keep this space constructive—downvote only if there’s clear trolling or spam, and be supportive of half-baked ideas. The goal is to unlock creativity, not judge premature thoughts.

AI safety is one of the most critical issues of our time, and sometimes the most innovative ideas come from unorthodox or even "crazy" thinking. I’d love to hear bold, unconventional, half-baked or well-developed ideas for improving AI safety. You can also share ideas you heard from others.

Let’s throw out all the ideas—big and small—and see where we can take them together.

Feel free to share as many as you want! No idea is too wild, and this could be a great opportunity for collaborative development. We might just find the next breakthrough by exploring ideas we’ve been hesitant to share.

Looking forward to hearing your thoughts and ideas!

P.S. Your answer can potentially help people with their career choice, cause prioritization, building effective altruism, policy, and forecasting.

P.P.S. AIs are moving quickly, so we need new ideas to make them safe. You can compare the ideas here with the ones we had last month.

2 Answers, sorted by Top

ank

Mar 31, 2025*

A major problem with the setup of our forum: currently it's possible to go here, or to any post, and double-downvote all the new posts en masse (please don't! Just believe me, it's possible). Writers who spent months on their AI safety proposals will have their posts ruined (almost no one reads posts with a negative rating), and will probably abandon our cause and go become evil masterminds.

Solution: a UI proposal that solves the problem of demotivating writers. It helps teach writers how to improve their posts (so it makes all the posts better), keeps the downvote buttons, and increases the signal-to-noise ratio on the site, because both authors and readers will know why a post was downvoted:

It's important to ask a downvoter for a reason if the downvote will move the post below zero in karma. The author may have been writing for months; if it's important enough to downvote, the downvoter can spend an extra moment choosing among some popular reasons (we have a lot of space on the desktop, so we can show the most popular reasons as buttons like Spam, Bad Title, Many Tags, Type...) or choose a reason later on a special page. Otherwise the writer will have no clue, will rage-quit, and become Sam Altman instead.

More serious now: On desktop we have a lot of space and can show buttons like this: Spam (this can potentially be a big offense, bigger than 1 downvote), Typo (they probably shouldn't lower the post as much as a full downvote), Too Many Tags, basically the popular reasons people downvote. We can make those buttons appear when hovering on the downvote button or always. This way people still click once to downvote like before.
In particular, if a downvote pushes the post into negative karma, we show a bubble in a corner for 30 seconds that says something like: "Please choose one of these popular reasons people downvote, or hover here to type a word or two about why you double-downvoted. It'll help the writer improve."
Downvoters can hover over the downvote button; it'll hijack their typing cursor so they can quickly type a word or two about why they are downvoting and press Enter to downvote. Again, a very elegant UI.
If they just click downvote (you can still keep this button), show a balloon for 30 seconds where the downvoter can choose a popular reason for downvoting or, again, hover and instantly type a word or two and press Enter to downvote.
A page, "Leave feedback for your downvotes", where we show downvotes that ruined articles first, then double downvotes, then the rest. We can say it honestly in our UI: these are new authors, and your downvote most likely demotivated them because the article has negative karma now, so please write how they can improve. And we can maybe add that feedback for popular articles (with comments, from established writers, and/or with above-zero karma) is not as important, so you don't need to bother.
The phone UI is similar, but there is no hovering. So we can still show popular reasons to downvote on top, maybe as icons (Spam, Typo, etc.), plus a "Type and Downvote" button that looks like a typing cursor and a downvote icon. Tapping it brings up the keyboard automatically; you enter a word or two and press Return to downvote. So again, one tap to downvote like before in most cases. We just give downvoters more choices to express themselves.
As I said before, we need this at least when the downvote will ruin the post, that is, put it into negative karma territory. This way we prioritize teaching people how to write better and more, instead of scaring them into abandoning our website. Everybody wins.
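The trigger rule described above can be sketched in a few lines. This is only an illustration of the proposal, not how the forum works: the `Post` class, the `QUICK_REASONS` labels, and both function names are hypothetical, and the vote weights are assumed to be plain integers.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical quick-reason buttons, taken from the examples in the proposal.
QUICK_REASONS = ("Spam", "Typo", "Too Many Tags", "Bad Title")


@dataclass
class Post:
    karma: int


def needs_reason_prompt(karma_before: int, vote_weight: int) -> bool:
    """True when this downvote leaves the post below zero karma."""
    return karma_before - vote_weight < 0


def apply_downvote(post: Post, vote_weight: int = 1,
                   reason: Optional[str] = None) -> dict:
    """Apply the downvote right away (still one click), then report
    whether the UI should show the 30-second "why?" bubble."""
    show_bubble = needs_reason_prompt(post.karma, vote_weight) and reason is None
    post.karma -= vote_weight
    return {"karma": post.karma, "reason": reason,
            "show_reason_bubble": show_bubble}
```

The vote is never blocked in this sketch; the bubble is only requested when the post ends up negative and no reason was picked, which matches the "one click like before" goal.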

Thank you for reading and making the website work!

ank

Mar 31, 2025*

-1

The only complete and comprehensive solution that makes AIs 100% safe: in a nutshell, we need to at least lobby politicians to make GPU manufacturers (NVIDIA and others) create robust blacklists of bad AI models and update GPU firmware with them. That is not the full solution: please steelman it and read the rest to learn how to make it much safer and why it will work (NVIDIA and other GPU makers will want to do it because it'll double their business and all future cash flows. Governments will want it because it removes all AI threats from China, hackers, terrorists, and rogue states):

The elephant in the room: even if the current major AI companies align their AIs, there will still be hackers (who can create viruses with an agentic AI component to steal money), rogue states (which can decide to use AI agents to spread propaganda and to spy), and militaries (AI agents in drones and for hacking infrastructure). So we need to align the world, not just the models:
Imagine an agentic AI botnet starting to spread across user computers and GPUs. GPUs are like nukes waiting to be taken; they are not protected from running bad AI models at all. I call it the agentic explosion, and it's probably going to happen before the "intelligence-agency" explosion (intelligence on its own cannot explode; an LLM is a static geometric shape, a bunch of vectors, without GPUs). Right now we are hopelessly unprepared. We won't have time to create "agentic AI antiviruses".
We need to force GPU and OS providers to update their firmware and software to at least have robust, updatable blacklists of bad (agentic) AI models, and to have robust whitelists in case there are so many unaligned models that blacklists become useless.
We can force NVIDIA to replace agentic GPUs with non-agentic ones. Ideally, those non-agentic GPUs are like sandboxes that run an LLM internally and can only output text or images as safe output. They probably shouldn't connect to the Internet or use tools, or we should at least be able to limit that if we need to.
This way NVIDIA will have skin in the game and be directly responsible for the safety of the AI models that run on its GPUs.
It's the same way Apple feels responsible for the App Store and the apps in it, and doesn't let viruses happen.
NVIDIA will want it because, like the App Store, it can potentially take a 15-30% cut from OpenAI and other commercial models, while free models remain free (like the free apps in the App Store).
Replacing GPUs can double NVIDIA's business, so they may even lobby for these things themselves. All companies and CEOs want money and have obligations to shareholders to increase the company's market capitalization. We must make AI safety profitable. Companies that don't promote AI safety should go bankrupt or be outlawed.

Davidmanheim

Apr 1

2

In addition to the fundamental problem that we don't know how to tell if models are safe after release, much less in advance, blacklists for software, web sites, etc. historically have been easy to circumvent, for a variety of reasons, effectively all of which seem likely to apply here.
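One way to see the circumvention problem, as a toy sketch: assume (hypothetically, since the thread does not specify a design) that the blacklist identifies a model by the SHA-256 hash of its weights file. The `fingerprint` and `is_blocked` helpers and the stand-in byte string below are illustrative only.

```python
import hashlib


def fingerprint(weights: bytes) -> str:
    """Exact-match identifier: SHA-256 of the raw file contents."""
    return hashlib.sha256(weights).hexdigest()


# Stand-in for a model weights file (real files are gigabytes).
original = bytes(range(256)) * 4

# Hypothetical blacklist keyed on exact file hashes.
blacklist = {fingerprint(original)}


def is_blocked(weights: bytes) -> bool:
    return fingerprint(weights) in blacklist


# Flip the lowest bit of the final byte: same size, almost the same file,
# but a completely different hash, so the exact-match list no longer applies.
modified = original[:-1] + bytes([original[-1] ^ 1])
```

Here `is_blocked(original)` is true and `is_blocked(modified)` is false. Fuzzier matching would need its own heuristics, which is where the historical cat-and-mouse with blacklists comes from.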

ank

Apr 1

Yes, David, I should start recommending GPUs with internal antiviruses, because antiviruses have lists plus heuristics, and they work. We had better have protected GPUs right now, before things go bad. Even if we don't know how to make ideal, 100% perfect anti-bad-AI antiviruses at the GPU level, it's better than the 100% unprotected GPUs we have now. It'll deter some threats and slow down hackers and an agentic AI takeover. It can be a start that we build upon as we better understand the threats.

ank

Apr 1

Here is a drafty continuation you may find interesting (or not ;): In unreasonable times, the solution to the AI problem will sound unreasonable at first, even though it's probably the only reasonable and working solution. Imagine that in a year we have solved alignment, and even hackers and rogue states cannot unleash AI agents on us. How did we do it?

1. The most radical solution (unrealistic and undesirable) is international cooperation to destroy all the GPUs and never make them again. Basically, returning to the 1990s computer-wise: no 3D video games, but everything else is similar. But it's unrealistic and probably stifles innovation too much.

2. Less radical is keeping GPUs so people can have video games and simulations, but internationally outlawing all AI and replacing GPUs with ones that do not support AI at all. We may need a Mathematically Proven Secure GPU OS, so that GPUs are secure sandboxes more locked up than nuclear plants: they cannot run AI models, self-destruct in case of tampering, call the FBI, and need Web access to work. There are ways to make it extremely hard to run AIs on them, like having all GPUs compute only non-AI things remotely on NVIDIA servers, so computers are like thin clients and GPU compute is always rented from the cloud, etc. They can even burn and call the FBI if a person tries to run some AI on them (that's a joke). So it's like returning to 2020 computer-wise: no AI, but everything else the same.

3. Less radical still is to have whitelists of models right on the GPU. A GPU becomes a secure computer that only works if it's connected to the main server (it can be some international agency, not NVIDIA, because we want all GPU makers, not just NVIDIA, to be forced to make non-agentic GPUs). NVIDIA and other GPU providers approve models a bit like Apple approves apps in its App Store, or like Nintendo approves games for its Nintendo Switch. So no agentic models; we'll have non-agentic tool AIs that Max Tegmark recommends: they

Davidmanheim

Apr 2

This would benefit greatly from more in-depth technical discussion with people familiar with the technical, regulatory, and economic issues involved. It talks about a number of things that aren't actually viable as described, and makes a number of assertions that are implausible or false. That said, I think it's directionally correct about a lot of things.

-1

ank

Apr 2

Yes, the only realistic and planet-wide 100% safe solution is this: putting all the GPUs in a safe cloud (or clouds) controlled by international scientists, who only make math-proven safe AIs and only stream output to users. Each user can use their GPU for free from the cloud on any device (even a phone); when the user isn't using it, they can choose to earn money by letting others use it. You can do everything you do now, even buy or rent GPUs, all of them just will be cloud math-proven safe GPUs instead of physical. Because GPUs are nukes, and we want either no nukes or to put them deep underground in one place where they can be controlled by international scientists. We still haven't 100% solved computer viruses (my mom had an Android virus recently); even the iPhone and Nintendo Switch got jailbroken almost instantly, and there are companies that jailbreak iPhones as a service. I think Google Docs never got jailbroken or majorly hacked. It's a cloud service, so we need to base our AI and GPU security on this best example: we need to have all our GPUs in a cloud controlled by international scientists. Otherwise any hacker can write a virus (just to steal money) with an AI agent component and grab consumer GPUs like cupcakes. The AI agent can even become autonomous (and we know they become evil in major ways, wanting to have a tea party with Stalin and Hitler, if given an evil goal; there was a recent paper). Will anyone align AIs for hackers, or will hackers themselves do it perfectly (they won't), to make an AI agent that just steals money but stays a slave and does nothing else bad?

Davidmanheim

Apr 4

2

Yeah, you should talk to someone who knows more about security than I do, but as a couple of starting points:

"math-proven safe AIs"

This is not a thing, and likely cannot be a thing. You can't prove an AI system isn't malign, and work that sounds like it says this is actually doing something very different.

"You can do everything you do now, even buy or rent GPUs, all of them just will be cloud math-proven safe GPUs"

You can't know that a given matrix multiplication won't be for an AI system. It's the same operation, so if you can buy or rent GPU time, how would it know what you are doing?
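A toy illustration of the "same operation" point, in plain Python so it is self-contained. The variable names are just labels chosen for this sketch; the arithmetic is the only thing the hardware would see. It says nothing about what other signals a cloud provider might or might not have.

```python
def matmul(a, b):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


# "Graphics" reading: rotate two 2D points (stored as columns) by 90 degrees.
rotation = [[0.0, -1.0],
            [1.0, 0.0]]
points = [[1.0, 0.0],
          [0.0, 2.0]]
rotated = matmul(rotation, points)

# "AI" reading: the very same numbers as a tiny linear layer applied to a
# batch of two input vectors. Only the labels changed.
layer_weights = rotation
input_batch = points
activations = matmul(layer_weights, input_batch)
```

Both calls run identical arithmetic and return identical results; nothing in the computation itself marks one as graphics and the other as a neural network.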

-3

ank

Apr 4

Thank you for your interest, David! Math-proven safe AIs are possible; our group has just achieved it (our researcher writes under a pseudonym for safety reasons, please ignore it): https://x.com/MelonUsks/status/1907929710027567542 Why is it math-proven safe? Because it's fully static: an LLM by itself is a giant static geometric shape in a file, and only GPUs make it non-static and agentic. It's called place AI, and it's a type of tool AI. To address your second question, there is a way to know whether a given matrix multiplication is for AI or not. In the cloud we'll have a math-proven safe AI model inside each math-proven safe GPU: GPU hardware will be remade into an isolated unit that just outputs images, text, etc. Each GPU is an isolated math-proven safe computer whose sole purpose is safety and hardware-plus-firmware isolation of the AI model from the outside world. But the main priority is putting all the GPUs in clouds controlled by international scientists; they'll figure out the small details that are left to resolve. Almost all current GPUs (especially consumer ones) are 100% unprotected from the imminent AI agent botnet (think a computer virus, but much worse), and we can't switch off the whole Internet. Please refer to the link above for further information. Thank you for this conversation!

Davidmanheim

Apr 5

You very much do not know what you are talking about, as that linked "explanation" makes clear. I'm not sure if you're honestly confused or intentionally wasting people's time, but either way, you should spend an hour or two asking 4o to explain in detail why an expert in AI would object to this, and think about the answers.

ank

Apr 7

Interesting, David. I understand: if I had found a post by Melon with flaws in it, I would have been unhappy too. But I found that this whole forum topic is about ideas both crazy and not, preliminary ideas, about steelmanning each other, and about little things that may just work. Not about dismissing ideas based on flaws; of course there will be a lot of those. We live in unreasonable times, and the AI solution quite likely will look utterly unreasonable at first. Chatting with a bot is a good idea, actually; I'll pass it on. You never said what the flaw is, though) You just bashed and made it personal (you did the thing we agreed not to do here; it's preliminary and crazy ideas here, remember?). Thank you for your open mind and your time.
