
Craig Green 🔸

Software Developer
27 karma · Joined · Working (6-15 years)

Comments (10)

I do think the models are the foundation of capability, and I have overstated my case, as I tend to do. What I want to say is that model intelligence has largely scaled steadily, and that when a new application is developed (made possible by sufficient model advances), there is a sudden increase in the capability consumers experience, which feels like a giant leap in model development. That flood of new ability can be attributed to the application inasmuch as it opened the floodgates, but of course the model is the thing functioning under the hood. To the point about hypey discourse, I guess I'm just griping about the tendency to let this optical illusion influence people's tone and assessment of progress.

It is hard to tell about the AISLE and Anthropic situation because of the very different sizes of the organizations and the lack of insider knowledge about either of them. To me, the requirement that AISLE replicate Anthropic's findings in whole or in part feels like an unnecessary bar for justifying their claims. The way I take it is that AISLE's activity has shown that, with a proper system, it is already possible with publicly available models to do the sort of bug-detection work that made headlines with the Mythos release. That is not to deny that Mythos + system is an improvement over AISLE's work. Assessing the nature of that improvement is hard for the aforementioned reasons about differences in org scale and the general complexity of the thing being compared. It seems all parties agree that Mythos is a big step up in its ability to write exploits. I see no reason to challenge that.

I think it's very hard to articulate critiques of hype, and at the same time I tend to write in an over-vehement and pugnacious way that leaves me quite vulnerable to the very arguments I would make against someone else, so I somewhat regret my engagement here. Still, I do think it's true that there is a sort of ineffable tendency to amplify what feel to me like reductive claims about model capabilities and about how AI systems are engineered.

I took the OP as trying to establish that the signal on progress toward AGI is quite noisy, and as expressing frustration with narratives about progress that feel too clean or reductive. That's highly subjective, though. As you note, we probably can't even agree on a definition of what constitutes significant progress, though I suspect we could come to largely agree about the amount of progress made, just not about what word to use to describe it.

I do think a fair test of my viewpoint will be whether, in one year's time, we see a proliferation of products and services that run this sort of deep bug-finding pipeline. My intuition is that cybersecurity is going to go through something similar to what software engineering did last year, driven by the rising tide of model quality in conjunction with a more acute set of innovations at the application layer.

[Edit: I don't think my prediction actually proves anything, since its coming to pass could reflect many different underlying causes.]

So, a couple of things to note. 

  • AISLE has been operating their agentic system for about six months, I think, and they have themselves found numerous vulnerabilities in highly vetted software of basically the same flavor as those in the Anthropic announcement. They are not cranks on this topic. See this post for an example.
  • I think you are misunderstanding the purpose of the specific exercise and the broader claims in the AISLE article. The point of the examples on the isolated code snippets is to show that models of various sizes and architectures are quite capable of discovering the bugs. Indeed, model size and architecture seemingly have a complicated relationship to the ability to recognize bugs of various types. The article does not attempt to demonstrate how they go about the larger task of exploring and partitioning a codebase for this sort of narrow work, but if you read their other posts, you will see that is exactly the sort of product they have built, and the sort that other clever AI-app developers will probably be producing in the near future.
  • As to why they didn't just find all of the thousands of unpublished bugs themselves, I think you should consider the following:
    • Anthropic has a huge amount of resources at their disposal. Project Glasswing is providing free compute to partners to the tune of $100,000,000. Per Carl Brown, HackerOne's bug bounty program paid out about $80,000,000 in total last year.
    • Per Anthropic's writeup, they spent $20,000 in compute to discover the OpenBSD bugs.
    • Even without making the obvious inference that Anthropic has spent far more on this endeavor than the figure above, we can see that these are not costs that AISLE or many other companies could afford for just any arbitrary reason.
  • The claim is not that Anthropic is lying in some simplistic fashion. It is that the interpretations this announcement generated are significantly and predictably reductive, and they serve to hype the company up at the expense of the truth.

I'll try to restate my broader theory, which I think is largely aligned with the OP and with the AISLE article, since it seems the point of view of those sources is still not being understood.


1. AI-application design (the harnesses/scaffolds referred to in the articles) is extremely important to the capability of an AI system. A well-designed harness can elicit capabilities from a relatively less intelligent model that elude more intelligent models without one.

Some examples—

  1. It was with the advent of ChatGPT and the underlying helpful assistant post-training that AI exploded into consumer use in the first place. The critical development was at the application layer. Model intelligence had been (to my understanding) steadily advancing up until that point and beyond it.
  2. Claude Code (and Cursor to a significant extent before it) pioneered the coding agent harness, which has massively expanded the utility of LLMs for economically productive work. Throughout the period leading up to agentic coding and beyond it, model intelligence steadily advanced; however, the critical difference occurred with the development of the application.

We saw something similar with OpenClaw several months ago, and likewise, AISLE is the first bug-finding application using LLMs that is known to me (though I'm sure there are others that emerged around the same time, and for that matter, Cursor even has something like it in their development platform).

Please note some things about the above:

  1. Model intelligence steadily advanced throughout these periods. The paradigm shift in each case was at the application layer.
  2. In each of the cases listed above, it only took a short while for competitors to replicate the application design.

2. Application design can often mislead users: what is actually a well-designed narrow loop coaxing intended behavior out of an LLM gets mistaken for more general intelligence.

Again, consider Claude Code and its superlative success compared to LLM use in other kinds of economic activity, such as data analysis or financial analysis. Computer code has several advantages over those other types of work:

  1. It has to compile.
  2. It is possible to write arbitrary automated tests to verify and explore the functionality of computer code.

Creating a harness that leverages these features was a brilliant innovation, but the domain of coding is far closer to that of a chess game than to many other types of knowledge work. It was a more tractable problem for various reasons, which then created the illusion of generalized capabilities that have thus far not manifested in the broader economy.
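To make that concrete, here is a minimal sketch of the kind of verify-loop I have in mind, leaning on the two cheap oracles coding offers (the compiler and the test suite). This is my own illustration, not anyone's actual product: `generate_candidate` is a hypothetical stand-in for an LLM call, and the `make build` / `make test` commands are placeholder assumptions for whatever build system a project uses.

```python
# Sketch of a verify-loop harness: generate a candidate change, check it against
# the compiler and the test suite, and feed failures back to the model.
# All names here are illustrative placeholders, not a specific vendor's API.
import subprocess
from pathlib import Path

def generate_candidate(task: str, feedback: str) -> str:
    """Hypothetical stand-in for an LLM call that returns a full candidate source file."""
    raise NotImplementedError("swap in whatever model client you actually use")

def harness(task: str, target: Path, max_attempts: int = 5) -> bool:
    feedback = ""
    for _ in range(max_attempts):
        target.write_text(generate_candidate(task, feedback))
        build = subprocess.run(["make", "build"], capture_output=True, text=True)
        if build.returncode != 0:
            feedback = build.stderr              # oracle 1: it has to compile
            continue
        tests = subprocess.run(["make", "test"], capture_output=True, text=True)
        if tests.returncode == 0:
            return True                          # oracle 2: the tests pass
        feedback = tests.stdout + tests.stderr
    return False
```

Most other knowledge-work domains (data analysis, financial analysis) have no equivalent of those two cheap, automatic pass/fail signals, which is a big part of why the coding loop looks so much smarter than it is.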

My claim, to be clear, is that:

  1. Mythos is almost certainly going to represent an advance over current public model capabilities.
  2. The bug discoveries are likely better explained by:
    1. The scaffold Anthropic built to deploy the model for this task
    2. The amount of compute they threw at it
  3. The model advancements themselves are only part of the story, and not the largest part either.

My prediction is that we will have another DeepSeek moment in the near future, where someone successfully builds an open-source scaffold that does something like what Mythos and AISLE are doing, and then it's off to the races as far as cybersecurity goes.

That is on the one hand quite scary but, as the OP said, "I suspect that it may be possible to develop machine learning models that can automatically perform any task with known solutions without this implying any superintelligence takeoff."

Thank you for writing this. I do not agree with either of the criticisms expressed in the other comments. It is clear to me from the title of the article that the point is that more skepticism is warranted towards materials published by the major AI laboratories, and the article justifies this by outlining data that is problematic for a naïve reading of major-lab press publications.

I do not agree with dismissing the write-up by AISLE. They have been publicly doing this work and writing about it for some time, and in the write-up they are hardly baselessly critical of Anthropic. Their fundamental point, which is backed up by their own results in the article and in their other writings, is that the success of models at cybersecurity tasks is largely the result of a larger apparatus around the models. We see the same thing with agentic coding, where the harness matters as much to the actual utility as the specific model does.

On the financial side, I agree that EAs should take a more critical stance regarding the financial circumstances of the major AI labs. These labs are racing to IPO. The underlying economics of the AI industry are well known to be problematic. You don't have to go full Zitron to see that the financial picture is more complicated than can be inferred by just charting Anthropic's reported ARR growth.

I work with AI every day as a software engineer. I'm not some sort of Luddite, but precisely because of my experience as a consumer of the technology, it is impossible not to notice the marketing hype cycle that has come to engulf the industry. Probably the dominant category of ads I personally see on Facebook now is coding harnesses from OpenAI and Anthropic. Anyone who peruses the relevant subreddits is used to seeing a flood of astroturfed threads intended to sway readers' loyalties as customers from one to the other. These companies are spending incredible sums of money to market their products, and that should inform how we approach claims made by company figureheads. I still recall the way my stomach churned about a year ago now, maybe a month after the release of deep research, when Sam Altman, having been asked what he does in his free time, responded by saying that of course he doesn't have any free time, but if he did, he would spend it all day reading deep research reports, or something to that effect. For me, that moment broke the fourth wall. He was obviously being disingenuous, and so how was I to interpret everything else he had said, which I had been happily nodding along to up until that point?

Doubtless many examples could be added to the OP, but I will satisfy myself with just one. One of the earliest sources of information about Mythos was actually the Claude Code source leak, and one thing we learned from that leak is that the quality of code being generated internally at Anthropic is incredibly low. It is not difficult to find numerous reviews of the Claude Code source tearing it apart for the low quality of craftsmanship and the bugginess of the code therein (links here, here, commentary on the former here). How does that update your priors on the idea that Mythos is a huge leap forward in terms of cybersecurity capabilities? Doubtless there is some sort of way to harmonize the two—and to be clear, I do expect Mythos to be an improvement—but is it possible that current model capabilities are being overstated by an organization pumping itself before an IPO?

None of this is to say that we shouldn't be concerned about AGI. Nor is the point of the OP, as I read it, that we shouldn't take AGI seriously. It is that it is aggravating to see so many people in EA circles uncritically accept and repeat claims by major AI labs that seem quite dubious. I actually don't see why skepticism of major laboratory pronouncements should have any bearing on our stance on x-risk and AGI. The two issues are connected only insofar as such skepticism should cause us to distrust said labs and be more willing to do our own homework. Furthermore, I'm not saying that model capabilities aren't advanced either—I barely ever write code by hand nowadays. Again, I took the point of the OP's article, and I agree with it, to be that statements by major labs about model capabilities should not be taken as straightforward recitations of objective truth. They are embedded in a highly competitive context, with vast sums of money and huge numbers of users at stake, and they are intended to influence that context, including, yes, by scaring people into buying a subscription. The OP is attempting to help others see this possibility by providing additional data and argumentation that would be hard to account for if things were as straightforward as major-lab publications suggest.

My initial exposure to effective altruism was a more philosophically oriented forum post on longtermism. At the time, I remember thinking that (a) the philosophical claims being made didn't seem nearly as exciting or novel as the rhetoric in the article made them out to be, and (b) it came across as techno-futurist gobbledygook. Back then, I thought it was just some dude's blog post, and didn't understand that it was a social movement, a community of people trying to do good, somewhere I could go to get advice and support in achieving those goals.

A few years later, unfortunate developments in my church community pushed me to start looking for something else, something more ethically serious than the church I had been a part of, and I found GiveWell basically from first principles—I wanted to give my money somewhere that would use it effectively to mitigate the suffering of real people in dire circumstances. I also, along with a lot of people in the broader tech industry, became much more cognizant of AI developments, so when I returned to this forum a second time, it all made sense, and I could see what I had been missing. The longtermism thing no longer seemed so strange, and I was better able to appreciate the challenge of applying these simple ethical principles to the real world.

If I'm hard on the movement sociologically, it's because I think the communal element is important. I'm a human being. How am I going to maintain the resolve to do good, to avoid lifestyle creep, to reduce animal-product consumption, etc., if I don't have other people to show me the way, to encourage me, and to mentor me? These ethical principles are largely straightforward, and available to all rational people to embrace. But is there an alternative society of like-minded individuals waiting to embrace everyman?

I agree that thinking of these donations in terms of offsetting is not right. Your ability to donate to animal welfare is basically unrelated to your ability to stop consuming animal products, and doing one does not affect your ethical obligation to do the other, as you said.

What I would encourage, and do think is right, is to consider how you can do the most good, and donating to animal welfare is a highly effective way to do that. Therefore, it seems incumbent upon both vegans and non-vegans that they donate. Being vegan does not free you from the obligation to donate any more than donating frees you from the obligation to be vegan.

I say this as a non-vegan. I am highly interested in veganism, but do not feel like I can really handle the transition in my current phase of life. But I have resolved to donate, not as an offset, but because I care about animals, and I feel obligated to do the most good I can. I also strive to reduce the amount of animal products I consume, and I try to seek out more humane sources for those I do use.

Doubtless, a vegan looking at my life might question whether the complication is really worth it. I certainly have a guilty conscience and feel empathy for the animals whose suffering I am causing. Am I trying to 'offset' those feelings by doing what I can? Certainly I am, to some degree. Whether that is good or bad seems like a personal question, but I think all EAs would agree that, regardless of one's personal moral imbrication in another's suffering, the goal should be to do as much good as possible.

On the single-sentence qualitative feedback, I do think this would be very helpful. A simple, direct statement such as "Not qualified due to lacking x/y/z core requirement," versus simply being a weak but not fundamentally flawed candidate, would go a long way.

Right now, everyone who isn't hired is passed over in favor of stronger applicants. Obviously. I want to know whether my application was even read, to be honest. As a mid-career person trying to transition, I have this growing cynicism that many EA orgs are simply going to filter me based on my age and the fact that I have not worked at some elite firm or gone to a prestigious university. And that's fine actually I guess, but it would be helpful to get direct feedback to let me know whether I'm wasting my time applying in the first place.

I can just earn to give and do my own thing; it won't hurt my feelings if I'm excluded from the clique.

IMO, you need to factor in the timeline on which you think AI safety is critical. While dentists may earn more, you are forgoing four years of income before you even start earning. You need to determine the rough break-even point at which you would cumulatively have earned more as a dentist, and therefore been able to give more.

If the critical phase of AI safety research precedes that date or is near it, then you may, ironically, contribute less in terms of the marginal value of your giving.
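To show the shape of that break-even logic, here is a toy calculation. Every figure in it (salaries, program length, schooling costs) is a made-up assumption purely for illustration; plug in your own numbers:

```python
# Toy break-even comparison: start earning now vs. train first, earn more later.
# All salary and cost figures below are illustrative assumptions, not real data.
def cumulative_earnings(years: int, salary: float,
                        years_in_school: int = 0, school_cost_per_year: float = 0) -> float:
    """Net earnings after `years`, with the first `years_in_school` spent paying tuition."""
    earning_years = max(0, years - years_in_school)
    return earning_years * salary - years_in_school * school_cost_per_year

for year in range(1, 31):
    software = cumulative_earnings(year, salary=120_000)          # start earning immediately
    dentist = cumulative_earnings(year, salary=220_000,
                                  years_in_school=4, school_cost_per_year=60_000)
    if dentist > software:
        print(f"With these made-up numbers, dentistry only pulls ahead around year {year}.")
        break
```

With those particular invented numbers the crossover lands more than a decade out; the question is whether the window in which donations to AI safety matter most closes before that crossover.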

Here's how I see things—

1. If AI advances so quickly that the earning power of CS collapses before you graduate, it is likely that the same will happen to dentists before you graduate from that program. But maybe the latter isn't true, in which case it could be reasonable to pursue dentistry.
2. If AI advances at a moderate pace, the breakeven logic I mentioned above probably means that you will have more impact by getting a moderately well-paying job sooner so that you can give during the critical period of AI development, since your giving would be largely deferred until after AGI if you went into dentistry.
3. If AI advances at a slow pace, then perhaps going into dentistry will ultimately allow you to contribute more.

One possibility you didn't mention, probably because it is unappealing to you—could you just major in dentistry? Then you would get the earning power and reduce the breakeven problem.

You are young. If I were you, out of these two options, I would just major in what I was interested in, and test out my talents. I would major in CS. If I performed exceptionally in AI safety stuff, I would try my hand at a direct career in it. If I didn't, I would focus on getting some other software job with high earning potential.

You are certainly correct that earning-to-give is the rational move when you consider that the constraints are often on resources to fund our goals, rather than on candidates willing to work on them professionally.

Orwell's great. Sometimes cryptic communication is a useful means of conveying to an in-group something that you want to hide from the wider audience. For example, a common interpretation of Jesus's parables is that they cryptically expressed political ideas that it would have been unacceptable for him to state outright. He always had plausible deniability as to their meaning, which was nonetheless obvious to his hearers. I'm not really sure what the context is on this board that would require something like that, though. Are the EAs liable to call together the council of moderators in the middle of the night and shadow-ban someone for wrongthink?

This particular metaphor really resonated with me for whatever reason.

I'm trying to career switch. I have small children in the family to care for. My current role is very demanding. I have pretty limited resources to put towards job hunting right now. I did not go to a top college. I'm not an elite applicant, though I've done well for myself in my circumstances, and a lot of my failure to do better is due to prioritizing volunteer and other work. 

To put it crassly, if EA orgs can fully satisfy their staffing needs using recent, EA-aligned graduates of elite colleges, there is no point in me even applying.

The way it feels (when I'm feeling down) is that EA is not really intended for someone like me. The jobs are not there, and while I believe in and practice earning to give, you sometimes get the impression reading the boards that if you aren't a high enough earner, maybe even that isn't really worthwhile, since in an objective sense, it isn't high impact.

And that's fine. Maybe EA can get all it needs from those talent pools, and maybe the urgency of the moment is such that even the money I can give is not that important. Obviously, it's feasible that's the case. But then, I'd like to know that, you know?

I do think some sort of moral-weights quizlet thing could be helpful for people to get to know their own values a bit better. GiveWell's models already do this, but only for a narrow range of philanthropic endeavors relative to the OP (and theirs are actual weights for a model, not a pedagogical tool). To be clear, I do not think this would be very rigorous. As others have noted, the various areas differ in how speculative their proposed effects are and in how complete their cost evaluations are. But it might help would-be donors to at least start thinking through their values, and, based on their interests, it could then point them to the appropriate authorities.

As others have noted, I feel existing chatbots are pretty sufficient for simple search purposes (I found GiveWell through ChatGPT), and on the other hand, existing literature is probably better than any sort of fine-tuned LLM, IMO.

I have no idea what someone in this income group would do. If I were in that class, being the respecter of expertise that I am, I would not be looking for a chatbot or a quizlet; I would seek out expert advice. So perhaps it is better to focus on getting these hypothetical expert advisors more visibility?
