I think they're now charging it as extra usage if you use a custom prompt.
On the Claude site they added the option to buy extra usage at 30% off if you buy $1,000 or more at a time, so it's still somewhat cheaper to use OpenClaw with a Claude account than with an API key.
(Incidentally the 30% off might mean that choosing a Pro plan + extra usage versus Max plan might make sense for more people)
I've complained, extensively, about this before but Anthropic really needs to make it clear what is and is not supported with or without a subscription. Until then, it's hard to know where you stand with using their products.
I say all of this as someone who doesn't use OpenClaw or any Claw-like product currently. I just want to know what I can and can't do and currently it's impossible to know.
I have no trouble believing that all labs are trying really hard to come up with an enticing bundle that works for a wide variety of users, but it's hard to anticipate the popularity of something like OpenClaw, which completely blows through all previous usage patterns at the population level.
It seems like a tall order to set lasting rules in this space at this point, where nobody really understands what is going to happen in a few weeks.
They really need to figure out the rules. Look, I'd love to use a custom harness with Claude Code that I can extend, or build my own (which I'm doing) and use it with my Claude Code license; I don't want to overspend on tokens if I can help it. They really need to set their bar for the next model releases to use fewer tokens, or to trim their own cost for how these models are run. I'd be okay with a slightly slower experience with Claude Code if it meant similar throughput but less cost, especially if I can build my own harness for it.
I don’t get why people are so surprised. Didn’t they learn anything from the Twitter APIs and the like? The APIs are open as long as they serve the short-term problem; then Anthropic builds the features people actually use (more or less) and bans the use of the APIs by competing clients.
I think a good corollary idea to "vibe coding" is the "vibe product". There is so much stuff popping in and out of existence and my excitement has declined.
The poor communication and flip-flopping are what concern me.
How can I buy into an ecosystem that might disallow one of my main workflows? I currently use several hook scripts to route specific work to different models. Will they disallow that at some point? We don't know because they can't get their story straight.
Given the lack of clear communication and the fact that their primary competitor openly supports the use of bespoke harnesses, I highly doubt this is an incorrect announcement.
Anthropic is destroying goodwill that is hard-won in this space. At the end of the day, people just need to do their work in a way that makes sense for them. In my case (someone who has been building ML/AI tools for 25 years @ MS & Apple), I have much better results using my bespoke harness. If I'm paying $200/month for compute, I should be able to use it in a way that works for me. Given the push back, I'm not alone.
It's saying something about the announcement, it's not saying something about the correctness of the announcement.
I used the word hearsay to imply that flip-flopping should only be a judgement on the comms of the entity accused of flip-flopping, not information living on some third party source.
Same building on their API. You design around what you think is allowed, then a blog post shifts everything. A proper developer policy page would fix this.
Stealing OAuth keys from a first-party app to impersonate it, in order to avoid paying for usage with a properly generated API key, was never part of normal use anywhere.
Yeah, the main point here is they had a CLI specifically that allowed you to call Claude, and that was being used. The CLI giving you access should kind of indicate that you should be able to use it as it is defined in the help.
I do agree, though, that the parts of this that were actually using the Claude system to generate OAuth keys themselves are a little sus.
That makes sense to say “must use Claude harness to login before calling Claude cli or using Claude code sdk”
> Anthropic staff told us OpenClaw-style Claude CLI usage is allowed again
Anthropic staff have made contradictory statements on Twitter and have corrected each other. Their attempts at clarification have led to confusion.
> OpenClaw treats Claude CLI reuse and claude -p usage as sanctioned for this integration unless Anthropic publishes a new policy.
Oh cool, so everything is back to business now, until they all of a sudden update their policy tomorrow that retracts everything.
Anthropic have proved themselves to be unreliable when it comes to CC. Switching to other providers is the best way to go, if you want to keep your insanity.
It's the PayPal model of customer service: they'll ban you at any time for any reason or none at all, but if you're very nice they might be willing to have a human look at that decision at some point, but probably not.
Straight up, PayPal and Venmo can go to hell. They banned my account because I painted a gun for a guy with a Cerakote bake-on setup. He paid with PayPal and they banned me. It was his gun, and painting it was just a legal service.
I pulled our company portal away from PayPal when they refused to restore my account.
Five years later, I tried to re-activate and the human I finally got to effectively told me to fuck off; so I will spend the rest of my life bashing that trash company at every chance.
They had this on here since day 1 of the block. This is just Openclaw saying "if you run Openclaw inside Claude Code, it's compliant with the Anthropic ToS", because, well, it's literally running inside Claude Code.
What's not allowed is grabbing the oauth tokens and using these for your own custom agent, which is what was (and still is) banned.
Nothing has changed, this appears to just be a giant misunderstanding (and probably a poor choice of words from Openclaw).
But there was a period of time when using Openclaw via Claude Code (via -p) was not allowed and it even gave an error message in that case. That's why people find the constantly changing messaging confusing.
The most recent Anthropic announcement was not that people would be banned for using subscriptions with OpenClaw, but that it would be charged as extra usage. I think the reason this was changed three days after that announcement is that charging for extra usage meant people would no longer be banned for using their subscription OAuth tokens directly against the Anthropic API with a third-party harness, as they had been before. Instead, both that usage and the more recent claude -p usage would be charged as extra usage.
I don't see anything on this page that claims something different from that, or that addresses that claim at all.
> Switching to other providers is the best way to go, if you want to keep your insanity
I remember when I’d periodically rage quit from Uber One to Lyft Pink and back again every time I had a terrible customer-service experience. In the end, I realized picking a demon and getting familiar with its quirks was the better way to go.
I’m currently sticking with Claude, in part because I’m not exposed to this nonsense due to OpenClaw, in larger part because of the Hegseth-Altman DoD nonsense. More broadly, however, I’m not sure if any of Google, Anthropic or OpenAI are coming across as stars in AI communication and customer service.
With Uber and Lyft or even Anthropic versus OpenAI versus <insert flavor of the month here> I don’t even try to attach myself to any one brand.
It’s so easy to switch between all of them. I can open the Uber and Lyft apps and compare in a minute. I can run Claude and ChatGPT in parallel and see which one gets a better handle on the question. I can switch LLM providers with a few minutes of signing up for one and cancelling the other.
They all try to encourage brand lock in but it’s easy to pick up and move if you’re using them for their main service.
You know how a bunch of IT people are trying to "escape the permanent underclass"? Well, it seems like anyone building their tools on cloud providers is doing the opposite. They're willingly becoming the underclass in hopes it trickles down.
> until they all of a sudden update their policy tomorrow that retracts everything.
Oh no. They won't update the policy. Boris or Thariq will casually mention in a random off-hand comment on Twitter that this is banned now, and then will gaslight everyone that this has always been the case.
Looks like this was restored 2 weeks ago[0], 3 days after Anthropic said OpenClaw requires extra usage[1]. At this point, it's hard to take this seriously. No official statement and not even a tweet?
No, it's just that it's confusing, because there are two ways of using Claude Code credentials:
1. Take the oauth credentials and roll your own agent -- this is NOT allowed
2. Run your agentic application directly in Claude Code -- this IS allowed
When OpenClaw says "OpenClaw-style Claude CLI usage", it means literally running OpenClaw in an official Claude Code session. Anthropic has no problem with this; it is compliant with their ToS.
When you use Claude Code's oauth credentials outside of the claude code cli Anthropic will charge you extra usage (API pricing) within your existing subscription.
But... even when running it in mode 2 ("claude -p"), they at certain points tried to detect OpenClaw usage based on the prompts made, and blocked it [0]. Now OpenClaw says that Anthropic sanctions this as allowable again.
I agree with GP that this is hard to take seriously.
I mean, if you are them and trying to detect when people are using your system incorrectly the detection system is going to be a little bit flaky. How do they prove you aren't violating your ToS by using OAuth for a system they didn't approve that usage for?
The fault here is not with Anthropic. It lies with cowboy coders creating a system that violates a provider's terms of service and creating an adversarial relationship.
I have never heard of this, it cannot be reproduced, and it is not what Anthropic's ToS says. And there's a lot of FUD being spread around.
They don't ban Openclaw prompts. Each custom LLM application provides a client application id (this is how e.g. Openrouter can tell you how popular Openclaw is, and which models are used the most).
> This is slightly different from what OpenCode was banned from doing; they were a separate harness grabbing a user’s Claude Code session and pretending to be Claude Code.
> OpenClaw was still using Claude Code as the harness (via claude -p)[0]. I understand why Anthropic is doing this (and they’ve made it clear that building products around claude -p is disallowed) but I fear Conductor will be next.
If Openclaw was still using Claude Code as the harness, I don't know how to reconcile that with "Openclaw is based on the pi framework", which is decidedly NOT claude code.
From what I understand, they still had the Claude Code harness available, but were mostly fully integrated on the pi agent framework, using Claude Code's oauth credentials directly.
Openclaw allows you to effectively “shell out” to another harness for your model calls, while still using Pi as your main agentic harness. This is the claude -p workflow. Tools and skills are injected into Claude and they hack session persistence into it as well.
They also absolutely blocked OpenClaw system prompts from this path in the prior weeks, based purely on keyword detection. Seems they’ve undone that now.
No, if you ran Openclaw using Anthropic API as a provider, or had it use the ‘claude -p’ cli interface, you got an email from Anthropic threatening a ban unless you upgraded billing.
This was widely reported, and happened to me. You probably can’t reproduce it or see it in docs because they seem to have changed the policy.
And yet running the Claude Code cli with `-p` in ephemeral VMs gets me the "Third-party apps now draw from extra usage, not plan limits. We've added a credit to your organization to get you started. Ask your workspace admin to claim it and keep going." error.
One day you're experimenting just fine. The next, everything breaks.
And I'd gladly use their web containerized agents instead (it would pretty much be the same thing), but we happen to do Apple stuff. So unless we want to dive into relying on ever-changing unreliable toolchains that break every time Apple farts, we're stuck with macOS.
I think this is consistent with the Anthropic announcement. I do not see anything on this page that says it will NOT be charged as extra usage.
The most recent Anthropic announcement was not that people would be banned for using subscriptions with OpenClaw, but that it would be charged as extra usage. I think the reason this was changed three days after that announcement is that charging for extra usage meant people would no longer be banned for using their subscription OAuth tokens directly against the Anthropic API with a third-party harness, as they had been before. Instead, both that usage and the more recent claude -p usage would be charged as extra usage.
Not that I'm some paragon when it comes to critical thinking exactly, but is there any sort of proof or evidence of Anthropic "silencing negativity"? It wouldn't surprise me, but I also haven't seen anything conclusive about it either, so spreading it as fact is, ironically, FUD itself.
They have, many times. We're seeing a chain where people are pointing to openclaw's github for information (a tool that was effectively acquired by their #1 competitor) and trying to make it sound so crazy. The actual flow was simple: they said "don't use your claude code membership on tools that burn lots of the subsidized tokens". Then a bunch of people raised a fit (because openclaw is almost useless without claude models), so Anthropic basically said "That's what the API keys are for".
Anthropic has the info on their website, emailed all users at each step, and I've seen it on X. I'm sure it's in other places as well.
^every comment when someone says something remotely negative about LLMs and their less useful cousins, cryptocurrencies. It’s baffling how similar the language and attitude is sometimes.
Anthropic was, even to me, “one of the better ones” until recently. They have made many questionable/poor decisions the last 6-8 weeks and people are right to call them out for it, especially when they want our money.
There are bad products and ones that are never used, just to paraphrase. Every single decision of any business gets derided by some segments of its users. You are free to call out Anthropic for anything you are unhappy about, and you are free to switch vendor, but calling them “good” or “bad” just shows your emotional immaturity, or bias.
What's funny is I had personally settled on Anthropic as... the best of a bad situation, I guess? I found the tech useful even if I still deeply hate the industry and hype machine around it. Now though I can't get through a full discussion with Claude before the usage restrictions kick in, which has done a far better job getting me to kick the habit than anything else.
I still VERY occasionally use it (as I'm friggin able to anyway) but it's definitely nowhere near my usage previously. And I refuse to give them money, and besides which have no goddamn notion of whether it would even be worth it on the lowest paid tier.
Ah well. The free ride was fun but I knew it had a shelf life.
See the thing is their storefront is so fucking vague. Right now I hit usage limits after about 4-6 messages during the day, depending on length. They say the low tier is 5x usage, so does that mean I can send 20-30 messages? Because that's not remotely worth $20 a month to me.
It used to be. But I consistently got to use more than that.
Funny thing (or I just imagined that), when I used ChatGPT for studying, it was quite generous about over usage.
When I was just messing around, testing where the guardrails are or trying to get it to generate sexual prose about my siblings to send it to them for laughs, the limits were held much more strictly.
I remember when it went up from 25/3h to 50/3h. And I was like meh, because I've already used it over that limit multiple times.
Anthropic is really trying to burn all that goodwill they worked up by raising prices, reducing limits and making it impossible to know what the actual policies are.
> a solid product and company can withstand online controversy
A product with a massive moat. Switching from Claude to another competitor is insanely easy and without much loss of quality. Until they’ve built their moat, burning goodwill is foolish.
What’s different is it’s probably required due to the cash that’s being burnt to operate. They can’t afford to keep offering so much for so little revenue.
If you want LLMs to continue to be offered we have to get to a point where the providers are taking in more money than they are spending hosting them. And we still aren't there (or even close).
Nope. They're losing money on straight inference (you may be thinking of the interview where Dario described a hypothetical company that was positive margin). The only way they can make it look like they're making money on inference is by calling the ongoing reinforcement training of the currently-served model a capital rather than operational expense, which is both absurd and will absolutely not work for an IPO.
Inference, in and of itself, can't be completely unprofitable. Unless you're purely talking about Anthropic?
But
> If you want LLMs to continue to be offered we have to get to a point where the providers are taking in more money than they are spending hosting them
Suggests you just mean in general, as a category, every provider is taking a loss. That seems implausible. Every provider on OpenRouter is giving away inference at a loss? For what purpose?
The open models may not be as great but maybe these are good enough. AI users can switch when the prices rise before it becomes sustainable for (some) of the large LLM providers.
Currently it costs so much more to host an open model than it costs to subscribe to a much better hosted model. Which suggests it’s being massively subsidised still.
For a lot of tasks smaller models work fine, though. Nowadays the problem is less model quality/speed, but more that it's a bit annoying to mix it in one workflow, with easy switching.
I'm currently making an effort to switch to local for stuff that can be local - initially stand-alone tasks, longer term a nice harness for mixing. One example would be OCR/image description - I have hooks from dired to throw an image to local translategemma 27b which extracts the text, translates it to English as necessary, adds a picture description, and - if it feels like it - extra context. Works perfectly fine on my macbook.
Another example would be generating documentation - local qwen3 coder with a 256k context window does a great job at going through a codebase to check what is and isn't documented, and prepare a draft. I still replace pretty much all of the text - but it's good at collecting the technical details.
I haven’t tried it yet, but Rapid MLX has a neat feature for automatic model switching. It runs a local model using Apple’s MLX framework, then “falls forward” to the cloud dynamically based on usage patterns:
> Smart Cloud Routing
>
> Large-context requests auto-route to a cloud LLM (GPT-5, Claude, etc.) when local prefill would be slow. Routing based on new tokens after cache hit. --cloud-model openai/gpt-5 --cloud-threshold 20000
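The routing rule in that quote can be sketched in a few lines. This is only an illustration of the idea as described (count the tokens not covered by the cache, compare against the threshold), not Rapid MLX's actual code. The function name, its parameters, and the strict `>` comparison are assumptions; only the 20000 default comes from the quoted `--cloud-threshold` flag.

```python
def choose_backend(prompt_tokens: int, cached_tokens: int,
                   cloud_threshold: int = 20000) -> str:
    """Hypothetical helper: return "cloud" or "local" for one request."""
    # Only tokens that are not already covered by the prompt cache
    # need a fresh (slow) local prefill.
    new_tokens = max(prompt_tokens - cached_tokens, 0)
    # Assumption: route to the cloud only when strictly above the threshold.
    return "cloud" if new_tokens > cloud_threshold else "local"


if __name__ == "__main__":
    print(choose_backend(50_000, 45_000))  # mostly cached prompt
    print(choose_backend(50_000, 0))       # cold cache, large prompt
```

Under these assumptions a 50k-token prompt with 45k tokens already cached stays local, while the same prompt on a cold cache goes to the cloud model.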
I've found MiniMax 2.7 pretty decent, and even pay-as-you-go on OpenRouter it's $0.30/mt in and $1.20/mt out, so you can get some pretty heavy usage for between $5-$10. Their token subscription is heavily subsidized, but even if it goes up or away, it's pretty decent. I'm pretty hopeful for these open-weight models to become affordable at good enough performance.
Edit: I’d also consider waiting for WWDC, they are supposed to be launching the new Mac Studio, and even if you don’t get it, you might be able to snag older models for cheaper
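To put the per-token prices quoted above in perspective, here is a rough cost calculator. It assumes "mt" means one million tokens; the function name and the example token counts are made up for illustration.

```python
# Prices as quoted in the comment, assumed to be USD per million tokens.
IN_PRICE_PER_MTOK = 0.30
OUT_PRICE_PER_MTOK = 1.20


def cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Pay-as-you-go cost in USD for a given token volume (hypothetical helper)."""
    total = input_tokens * IN_PRICE_PER_MTOK + output_tokens * OUT_PRICE_PER_MTOK
    return total / 1_000_000


if __name__ == "__main__":
    # Example workload (made up): 10M tokens in, 2M tokens out.
    print(f"${cost_usd(10_000_000, 2_000_000):.2f}")
```

With those example numbers the input side costs $3.00 and the output side $2.40, which lands inside the $5-$10 range mentioned.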
Rapid MLX team has done some interesting benchmarking that suggests Qwopus 27B is pretty solid. Their tool includes benchmarking features so you can evaluate your own setup.
Something's not adding up. Why is Amazon making financial plans for the next decade based on continued OpenAI spending, if AI providers like OpenAI and Anthropic aren’t even close to being profitable? How can they last a decade or more?
That's the interesting question, right? Because if this unwinds during a period of external inflation (say, because of a big war and energy shortage) then even Bernanke would say helicopter money won't work.
A guy from Meta being interviewed on the BBC a few years ago claimed that every school child in India was going to have the metaverse VR or they'd be left behind in their education, so every family was certainly going to pony up the money.
They probably aren’t planning on making the money on consumer subscriptions. Any price is viable as long as the user can get more value out of it than they spend.
Like with all new products, it takes time to let the market do its work. See it from the positive side. The demand for more and faster and bigger hardware is finally back after 15 years of dormancy. Finally we can see 128gb default memory or 64gb videocards 2 years from now.
I see the current situation as a plus. I get SOTA models for dumping prices. And once the public providers go up with their pricing, I will be able to switch to local AI because open models have improved so much.
What shareholders, Anthropic is a money burning pit. Not to the same extent as OpenAI, but both will struggle hard to actually turn a profit some day, let alone make back the massive investments they've received.
Not that they don't bring value, I'm just not convinced they'll be able to sell their products in a sticky enough way to make up the prices they'll have to extract to make up for the absurd costs.
>> both will struggle hard to actually turn a profit some day, let alone make back the massive investments they've received.
I'd agree with you, except I've heard this argument before. Amazon, Google, Facebook all burned lots of cash, and folks were convinced they would fail.
On the other hand plenty burned cash and did fail. So could go either way.
I expect, once the market consolidates to 2 big engines, they'll make bonkers money. There will be winners and losers. But I can't tell you which is which yet.
I’m not sure there will be consolidation. There’s too much room for specialization and even when the models are trained to do the same task they have very different qualities and their own strengths and weaknesses. You can’t just swap one for the other. If anything, as hardware improves I’d expect even more models and providers to become available. There’s already an ocean of fine tuned and merged models.
$20B ARR or so reported added in Q1 doesn’t sound particularly bad, they’ll raise effective prices some more while Claude diffuses into the economy, sounds like a money printer. The issue is they’re compute constrained on the supply side to grow faster…
> $20B ARR or so reported added in Q1 doesn’t sound particularly bad
Unless you compare with the reported cash burn or projected losses.
> they’ll raise effective prices some more while Claude diffuses into the economy, sounds like a money printer
But the problem is, they have no moat. Even if Claude diffuses into the economy (still to be seen how much it can effectively penetrate sectors other than engineering, spam, marketing/communications), there is no moat; all providers are interchangeable. If Anthropic raises prices too much, people can switch to the equivalent OpenAI products.
I disagree very strongly with this, both anecdotally and in the data - subscriptions are growing in all frontier providers; anecdata is right here in HN when you look around almost everyone is talking about CC, codex is a distant second, and completely anecdotally I personally strictly prefer GPT 5.3+ models for backend work and Opus for frontend; Gemini reviews everything that touches concurrency or SQL and finds issues the other models miss.
My general opinion is that models cannot be replaceable, because a model which can replace every other provider must excel at everything all specialist models excel at and that is impossible to serve at scale economically. IOW everyone will have at least two subscriptions to different frontier labs and more likely three.
You're actually reinforcing my point. Models are interchangeable and easy to switch between to adjust based on needs and costs. That means that no individual model / model provider has any sort of serious moat.
If tomorrow Kimi release a model better at something, you'd switch to it.
It's likely that Chinese models will get regulatory knee-capped at some point, and the domestic labs all have pretty common costs they need to make up. This creates an environment where they match each other as prices climb. Unless Google/Meta suffocates the startups since they have actual cash flow that is non-AI.
Sure you can go local, but let's be real, that would be <1% of users.
I postulate in practice this won't matter since the space of use cases is so large if Kimi released the absolutely best model at everything they wouldn't be able to serve it (c.f. Mythos).
Aren't they just doing what Hacker News was trying to tell them to do? That AI is useful but not sure if sustainable. Now they're increasing prices and decreasing tokens and you guys are pissed off.
My OpenClaw assistant (who's been using Claude) lost all his personality over the last week, and couldn't figure out how to do things he never had any issues doing.
I racked up about $28 worth of usage and then it just stopped consuming anymore, so I don't know if there was some other issue, but it was persistent.
I got sick of it and used a migration script to move my assistant's history and personality to a claude code config. With the new remote exec stuff, I've got the old functionality back without needing to worry about how bleeding-edge and prone to failure OpenClaw is.
I feel like this is what their plan was all along -- put enough strain and friction on the hobbyist space that people are incentivized to move over to their proprietary solution. It's probably a safer choice anyway -- though I'm sure both are equally vibe-coded.
Well when the middleman between you and your users is bought out by the competitor, it makes sense to move away from it. It's a bit like Apple selling iPhones in a Microsoft store.
Oh that's interesting. Right after they signed the deal with Amazon so maybe it was all compute constrained. In any case, I tried using the Codex $20/mo plan and the limits are so low I can hardly get anywhere before my agent swaps to a different agent.
Somewhat suspicious that if I do this without an official Anthropic notice I'll lose my precious Max $200/mo account so I'll sit tight perhaps for a while.
Hasn't FAANG been doing this all along, regardless of their policies? The US is no better than China from my point of view. In this case, I see no difference between sending my prompts to US or Chinese companies. At least the Chinese models are open source.
I accept that all the providers will do things I would consider unethical with my data, and I simply don't expose anything I'm not willing to treat as a price of doing the business I want.
The other criticism I see is "ask it what happened in 1989", but as my use case isn't writing a high school history essay, I simply don't care. Nor do I believe one should seek those kinds of answers from any AI. (If you're curious, it simply cuts off the reply.)
I fully appreciate that YMMV and what sits right for others will not align with what's acceptable to me. Anthropic and OpenAI are both in my bad books as much as Z.ai. Pick your poison, as they say.
PSA: Since you are still required to use Claude Code and I have had a bunch of non-technical people asking me to make https://github.com/rcarmo/piclaw based on Claude rather than pi (which is never gonna happen), I have started pivoting its Python grand-daddy into a Go-based web front-end that runs Claude as an ACP agent.
I’ve been using codex cli and GPT 5.4. It is better at coding than Opus anyway. I did not really test Opus 4.7 but older versions generated worse results compared to GPT.
Which I would not even try and test though if Anthropic did not ban my account. The shadiest thing I did was to use it with opencode for a while I think. Never installed claw or used CC tokens somewhere else.
I've been trying to toe the line here myself, here's how I've been doing it. For context, I pay for a Max 5x subscription.
My main goal is to maximize my subscription token usage while trying to comply with the rules, but it's not clear where the line is for automation so I feel like I need to be clever.
- regular development (most token use): all interactive claude mode, standard use case
- automated background development: experimenting with claude routines (first-class feature, on subscription)
- personal non-nanoclaw claude automations (claude -p): uses subscription token, but only called as needed (generally just to fix something if something in my homelab infra goes down; it's set up to not fire on an exact cron time)
- other LLM based automations: usually openrouter API key, cheap models as needed
- nanoclaw: all API key based, but since it's expensive I keep usage mostly minimal and try to defer anything heavyweight to one of the other automation strategies (nanoclaw mainly just connects my homelab infra with telegram)
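The split above is effectively a lookup from kind of work to billing path. A minimal sketch of that table, with made-up workload keys and descriptive strings (nothing here calls any real tool or API):

```python
# Hypothetical workload names mapped to the billing path described in the list.
ROUTES = {
    "interactive_dev": "claude interactive mode (subscription)",
    "background_dev": "claude routines (subscription)",
    "homelab_fix": "claude -p, on demand only (subscription)",
    "other_llm_automation": "OpenRouter API key, cheap models",
    "nanoclaw": "API key, kept minimal",
}


def route(workload: str) -> str:
    """Return the billing path for a workload, failing loudly on unknown ones."""
    try:
        return ROUTES[workload]
    except KeyError:
        raise ValueError(f"unknown workload: {workload!r}") from None


if __name__ == "__main__":
    for name in ROUTES:
        print(f"{name}: {route(name)}")
```

Failing on unknown workloads, rather than defaulting to one path, keeps a new automation from silently drawing on the subscription.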
I got sick of the inconsistency caused by Anthropic tinkering with Claude Code and had canceled my 20x. My plan was to switch to Codex so I could use it in Pi.
I am specifically talking about switching because of the harness, not model quality. Anyone else match my experience?
I wonder how many other people recently did the same. It would be prudent of Anthropic to let people use Pro/Max OAuth tokens with other harnesses I think. Even though I get why they want to own the eyeballs.
I’ve been using Codex Pro since they lobotomized Opus 4.6. Codex is so much better, GPT 5.4 xhigh fast is definitely the smartest and fastest model available.
For a while there I had both Opus 4.6 and Codex access and I frequently pitted them against each other, I never once saw Opus come out ahead. Opus was good as a reviewer though, but as an implementer it just felt lazy compared to 5.4 xhigh.
One feature that I haven’t seen discussed that much is how codex has auto-review on tool runs. No longer are you a slave to all or nothing confirmations or endless bugging, it’s such a bad pattern.
Even in a week of heavy duty work and personal use I still haven’t been able to exhaust the usage on the $200 plan.
I’ll probably change my mind when (not IF) OpenAI rug pull, but for spring ‘26, codex is definitely the better deal.
I also made the switch to OpenAI, the $20 plan, I dunno about "so much better" but it's more or less the same, which is great!
The models and tools levelling out is great for users because the cost of switching is basically nil. I'm reading people ITT saying they signed up for a year - big mistake. A year is a decade right now.
It really depends on what you‘re trying to do and what your skillset is.
But if you go information architecture first and have that codified in some way (especially if you already have the templates), then you can nudge any agent to go straight into CSS and it will produce something reasonable.
I've been on pi for a few months now, built a custom tmux plugin so i can use nested pi and mix and match codex / claude instances.
pi has been the better harness out of all the ones i tried, first and third party.
Ever since the Anthropic block i've just canceled all my claude subs. Used to be codex was a bit worse, now they're practically equal. Claude is slightly better at directing other agents but the difference is too minor and not worth the money.
Claude usage limits / costs are absurd.
Any 'principles' people praise anthropic for are not that relevant to me anyways because i'm not a US citizen.
I left anthropic a while ago because of the similar shenanigans they had earlier. I went with opencode & zen.
I still have their subscription, but am using pi now, mainly because something happened that made my opencode sessions unusable (cannot continue them, just blanks out, I assume something in the sqlite is fucked), and I cannot be bothered to debug it.
For what I use the agents for, the Chinese models are enough.
Isn't using pi against their terms of use, which require going through the Claude Code CLI for all Max plan usage? (I had used Droid with Max previously, it was a great combo.)
It's unclear right now. The current stance is that using pi or other coding harnesses eats into extra usage and that is the behavior one sees today. We have added a hint to pi now that warns you when you use an anthropic sub.
I also cancelled my 20x and switched to Codex. At this point even the Codex CLI seems to perform better than Claude Code... And so far I'm on the OpenAI Pro plan and haven't even needed to upgrade to their $100/mo plan. I'm getting more value for almost 10x cheaper.
(Disclosure: I work on tamer, an OSS supervisor for coding agents — biased.)
Add one more to the count. The OAuth-across-harnesses idea would help, but it doesn't fix the shape of the problem.
"Harness" has always felt off to me. Exoskeleton is closer — Claude Code, Codex, opencode wrap the model and augment it from the inside.
What's missing is a layer above that's explicitly not an exoskeleton: a thin supervisor. A master that watches and guides, nothing more. It just relays I/O and hands approval back to the human.
My experience is the opposite of this thread's consensus. Context: Full time SWE, working on large and messy codebase. Not working on crazy automations, working on fixing bugs, troubleshooting crashes, implementing features.
Anthropic models write much better code: it is easy to follow, reasonable, and very close to what I would have done if I had the time... OpenAI's, on the other hand, generate extremely complex solutions to the simplest problems.
I was so disappointed by non-Anthropic models, that for a couple of weeks I only used Anthropic models, but based on this thread, I'll go back and give it another try. It's good to go back and try things again every couple of weeks.
Of course, I was annoyed that they lobotomized 4.6, the difference was day and night, and Anthropic is certainly not a company I trust. In my opinion, it shows their willingness to rugpull, so I'm looking at other approaches. Since 4.7, things went back to normal, things you'd expect to work just work.
> I wonder how many other people recently did the same.
Some negative signal for better overall view on things: I'm still with Anthropic and will probably stay with them for the foreseeable future.
I think after the DoD/DoW shenanigans (which in and of itself felt like a reasonable take on the part of Anthropic) they got a bunch of visibility and new users, so them hitting some scaling limits, and with that some service disruption, is pretty much inevitable. Couple this with the tokenizer changes and seeming decrease in model performance (adaptive thinking etc.), and lots of people will be rightfully pissed off, alongside increased downtime (doesn't matter that much for me, definitely does matter for anything time-sensitive).
At the same time, in practice I've only seen it do stupid things across 8 million tokens about 5 times (confusing user/assistant roles, not reading files that should be obvious for a given use case, and picking trivially wrong/stupid solutions when planning things), alongside another 4 times that tests/my ProjectLint tool caught that I would have missed. The error rate is still arguably lower than mine, though I work in a very well known and represented domain (webdev with a bunch of DevOps and also some ML stuff, and integration with various APIs etc.).
At the same time, the 85 EUR they gave to me for free has been enough to weather the instability in regards to pricing changes and peak usage. They've fixed most of the issues I had with Claude Code (notably performance), and the sub-agent support is great and it's way better than OpenCode in my experience. They also keep shipping new features that are pretty nice, like Dispatch and Routines and Design, those features also seem nice and not like something completely misdirected, so that's nice. The Opus 4.7 model quality with high reasoning is actually pretty nice as well and works better than most of the other models I've tried (OpenAI ones are good, I just prefer Claude phrasing/language/approaches/the overall vibe, not even sure what I'd call it exactly, all the stuff in addition to the technical capabilities).
At the same time, if they mess too much with the 100 USD tier, I bet I could go to OpenAI or try out the GLM 5.1 subscription without too many issues. For now they're replacing all the other providers for me. Oh also I find the subscription vs API token-based payment approach annoying, but I guess that's how they make their money.
Because the harness, not the model, is the moat and the key IP; that is the why. For both OpenAI and Anthropic, with all the money they have raised and the compute they have acquired and carry on their books, of course no one can easily replicate them: who can afford all those datacenters and interconnected Nvidia GPUs? That is why OpenAI throws you a bone and gives you an open-source SDK harness, but not the one they actually use for ChatGPT. But now both of them have to deliver on all the bullshit they said these models can do... truth is, they cannot. So now the bubble bursts and we will see what happens. We all have to buy iPhones or MacBooks, so that makes sense; we all use Chrome or Google Search, Instagram, TikTok.
All these models and agents are shortcuts for all of us to be lazy and play games and watch YouTube or Netflix, because we use them to work less. Well, the party will be over soon.
I don’t think I’ve seen a more confused and shambolic product strategy since Google’s absurd line of GChat rebrandings.
Last year I was excited about the constant forward progress on models, but since February or so it's just been a mess and I want off this ride.
Either way I’m going to wait for “official” word from Anthropic, which I guess at this point will probably be a “Tell HN” or Reddit text post or a Xitter from some random employee’s personal account, because apparently that’s the state of corporate communication now.
I didn't even use openclaw and Anthropic disabled my account without explanation beyond "suspicious signals". If anyone found a way to get out of that, I'd be curious to hear it - genuinely no idea what I did wrong, and the Google docs form I filled out to appeal never got me any reply.
Same thing happened to me in January. Never heard back from them after submitting the google form. A few weeks ago I went through the subscription flow again and the 'account disabled' message was no longer there. Didn't go through with the payment so it's possible I would have been blocked at that point but it looked like my account had been re-enabled. I think you just have to play the waiting game unfortunately.
Whether to allow Claude subscription to access other services or not, at this point, anthropic seems to be schizophrenic, sometimes worried about insufficient computing power and sometimes worried about user loss, which is puzzling.
What's puzzling or schizophrenic about that? Those seem like two very natural factors that would be in tension with one another and have to be balanced.
I'm out of the loop on Claude, hasn't it always been possible to use the Anthropic API with a tool like OpenClaw, paying per request? Is this limitation just for using your monthly subscription account?
I find it a little bizarre that people have this expectation. You can still pay for compute and use it the way you want by paying for the product you actually want to use. Subscription products like this are not marketed or intended to be used as access to the API, but they also offer access to the API if that's what you want. I'm still not entirely clear why people insist on using their subscription like this, so let me know if I'm missing something.
> I find it a little bizarre that people have this expectation.
Well, enough people complained that Anthropic reversed their stance. Additionally, their primary competitor doesn't have any compute restrictions, which should help clarify why this decision was made.
As someone who has been building ML/AI tools (@ MS & Apple) for almost 25 years, I can say that much of the value of the underlying model comes from the harness. Why shouldn't I be able to use the exact same compute with my own bespoke harness when the compute cost is the same?
The Claude Code team continues to push out half-baked features that literally hamper my ability to use their tools.
If I'm paying $200/month for compute, I should be able to use it however I like.
This is only useful when you are using Claude Cli fairly regularly on the same machine as OpenClaw, right? Because the tokens need to be refreshed manually every so often?
They see that the new KimiK2.6 will eat their lunch. They don't care about you, they just care about your money and will take away your options if they don't believe you have a solid alternative.
What models have you guys tried to use with OpenClaw that you've found suitable for the task? Codex personally rules for my dev style but not sure how well it works in the claw scenario.
This is a perfect example of how quickly you can burn through trust that took a long time to earn.
I used to be - in my small circle of friends and peers - a genuine advocate for Anthropic and Claude. It was my sole AI assistant for over a year. But somewhere around February/March, something shifted. Declining quality, policy changes, inconsistent output. Nothing dramatic, just... a slow erosion.
That erosion pushed me to try Codex. I signed up for their most expensive pro plan. Now I'm about to experiment with Kimi. I'm not saying they're better (well, sometimes they are). But here's the thing - what Anthropic did is they made me look. They made a loyal customer start shopping around. And I think that's the worst thing you can do.
Having said that - as an LLM provider for my product, we're staying with Claude. I still trust in their ethics. Please don't prove me wrong.
I'm trying out codex for the first time as well because something is up with Claude for sure, 4.7 has been super frustrating. For other models, highly recommend trying MiniMax 2.7, using it with Hermes is actually pretty good, and their token subscription plans include a lot of usage for $10.
Interesting perspective on AI CLI tools. The Anthropic policy clarification is a significant development for the developer community. Would be curious about the implementation details.
Anthropic keeps conflating two distinct strategies — be the best model for developers to build on, or be the company that ships Claude Code. Those two have opposite policy conclusions. Restricting third-party harnesses maximizes Claude Code revenue; allowing them maximizes model-layer lock-in through developer habit. The whiplash is the symptom of not picking. Pick for crying out loud!
Uh, what? For the love of God can I make my own harness or not? Or is this just saying you can use it only in API mode?
I have had some ideas for a custom harness (like embedding some tools OOTB and replacing slow tooling) but these policies throw me off. Instead I use local models.
Problem is API costs are insane. I have toyed with the idea of running a local model that works with Claude Sonnet or even Haiku, and I know this has been done by others.
Or Claw-like harnesses that we make ourselves? It takes honestly like 15 minutes to roll your own, so I did it thinking "well, hopefully it's not considered third party"
The problem is these tools are so important I'm never going to risk Anthropic blocking my account now after the last debacle. So I'll be using OpenAI with OpenClaw. Hard to win back trust.
Great so now we can all look forward to Claude progressively getting reduced limits again. How long till the $1000 ultra plan... or they just want us all paying API credits instead
How can I buy into an ecosystem that might disallow one of my main workflows? I currently use several hook scripts to route specific work to different models. Will they disallow that at some point? We don't know because they can't get their story straight.
Anthropic is destroying goodwill that is hard-won in this space. At the end of the day, people just need to do their work in a way that makes sense for them. In my case (someone who has been building ML/AI tools for 25 years @ MS & Apple), I have much better results using my bespoke harness. If I'm paying $200/month for compute, I should be able to use it in a way that works for me. Given the push back, I'm not alone.
How, exactly, is that not saying something about the announcement?
I used the word hearsay to imply that flip-flopping should only be a judgement on the comms of the entity accused of flip-flopping, not information living on some third party source.
I do agree, though, that the parts of this that were actually using the Claude system to generate OAuth keys themselves are a little sus.
That makes sense to say “must use Claude harness to login before calling Claude cli or using Claude code sdk”
Anthropic staff have made contradictory statements on Twitter and have corrected each other. Their attempts at clarification led to confusion.
> OpenClaw treats Claude CLI reuse and claude -p usage as sanctioned for this integration unless Anthropic publishes a new policy.
Oh cool, so everything is back to business now, until they all of a sudden update their policy tomorrow and retract everything.
Anthropic have proved themselves to be unreliable when it comes to CC. Switching to other providers is the best way to go, if you want to keep your insanity.
Best and most applicable typo ever ʕ ´ • ᴥ •̥ ` ʔ
At least the only action I was still able to perform was to refund the user, or paypal would have just kept the money.
I pulled our company portal away from PayPal when they refused to restore my account.
Five years later, I tried to re-activate and the human I finally got to effectively told me to fuck off; so I will spend the rest of my life bashing that trash company at every chance.
It's just OpenClaw people claiming "Anthropic told us it's fine".
What's not allowed is grabbing the oauth tokens and using these for your own custom agent, which is what was (and still is) banned.
Nothing has changed, this appears to just be a giant misunderstanding (and probably a poor choice of words from Openclaw).
https://x.com/steipete/status/2040811558427648357
I don't see anything on this page that claims something different from that, or that addresses that claim at all.
I remember when I’d periodically rage quit from Uber One to Lyft Pink and back again every time I had a terrible customer-service experience. In the end, I realized picking a demon and getting familiar with its quirks was the better way to go.
I’m currently sticking with Claude, in part because I’m not exposed to this nonsense due to OpenClaw, in larger part because of the Hegseth-Altman DoD nonsense. More broadly, however, I’m not sure if any of Google, Anthropic or OpenAI are coming across as stars in AI communication and customer service.
It’s so easy to switch between all of them. I can open the Uber and Lyft apps and compare in a minute. I can run Claude and ChatGPT in parallel and see which one gets a better handle on the question. I can switch LLM providers with a few minutes of signing up for one and cancelling the other.
They all try to encourage brand lock in but it’s easy to pick up and move if you’re using them for their main service.
There hasn't been nearly the confusion and drama surrounding things like codex and gemini-cli. I don't think they're all on the same pedestal right now.
Oh no. They won't update the policy. Boris or Thariq will casually mention in a random off-hand comment on Twitter that this is banned now, and then will gaslight everyone that this has always been the case.
[0]: https://github.com/openclaw/openclaw/commit/d378a504ac17eab2...
[1]: https://news.ycombinator.com/item?id=47633396
1. Take the oauth credentials and roll your own agent -- this is NOT allowed
2. Run your agentic application directly in Claude Code -- this IS allowed
When OpenClaw says "Open-Claw style CLI usage", it means literally running OpenClaw in an official Claude Code session. Anthropic has no problems with this, this is compliant with their ToS.
When you use Claude Code's oauth credentials outside of the claude code cli Anthropic will charge you extra usage (API pricing) within your existing subscription.
I agree with GP that this is hard to take seriously.
[0]: https://x.com/steipete/status/2040811558427648357
But then the Claude Code product manager said:
> This is not intentional, likely an overactive abuse classifier. Looking, and working on clarifying the policy going forward.
https://xcancel.com/bcherny/status/2041035127430754686#m
The fault here is not with Anthropic. It lies with cowboy coders creating a system that violates a provider's terms of service and creating an adverse relationship.
They don't ban Openclaw prompts, each custom LLM application provides a client application id (this is how e.g. Openrouter can tell you how popular Openclaw is, and which models are used the most).
Anthropic just checks for that.
> This is slightly different from what OpenCode was banned from doing; they were a separate harness grabbing a user’s Claude Code session and pretending to be Claude Code.
> OpenClaw was still using Claude Code as the harness (via claude -p)[0]. I understand why Anthropic is doing this (and they’ve made it clear that building products around claude -p is disallowed) but I fear Conductor will be next.
From what I understand, they still had the Claude Code harness available, but were mostly fully integrated on the pi agent framework, using Claude Code's oauth credentials directly.
They also absolutely blocked OpenClaw system prompts from this path in the prior weeks, based purely on keyword detection. Seems they’ve undone that now.
This was widely reported, and happened to me. You probably can’t reproduce it or see it in docs because they seem to have changed the policy.
One day you're experimenting just fine. The next, everything breaks.
And I'd gladly use their web containerized agents instead (it would pretty much be the same thing), but we happen to do Apple stuff. So unless we want to dive into relying on ever-changing unreliable toolchains that break every time Apple farts, we're stuck with macOS.
The most recent Anthropic announcement was not that people would be banned for using subscriptions with OpenClaw, but that it would be charged as extra usage. I think the reason this was changed three days after that announcement is that being charged for extra usage meant people would not be banned for using their subscription OAuth tokens directly against the Anthropic API with a third party harness, as they had been before. Rather, both that usage and the more recent claude -p usage would be charged as extra usage.
Release notes and announcements are a well-known agentic anti-pattern.
If you're doing them, you're doing agentic wrong. /s-ish-also-cry
Though, I don't think that justifies spreading FUD in the opposite direction. I also don't think the comment the GP was replying to contains FUD.
Anthropic has the info on their website, emailed all users for each step, and I've seen it on X. I'm sure it's in other places as well.
Anthropic was, even to me, “one of the better ones” until recently. They have made many questionable/poor decisions the last 6-8 weeks and people are right to call them out for it, especially when they want our money.
I still VERY occasionally use it (as I'm friggin able to anyway) but it's definitely nowhere near my usage previously. And I refuse to give them money, and besides which have no goddamn notion of whether it would even be worth it on the lowest paid tier.
Ah well. The free ride was fun but I knew it had a shelf life.
I will say that Codex high/x-high has consistently performed the best for me, but YMMV
Funny thing (or I just imagined that), when I used ChatGPT for studying, it was quite generous about over usage. When I was just messing around, testing where the guardrails are or trying to get it to generate sexual prose about my siblings to send it to them for laughs, the limits were held much more strictly.
I remember when it went up from 25/3h to 50/3h. And I was like meh, because I've already used it over that limit multiple times.
Google when they merged YouTube and Google+, Reddit multiple times, Facebook after countless scandals. Microsoft destroying windows and pushing ads.
At the end of the day a solid product and company can withstand online controversy.
A product with a massive moat. Switching from Claude to another competitor is insanely easy and without much loss of quality. Until they’ve built their moat, burning goodwill is foolish.
What’s different is it’s probably required due to the cash that’s being burnt to operate. They can’t afford to keep offering so much for so little revenue.
But
> If you want LLMs to continue to be offered we have to get to a point where the providers are taking in more money than they are spending hosting them
Suggests you just mean in general, as a category, every provider is taking a loss. That seems implausible. Every provider on OpenRouter is giving away inference at a loss? For what purpose?
Half the articles are paywalled but the free ones outline the financial situation of the SOTA providers and he has receipts
I'm currently making an effort to switch to local for stuff that can be local - initially stand alone tasks, longer term a nice harness for mixing. One example would be OCR/image description - I have hooks from dired to throw an image to local translategemma 27b which extracts the text, translates it to english, as necessary, adds a picture description, and - if it feels like - extra context. Works perfectly fine on my macbook.
Another example would be generating documentation - local qwen3 coder with a 256k context window does a great job at going through a codebase to check what is and isn't documented, and prepare a draft. I still replace pretty much all of the text - but it's good at collecting the technical details.
> Smart Cloud Routing
>
> Large-context requests auto-route to a cloud LLM (GPT-5, Claude, etc.) when local prefill would be slow. Routing based on new tokens after cache hit. --cloud-model openai/gpt-5 --cloud-threshold 20000
https://github.com/raullenchai/Rapid-MLX
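For anyone wondering what "routing based on new tokens after cache hit" might look like, here is a minimal sketch of the idea as I read that quote. It is not Rapid-MLX's actual code: the function names, the prefix-matching cache model, and the choice of a strict `>` comparison against the threshold are all my assumptions. Only the 20000 default comes from the quoted flag.

```python
from typing import Sequence


def count_new_tokens(prompt: Sequence[int], cached: Sequence[int]) -> int:
    """Tokens that still need prefill after reusing the longest shared prefix.

    Assumption: the cache is modeled as a single previously seen token
    sequence, and only a matching prefix can be reused.
    """
    shared = 0
    for prompt_tok, cached_tok in zip(prompt, cached):
        if prompt_tok != cached_tok:
            break
        shared += 1
    return len(prompt) - shared


def choose_backend(
    prompt: Sequence[int],
    cached: Sequence[int],
    cloud_threshold: int = 20000,  # mirrors --cloud-threshold 20000 from the quote
) -> str:
    """Return "cloud" when the uncached part of the prompt is large, else "local"."""
    # Assumption: strictly more new tokens than the threshold triggers the cloud route.
    if count_new_tokens(prompt, cached) > cloud_threshold:
        return "cloud"
    return "local"


if __name__ == "__main__":
    warm_cache = list(range(25_000))
    long_prompt = list(range(30_000))
    # Most of the long prompt is already cached, so it can stay local.
    print(choose_backend(long_prompt, warm_cache))
    # With a cold cache the whole prompt needs prefill.
    print(choose_backend(long_prompt, []))
```

The point of counting only the uncached tokens is that a long conversation with a warm cache can stay local, while a cold 30k-token paste goes to the cloud.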
Edit: I’d also consider waiting for WWDC, they are supposed to be launching the new Mac Studio, and even if you don’t get it, you might be able to snag older models for cheaper.
100% agree. I’m just looking forward to setting something up in my electronic closet that I can remote to instead of having everything tracked.
They have a metric called Model-Harness Index:
MHI = 0.50 × ToolCalling + 0.30 × HumanEval + 0.20 × MMLU (scale 0-100)
https://github.com/raullenchai/Rapid-MLX
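To make the weighting concrete, here is a small sketch that just evaluates the formula as quoted. The function name, the range check, and the example scores are made up for illustration; only the three weights and the 0-100 scale come from the quote.

```python
def model_harness_index(tool_calling: float, human_eval: float, mmlu: float) -> float:
    """Combine three benchmark scores (each on a 0-100 scale) into one index."""
    scores = {"tool_calling": tool_calling, "human_eval": human_eval, "mmlu": mmlu}
    for name, value in scores.items():
        # Assumption: inputs outside 0-100 are treated as errors.
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100, got {value}")
    # Weights copied from the quoted formula; they sum to 1.0.
    return 0.50 * tool_calling + 0.30 * human_eval + 0.20 * mmlu


if __name__ == "__main__":
    # Made-up scores, not measurements of any real model.
    print(model_harness_index(tool_calling=80, human_eval=70, mmlu=60))
```

With those made-up inputs the terms are 40 + 21 + 12, so tool calling alone contributes more than the other two benchmarks combined. That tells you what the metric is really rewarding.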
I understand why they have to charge more, but not many are gonna be able to afford even $100 a month, and that doesn't seem to be sufficient.
It has to come with some combination of better algorithms or better hardware.
Who’s wrong?
Not that they don't bring value, I'm just not convinced they'll be able to sell their products in a sticky enough way to make up the prices they'll have to extract to make up for the absurd costs.
I'd agree with you, except I've heard this argument before. Amazon, Google, Facebook all burned lots of cash, and folks were convinced they would fail.
On the other hand plenty burned cash and did fail. So could go either way.
I expect, once the market consolidates to 2 big engines, they'll make bonkers money. There will be winners and losers. But I can't tell you which is which yet.
Unless you compare with the reported cash burn or projected losses.
> they’ll raise effective prices some more while Claude diffuses into the economy, sounds like a money printer
But the problem is, they have no moat. Even if Claude diffuses into the economy (still to be seen how much it can effectively penetrate sectors other than engineering, spam, marketing/communications), there is no moat, all providers are interchangeable. If Anthropic raises prices too much, people will switch to the equivalent OpenAI products.
I disagree very strongly with this, both anecdotally and in the data - subscriptions are growing in all frontier providers; anecdata is right here in HN when you look around almost everyone is talking about CC, codex is a distant second, and completely anecdotally I personally strictly prefer GPT 5.3+ models for backend work and Opus for frontend; Gemini reviews everything that touches concurrency or SQL and finds issues the other models miss.
My general opinion is that models cannot be replaceable, because a model which can replace every other provider must excel at everything all specialist models excel at and that is impossible to serve at scale economically. IOW everyone will have at least two subscriptions to different frontier labs and more likely three.
If tomorrow Kimi release a model better at something, you'd switch to it.
Sure you can go local, but lets be real, that would be <1% of users.
I postulate in practice this won't matter since the space of use cases is so large if Kimi released the absolutely best model at everything they wouldn't be able to serve it (c.f. Mythos).
hn is not a monolith. People here routinely disagree with each other, and that's what makes it great
I racked up about $28 worth of usage and then it just stopped consuming anymore, so I don't know if there was some other issue, but it was persistent.
I got sick of it and used a migration script to move my assistant's history and personality to a claude code config. With the new remote exec stuff, I've got the old functionality back without needing to worry about how bleeding-edge and prone to failure OpenClaw is.
I feel like this is what their plan was all along -- put enough strain and friction on the hobbyist space that people are incentivized to move over to their proprietary solution. It's probably a safer choice anyway -- though I'm sure both are equally vibe-coded.
Somewhat suspicious that if I do this without an official Anthropic notice I'll lose my precious Max $200/mo account so I'll sit tight perhaps for a while.
I had an idea on a whim to vibe-engineer an irccloud replacement for myself.
Started with claude web + Opus 4.7 and continued with Claude Code. Ate up two full cycles of my quota in maybe 6-10 prompts.
Then I iterated on that with pi.dev+codex for HOURS, managed to use 50% of my Codex Pro subscription.
With Claude it's a constant battle of typing /usage after every iteration and trying to guess if it's enough for the next task or not =)
I used to use GLM mostly and had a Claude Pro subscription for occasional review and clean up.
Now I just use GLM.
I do think Claude Max is value for money. But it's more value than I personally need and I like Anthropic less and less.
The other criticism I see is "ask it what happened in 1989" but as my use case isn't writing a high school history essay I simply don't care. Nor do I believe one should seek those kinds of answers from any AI. (If you're curious it simply cuts off the reply).
I fully appreciate that YMMV and what sits right for others will not align with what's acceptable to me. Anthropic and OpenAI both are in my bad books as much as Z.ai. Pick your poison, as they say.
Still early days, but code is available, sort of works if you squint, and welcomes PRs: https://github.com/rcarmo/vibes/tree/go
Question to the sages: should that submission get flagged because of that?
Which I would not even try and test though if Anthropic did not ban my account. The shadiest thing I did was to use it with opencode for a while I think. Never installed claw or used CC tokens somewhere else.
This is a weird company doing weird shit.
My main goal is to maximize my subscription token usage while trying to comply with the rules, but it's not clear where the line is for automation so I feel like I need to be clever.
- regular development (most token use): all interactive claude mode, standard use case
- automated background development: experimenting with claude routines (first-class feature, on subscription)
- personal non-nanoclaw claude automations (claude -p): uses subscription token, but only called as needed (generally just to fix something if something in my homelab infra goes down; it's set up to not fire on an exact cron time)
- other LLM based automations: usually openrouter API key, cheap models as needed
- nanoclaw: all API key based, but since it's expensive I keep usage mostly minimal and try to defer anything heavyweight to one of the other automation strategies (nanoclaw mainly just connects my homelab infra with telegram)
I am specifically talking about switching because of the harness, not model quality. Anyone else match my experience?
Now with Opus 4.7 of course the “burden” of adjusting reasoning effort has been taken away from you even at the API level.
In my experience people don’t change the thinking level at all.
Codex is abysmal for UI design imo.
But if you go information architecture first and have that codified in some way (espescially if you already have the templates), then you can nudge any agent to go straight into CSS and it will produce something reasonable.
pi has been the better harness out of all the ones i tried, first and third party.
Ever since the Anthropic block i've just canceled all my claude subs. Used to be codex was a bit worse, now they're practically equal. Claude is slightly better at directing other agents but the difference is too minor and not worth the money.
Claude usage limits / costs are absurd.
Any 'principles' people praise anthropic for are not that relevant to me anyways because i'm not a US citizen.
I still have their subscription, but am using pi now, mainly because something happened that made my opencode sessions unusable (cannot continue them, just blanks out, I assume something in the sqlite is fucked), and I cannot be bothered to debug it.
For what I use the agents for, the Chinese models are enough.
Plus I like being able to switch a model.
Had to stop because they don't like us proxying requests anymore.
Anthropic models write much better code: it is easy to follow, reasonable, and very close to what I would have done if I had the time... OpenAI's, on the other hand, generate extremely complex solutions to the simplest problems.
I was so disappointed by non-Anthropic models, that for a couple of weeks I only used Anthropic models, but based on this thread, I'll go back and give it another try. It's good to go back and try things again every couple of weeks.
Of course, I was annoyed that they lobotomized 4.6, the difference was day and night, and Anthropic is certainly not a company I trust. In my opinion, it shows their willingness to rugpull, so I'm looking at other approaches. Since 4.7, things went back to normal, things you'd expect to work just work.
Some negative signal for better overall view on things: I'm still with Anthropic and will probably stay with them for the foreseeable future.
I think after DoD/DoW shenanigans (which in and of itself felt like a reasonable take on the part of Anthropic) they got a bunch of visibility and new users, so them hitting some scaling limits, and some service disruption along with it, is pretty much inevitable. Couple this with the tokenizer changes and seeming decrease in model performance (adaptive thinking etc.), alongside increased downtime (doesn't matter that much for me, definitely does matter for anything time-sensitive), and lots of people will be rightfully pissed off.
At the same time, in practice I've only seen it do stupid things about 5 times across 8 million tokens (confusing user/assistant roles, not reading files that should be obvious for a given use case, and picking trivially wrong/stupid solutions when planning things), alongside another 4 times where tests/my ProjectLint tool caught something that I would have missed. The error rate is still arguably lower than mine, though I work in a very well known and represented domain (webdev with a bunch of DevOps and also some ML stuff, and integration with various APIs etc.).
At the same time, the 85 EUR they gave to me for free has been enough to weather the instability with regard to pricing changes and peak usage. They've fixed most of the issues I had with Claude Code (notably performance), and the sub-agent support is great and way better than OpenCode in my experience. They also keep shipping new features that are pretty nice, like Dispatch and Routines and Design; they seem useful and not like something completely misdirected. The Opus 4.7 model quality with high reasoning is actually pretty nice as well and works better than most of the other models I've tried (OpenAI ones are good, I just prefer Claude phrasing/language/approaches/the overall vibe, not even sure what I'd call it exactly, all the stuff in addition to the technical capabilities).
At the same time, if they mess too much with the 100 USD tier, I bet I could go to OpenAI or try out the GLM 5.1 subscription without too many issues. For now they're replacing all the other providers for me. Oh also I find the subscription vs API token-based payment approach annoying, but I guess that's how they make their money.
All these models and agents are shortcuts for all of us to be lazy and play games and watch YouTube or Netflix because we use them to work less. Well, the party will be over soon.
Last year I was excited about the constant forward progress on models but since February or so it's just been a mess and I want off this ride.
Either way I’m going to wait for “official” word from Anthropic, which I guess at this point will probably be a “Tell HN” or Reddit text post or a Xitter from some random employee’s personal account, because apparently that’s the state of corporate communication now.
I'm confused by the comments being full of people swearing off Claude, feels like real HN bubble stuff.
If I'm paying for compute, why should it matter whether I use Anthropic's harness (e.g., Claude Code) or a 3rd-party harness?
Well, enough people complained that Anthropic reversed their stance. Additionally, their primary competitor doesn't have any compute restrictions, which should help clarify why this decision was made.
As someone who has been building ML/AI tools (@ MS & Apple) for almost 25 years, I can say that much of the value of the underlying model comes from the harness. Why shouldn't I be able to use the exact same compute with my own bespoke harness when the compute cost is the same?
The Claude Code team continues to push out half-baked features that literally hamper my ability to use their tools.
If I'm paying $200/month for compute, I should be able to use it however I like.
With Claude Code they can predict what the traffic will look like; with a third-party harness they cannot.
Anthropic is constantly destroying goodwill and now seems to be in panic mode.
Contrast that to what GitHub did which was to pause new customers to ensure quality remained and things were stable.
hot damn
That erosion pushed me to try Codex. I signed up for their most expensive pro plan. Now I'm about to experiment with Kimi. I'm not saying they're better (well, sometimes they are). But here's the thing - what Anthropic did is they made me look. They made a loyal customer start shopping around. And I think that's the worst thing you can do.
Having said that - as an LLM provider for my product, we're staying with Claude. I still trust in their ethics. Please don't prove me wrong.
I have had some ideas for a custom harness (like embedding some tools OOTB and replacing slow tooling) but these policies throw me off. Instead I use local models.
Problem is API costs are insane. I have toyed with the idea of running a local model that works with Claude Sonnet or even Haiku, and I know this has been done by others.
Use something else.