> Apple runs on Anthropic at this point. Anthropic is powering a lot of the stuff Apple is doing internally in terms of product development, a lot of their internal tools…They have custom versions of Claude running on their own servers internally.
--Mark Gurman, Bloomberg https://x.com/tbpn/status/2016911797656367199
Okay, but why is the Siri team sitting out transformers? I really wanna move past the "Dragon Naturally Speaking" experience with a bolted-on decision tree.
Who’s doing it better? I have yet to hear from a Google or Amazon user who has a transformatively better experience, and I think that’s why Apple hasn’t jumped yet: they have hundreds of millions of users with daily habits that they don’t want to lightly disturb.
Feeling #blessed that I apparently have the exact same Upper Midwest accent they must've trained Siri on, because I've literally never had an issue with dictation or being misheard. And I use it a lot!
(It misunderstands my wife from California all the time, though.)
> I think that’s why Apple hasn’t jumped yet: they have hundreds of millions of users with daily habits that they don’t want to lightly disturb.
I don't think that's part of their decision-making. Liquid Glass moved most things around for seemingly little more than novelty, and that's not the first time.
They have done this before: release something large early in anticipation of a major shift, and iron out issues before the shift happens. Liquid Glass started off a little janky, but they appear to have been ironing out the initial issues with each update.
From what I understand (which might be wrong), Liquid Glass was at least partially inspired by visionOS and "spatial computing". And I guess on that platform it might make sense for some use cases.
That doesn't change the fact that I can hardly read some of the user interface in Apple Music for example.
It's not that the idea is bad, but it's badly executed.
Really? None of my issues are fixed. The settings panel still has a massive gray empty chunk hanging off the bottom, which makes it look like a 13-year-old coded it...
Agreed. I vaguely remember another HN link that said Apple tried a competing-teams approach to building a better Siri, but it fell apart for internal-politics reasons?
WhisprFlow produces much better speech-to-text for long voice-dictated text messages than Apple's speech-to-text does. Whisper models in general seem to do a lot better than most built-into-the-OS/app models. Which is interesting, because there's nothing stopping them from just using Whisper models.
I love MacWhisper personally (https://goodsnooze.gumroad.com/l/macwhisper). Also, Gumroad is a fantastic app distribution platform for my personal values.
As far as the "decision tree" side goes... there's not much that can be done about that now. Agents still go too "off-the-rails" to be productionized out to the billions of smartphones of the world. I'm working on voice-controlled agentic-with-rails AI features for my HomeAssistant, because Alexa / Google Home suck. But that's a hobby project, and rogue AI actions only affect me, not billions of customers.
Right now Alexa+ and Gemini are objectively better.
The best is ChatGPT voice mode. It understands non English words and accents amazingly well, and even though the LLM model isn’t the full fledged one, I can have deep conversations with it for an hour without it missing a beat.
Speech-to-text should just work, yet I regularly have to manually edit the transcribed input; the more specialized the words, the more frequently. It completely disregards the context of the current input: dictating for Hacker News, for example, might involve specialized technical and IT vocabulary.
The iOS equivalent would be Shortcuts, which, while not as powerful as Tasker [0] depending on the context, is an official Apple feature that most apps support. Claude and ChatGPT both have various Shortcuts hooks, including voice conversation.
[0] https://tasker.joaoapps.com/
Alexa+ has been a massive downgrade for me. It's extremely laggy and constantly misunderstands me, whereas the old one never did. "Set a timer for 20 minutes" used to be instant and just work, I did this the other day and it took 10 seconds to respond and set a timer for 10 minutes.
Same here. I can see why LLM-driven voice assistants make sense to product people in the abstract, but introducing non-deterministic behavior into a device I primarily use to keep time and control lights is nothing but a regression.
Alexa+ is terrible compared to Alexa. It's so bad that I've dusted off my v1 echos cuz they're too old to run Alexa+. Complete shit show that is.
I do like Gemini better than Assistant, even though it's not quite there yet. But that's just a matter of time, because they actually designed it from the ground up to be a drop-in replacement for Assistant.
My preference, however, is for the voice-control UX I get with my Amazon Echo and "classic" Alexa, which I've been using for the past 10 years. I can best describe it as a "voice-driven command line", just like your OS's CLI shell, which makes its interactions predictable, even if it means I need to "know" what commands are valid in a given context. We all need predictability and reliability from our home-automation integrations.
...but computer interaction with an LLM / transformer-driven "AI agent" is anything but predictable. When Amazon opted everyone into Alexa+, I agreed to give it a go and see if it really made things better or not, and it did not. I opted out of Alexa+ and went back to something actually reliable.
Here's a question: I don't understand the gap between these LLM-powered voice agents and CLI coding agents, the latter of which are obviously useful and quite resourceful at getting something done when asked in plain English.
Seems like an agent given 20-30 tool calls like "read_sms" "matter_command", and "send_email" would be able to work out what to do for things like "set the house to 72° and text Laura that I did it."
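A minimal sketch of what I mean, assuming a generic OpenAI-style tool-calling convention; the tool names, schemas, and handlers below are all invented for illustration, and the real work (Matter, SMS) is stubbed out with prints:

```python
import json

# Tool schemas the model would be shown (hypothetical names and shapes).
TOOLS = [
    {"name": "matter_command",
     "description": "Send a command to a Matter smart-home device.",
     "parameters": {"type": "object",
                    "properties": {"device": {"type": "string"},
                                   "action": {"type": "string"},
                                   "value": {"type": "number"}},
                    "required": ["device", "action"]}},
    {"name": "send_sms",
     "description": "Text a contact from the user's phone.",
     "parameters": {"type": "object",
                    "properties": {"to": {"type": "string"},
                                   "body": {"type": "string"}},
                    "required": ["to", "body"]}},
]

# Local handlers; stubbed with prints instead of real integrations.
def matter_command(device, action, value=None):
    print(f"[matter] {device}: {action} {value}")
    return "ok"

def send_sms(to, body):
    print(f"[sms] -> {to}: {body}")
    return "ok"

HANDLERS = {"matter_command": matter_command, "send_sms": send_sms}

def dispatch(tool_call):
    """Route one model-proposed tool call to its local handler."""
    return HANDLERS[tool_call["name"]](**json.loads(tool_call["arguments"]))

# What a model might plausibly emit for:
# "set the house to 72 and text Laura that I did it"
for call in [
    {"name": "matter_command",
     "arguments": '{"device": "thermostat", "action": "set_temperature", "value": 72}'},
    {"name": "send_sms",
     "arguments": '{"to": "Laura", "body": "House is set to 72."}'},
]:
    dispatch(call)
```

Planning those two calls is well within what current models do in coding agents; the open question is whether you trust them with the tools at all.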
> Seems like an agent given 20-30 tool calls like "read_sms" "matter_command", and "send_email" would be able to work out what to do for things like "set the house to 72° and text Laura that I did it."
Incidentally, a major headline in the news this past week was about a coding agent that wiped its company's entire system, including backups, which the company's staffers were confident was utterly impossible (it didn't have any access to that system), and yet somehow it did [1]. (The TL;DR: the agent randomly came across an unprotected God-tier admin API key/token saved to a personal text file on a filesystem it had read access to.) If an agent can do that with only read-only access to a company's routine/everyday storage area, then there's no way I'm giving one the ability to deactivate my house's fire alarms and security cameras via Google Home/Matter/Thread/HomeKit/X10/OhFfsNotAnotherCloudBasedAutomationScheme.
[1] https://www.theregister.com/2026/04/27/cursoropus_agent_snuf...
If you are really worried about that, the agent already has that access, since it'll go find that key anyway.
The HN thread about that case was much more "why are you putting your prod keys in random text files?" and "the SOTA in prompt engineering is realizing that putting DON'T FUCKING DO THE BAD THING in the prompt just makes the agent more desperate to get stuff done".
Putting limits at the harness level would do just fine: one LLM call, one tool call per voice message.
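A sketch of what that harness-level limit could look like; `call_llm`, `execute`, and the allowlist are hypothetical stand-ins, with the model call stubbed by a canned response:

```python
ALLOWED_TOOLS = {"set_timer", "toggle_light", "get_weather"}
MAX_TOOL_CALLS = 1  # one tool call per voice message, enforced outside the model

def call_llm(utterance):
    """Hypothetical single model call returning proposed tool calls.
    Stubbed with a canned plan for illustration."""
    return [{"name": "set_timer", "arguments": {"minutes": 20}}]

def execute(call):
    # Stand-in for the real integration (timer, lights, weather).
    print(f"[exec] {call['name']}({call['arguments']})")
    return "Done."

def handle_voice_message(utterance):
    proposed = call_llm(utterance)  # exactly one LLM call, no agent loop
    if not proposed:
        return "No action taken."
    if len(proposed) > MAX_TOOL_CALLS:
        return "Sorry, one thing at a time."
    call = proposed[0]
    if call["name"] not in ALLOWED_TOOLS:
        return f"I'm not allowed to run {call['name']}."
    return execute(call)

print(handle_voice_message("set a timer for 20 minutes"))
```

The point is that the "one call, allowlisted tools" rule lives in ordinary deterministic code, so the model can't talk its way around it.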
Siri's one job I care about is doing exactly what I want while I'm driving. I need it to check my text messages, take dictation, start phone calls and deal with music. I don't need to have conversations with it, I need deterministic responses to known commands.
Whenever I see one of these comments, it's always from someone that tried it at the start and then gave up because of a bad experience. And many times there are more people commenting back that this was essentially the 1.0 version and that the current 2.0 version is much better. So as someone that uses none of these products (old voice assistants vs. ai ones) it's really hard to evaluate if any of these anecdotes mean anything.
You could have tried Alexa+ at the start, when it was shitty compared to plain Alexa, and maybe it's better now. But equally, none of the people who comment that it is "amazing" in its current iteration qualify their statements by comparing and contrasting the old version with the new one, which makes them seem either unqualified to say how much "better" it is than the old version, or, at worst, shills (paid or not). The most charitable read is that they are comparing (e.g.) day-one Alexa+ with the current Alexa+, without a comparison to the original Alexa.
... which is to say that it really feels like there are no clear conclusions that could be drawn from all of this.
I'm not an Alexa user myself, but I have watched my wife interact with it for around 5 years now.
The new Alexa, powered by an LLM, is objectively better than the previous Alexa in a few ways. This much was apparent from day one and has only gotten smoother.
1. It can reliably execute direct or vague-ish commands ("play X movie in app Y", "play X show") and can infer that X movie is only available in app Z, so it uses that.
> It can reliably execute direct or vague-ish commands ("play X movie in app Y", "play X show") and can infer that X movie is only available in app Z, so it uses that.
...how does that work, exactly? (or rather: what's the context here?); there's no possible way for an Alexa+-powered Amazon Echo to control my AppleTV or interface with VLC on my desktop.
No matter how good the LLM features are, I just want to turn my lights on and off and check the time. A perfect LLM could maybe perform on par with a simple deterministic command system for these tasks, but not better. All an LLM does is introduce the possibility that a command that worked fine yesterday will randomly not work.
Also, one of my first interactions with this Alexa+ thing was "how long is it until 8:45am", one of only a few commands I use it for (to work out how much sleep I'm getting), and it proceeded to ask me what the current time was… I immediately turned it off after that.
> All an LLM does is introduce the possibility that a command that worked fine yesterday will randomly not work
Aren't hallucinations part of GenAI? I would assume that "AI" voice recognition doesn't have that baked in, but I'm not working in either of those spaces so maybe I'm missing the details. So many things are being looped into the "AI" umbrella that would have just been called machine learning or pattern recognition a decade ago (e.g. "facial recognition" vs "AI" at a time when "AI" also means chatbots like ChatGPT).
> that tried it at the start and then gave up because of a bad experience
I've had enough bad experiences with products that never got better, or just got worse (Exhibit A: Windows 11). Like most primates, I am capable of learning, and I've learned that once a consumer product/service goes bad there's little hope of a turn-around. I accept that you're telling me that it's gotten better, but of the people I know IRL who also use an Echo, none of them have told me that Alexa+ is worth trying, let alone committing to.
Yes, it's on me for not giving Alexa+ a second chance, but I'm not willing to give Alexa+ a second chance because, as a technology product/service customer, I just don't feel respected by the industry I work for (...lol); if Amazon, Microsoft, Google, et al won't respect me, why should I venture outside my comfort-zone for... what benefit, exactly?
> I accept that you're telling me that it's gotten better,
I'm not telling you this. I'm basically saying that with Alexa/Alexa+, and with Google's Gemini vs Google Now(?), I've seen many posts like this, where someone complains about the AI version but then other posts come in and claim how much better it is. Even for things like Claude Code, you get people complaining about how many mistakes it makes, and then people coming in and saying it's because they are "doing it wrong". Either "Claude has improved by 10x in the last 6 months. It's so amazing! If you used it a year or so ago it doesn't even compare!" or "You aren't using the most expensive tier of Claude, which increases the context and thinking abilities that are hobbled in the cheaper versions!"
I never really see a comparison on the same level, and it sounds like people talking past each other, or some people having legitimate complaints and then others coming in to shill for a product.
I'm not in any way implying that "You should totally try this out now that they fixed everything" or anything of the sort. I even stated that I don't use any of these tools, and I was commenting as something more akin to an "outsider."
It's not the early 2000s, where just messing around and wasting time on this stuff was cool in itself. None of that wasted time turned into many long-term apps that stuck with me. Maybe a banking app and a trail-running app.
I ruined multiple dinners with timers that didn't work (with a time/labor cost).
I had to get out of bed in the freezing cold to turn the lights out. It's easy to hit the lights when I go to bed, but it's annoying to have the tool fail and to get back out of bed.
Music stuff didn't work well because I used YouTube Music, not Spotify.
Those were my 3 use cases for Google voice, and it failed them all often enough that I just stopped using it altogether. Who cares if it works today if in another month they just change something and break it again? They've shown it's not a tool to use for tool things; it's a 'gee wow' thing. I don't need to be impressed. I need not-burnt food.
I concur that the ChatGPT voice mode is excellent. I can't even think of anything to knock it for other than for whatever reason it never 'hears' my kids, but that's probably because it's not intended to be used in multi-participant chats?
But for one-on-one, it is a really outstanding experience, especially since they tamped down the way-over-the-top humanisms.
Strong disagree. The upgrade was a little bit rough at first (mostly because of slow response) but now it's a million times better than the old assistant. The old assistant basically just repeated "I don't know how to do that" over and over.
I have never had trouble setting timers with either.
The new one was 100% failure to do anything with timers for me. I never saw it work once. If I had ever gotten that to work at all, I may not have uninstalled it, and might have a different impression now. I cannot account for why our experiences are so different.
Oh man. I made the mistake of converting my Google Home devices to Gemini.
The first problem is that it's just slow. If I want it to turn off some light, it takes a long time before responding.
But yeah, the failure to do basic tasks. I have a routine that I used to have it run (controls several devices at once). Now:
10-20% of the time it runs it.
60% of the time it says it's running it but it doesn't do anything.
20-30% of the time it says it can't do it unless I opt in to invasive permissions. And when I opted into them, it still failed about a third of the time. So I opted out again.
I don't know if it's related to Gemini, but sometimes Android Auto tells me "I don't have permission to do that" while simultaneously doing the thing that is allegedly lacking permission. Sometimes I want to move off the grid.
Man, I hate touch screens. And I hate Android Auto. My previous car had an aftermarket Bluetooth system (radio, etc). It was way, way better than Android Auto or any entertainment system I've seen in any car.
Your experience is valid of course, but I never once have had the inclination to have a conversation with my phone. I'm not sure which of our experiences is more common.
It’s not a conversation like you’d have with a friend, it’s the type of interaction you’d have with a chatbot, just hands-free.
To give you an example, I was having coffee the other morning while unloading the dishwasher and asked the speaker if today was a good day to apply weed and feed on my lawn. This was not possible with the old assistant and was useful to me.
I hate it too... The old assistant is pretty smart; obviously it has some language processing, but not "AI", and it's very fast for things like "Set alarm for ...", "Remind me at X about Y", "Add calendar event on X at Y about Z", or "Navigate home".
And now if I want to use Gemini on my phone I have to replace Assistant. Nah, I'll keep Assistant, thanks, and just have a shortcut to load Gemini in the browser.
Except the browser experience is so fucking buggy, constant reloads needed...
My Android phone was so much better for voice-to-everything, whether it was transcribing my voice for text messages or doing lookups on the internet. Siri is just so bad.
Still love not having Google's paws all over my data, though, so not going back.
Claude... I switched my phone assistant to Claude, and it does everything that Google (used to) do, like set alarms and timers, but also does everything Claude can do.
The only thing I haven't been able to get it to do is read from my phone's local calendar. The Claude app can, but the voice assistant cannot (why? no idea). Perplexity has no issue doing it, so I actually use them for my rare need to do voice commands with my phone.
Actually, could you recommend one? The ones I've found all seem to want subscriptions. I'm okay paying a few dollars for a well done frontend, but an ongoing sub to run an open weights model locally is nuts...
Plus, if someone else does it better (or differently), I bet they've got a team and technology at a 90%-done state waiting to jump on it, pick it apart, and make it better. I don't think they're doing nothing.
It’s not “transformatively better” but it definitely involves fewer frustrations to interact with. That’s always been Apple’s main value proposition, you’re not getting the most cutting-edge stuff but you’re supposed to have something that “just works” not something that makes you go “GODDAMN IT!” when it inexplicably seems to fumble normal things.
So if you buy Apple products based on that value proposition it’s a big problem for Apple if they can’t seem to keep their brand-promise in this area.
Yesterday my Google Home Mini gave me the current temperature in Fahrenheit. I live in Canada and use a Pixel. Dumbest fucking AI going. May as well give it to me in coulombs per hectare.
Not sure "sitting out" is the right way to put it. They've been publicly trying to ship a next-gen Siri for years and haven't been able to get something good enough to release. The latest plan is to base it on Gemini so we should be seeing progress on that next month at WWDC.
The experience of using LLMs as digital assistants so far is not great. Gemini on Android sucks so bad it's hard to describe. It can't tell what its own capabilities are, it can't inspect the states of the apps it manipulates, it hallucinates constantly, and it needs more handholding than the crappy old decision tree to do the right thing. I much more often have to pull over to make sure Google Maps is doing the right thing than I ever used to before, because trusting the LLM to be "smarter" so often fails for me.
Because the competitor voice models sound good but are dumb under any scrutiny.
ChatGPT's voice model has a great user experience and seems like it is seamlessly integrated into the chat, but it's actually a far smaller and dumber model. @husk.irl on Instagram has videos displaying how dumb and undiscerning it is.
People were wowed by the magic at one point, but it's faded. Apple avoids those things, and the limitations haven't been solved.
I think they could never make it good enough at the right price.
You have to remember all of the AI companies are making cash bonfires. People aren't going to stop buying iPhones because Siri can only do what it does now.
If Apple focuses on hardware and skips the pay-for-inference bubble, they'll come out the other side with the best consumer hardware for local inference, hardware everybody already has, which is going to eat the whole industry's lunch.
Nvidia is going to have a hard time convincing people they need to buy $1000 LLM-inference hardware. Apple isn't going to have a hard time convincing people to buy the next generation of phone/tablet/laptop.
I think it's the same reason why macOS and iOS have degraded a lot in terms of UX over the past decade: Apple's focus shifted towards hardware independence.
The 2010s were marked by Intel's lazy product lineup: year after year it pumped out rehashes of older products, iterating on top of its 14nm lithography with increasingly minor improvements to the architecture, until AMD overcame them. In the process, Apple's partnership with Intel became a liability it had to solve, and the push to the unified ARM architecture was no small feat.
If you ask me I don't think it's justified to degrade the user experience for the sake of focusing on this. It's a trillion dollar company, and has been for a while. Sure it could have tackled both, but what do I know.
In any case I think it explains really well why Siri feels so abandoned.
I dunno, Apple has always had a pretty high level of hardware independence, and one could imagine that even if Intel had produced great chips for longer, the ARM architecture would have replaced it eventually. Certainly the timeline got shifted (and I'm glad for it), but I don't know if that really impacted Siri. If anything, it seems like Siri got pushed to the bottom of the pile, in favor of projects like the Apple Car and the Vision Pro OS on one side and the demand to increase services revenue on the other.
Also: before, Apple was dependent on Intel (whose "product" is an integrated chip design and the fab to make it). Now they're dependent on TSMC (whose "product" is a fab). I'm... not really sure they've reduced their dependence? If TSMC starts falling behind Intel (which doesn't seem likely, but what happened to Intel didn't seem likely two decades ago either), Apple will be stuck.
It's one of the biggest and wealthiest companies in the world, but your comment seems to imply they have to pick and choose what they pursue. They really don't, especially if it's hard- vs software.
> seems to imply they have to pick and choose what they pursue. They really don't, especially if it's hard- vs software.
Money can often just be one part of the equation.
To do things well you also need available and capable technical resources, suitable facilities, available and capable leadership and management (engaged at the right level in the business), and a clear vision of what you're trying to achieve and working towards.
Given how Apple appears to operate, I wonder if a strong desire for senior management control/oversight over major developments means they (artificially) limit how many concurrent large-scale things they can work on at any given time?
> It's a trillion dollar company, and has been for a while. Sure it could have tackled both, but what do I know.
I didn't imply it; it's explicit in my comment. It's what their actions show: their updates make their systems worse and worse, Tim Cook is out, and Siri is in shambles. It might have been something else, but I'm willing to give them the benefit of the doubt, because the alternative is just sheer stupidity.
They're valued at $4T, and they have hundreds of billions hoarded. They could run 50 billion-dollar startup projects and not feel it. Imagine a startup getting handed a billion dollars... and the vast knowledge that Apple already has access to.
There's no way they couldn't do a better Siri. For some reason, they just ... won't.
Heavily funded startups have terrible track records in reality. The only cases where it seems to have worked is when the money was used to undermine the market dynamics by nuking competition via severe underpricing.
Here is a clue: money does not make software better, and lots of money often results in worse software. "It makes no sense"? Actual experience begs to differ.
Classic homework assignment: The Mythical Man-Month and related essays.
Money is a means to an end, so that's true. Just because you have a screwdriver does not mean you will drive screws. You have to have someone who can use the screwdriver, knows righty-tighty, wants to drive the screws, etc. Stupidly throwing money at a problem can get you places, but the efficiency can also drop to near zero. The problem is we're talking about a quantity of money where you don't need to be highly efficient. Savants can do pioneering work in a cave with a box of scraps, but you don't have to strive for that kind of austere efficiency. Nobody is expecting that.
If Apple can't harness the potential of the currently overfilled labor pool, that indicates a systemic issue within Apple. The entire raison d'etre of management structures within a business is to increase efficiency of capital to drive productive forces. If they cannot do that, then that would indicate an extremely problematic competency crisis within Apple's management organ.
This kind of failure when you are a company with the valuation of a first world country's GDP should be raising alarm bells in any rational person's mind.
I only partly agree with this. The answer is maddeningly more complicated.
Some parts of their software stack, higher up than the kernel, are actually pretty great. There's a lot of really brilliant stuff in their system frameworks, and in SwiftUI, Cocoa, and UIKit. I've been using Linux at home recently, and I find myself missing some of it.
But, on the flip side, suddenly you hit maddening bugs, crashes, or terrible developer-experience papercuts. And, of course, there's the App Store, which is just evil. For my next app I'm just going to go notarization-only, and see how that goes...
The comment above is on to something. I find CarPlay to be much more valuable, and much more of a lock-in to the iPhone, than Siri. I do not think I could ever go back to using the infotainment systems that ship with cars. So it makes sense why they might prioritize it over Siri. And in the context of CarPlay, the simplicity of Siri is nice: I really only need it to execute a few simple commands like looking up directions, making calls, reading/sending texts, playing a podcast, etc.
I don't dispute that, but Apple made its business on the premise of being the best in the business in terms of UX. Note though that you can have great UX powered by mediocre software, so those aren't mutually exclusive.
Apple’s Mac software in the 90s had great UX and very underwhelming and old fashioned kernel software which they struggled to replace. Jobs knew this and did the work externally with NeXT.
They fast-follow, then market so aggressively, with just enough proprietary tweaks to trademark it, that people think Apple invented the technology.
People end up thinking Apple invented something because they tend to make the first usable version of something that could appeal to the general population.
...are we going to pretend smartphones didn't exist before iPhone was launched? I think I was on my 3rd one when the iPhone came out and even then it was a luxury toy for the rich, I didn't know anyone who actually had an iPhone for a good few years after they came out.
It seems to be Blu-ray vs HD-DVD again. Luckily for me, I made the right decision and got out of the shiny round disc business as that battle was raging all around me having been in the DVD programming business for 8 years or so. This battle of LLMs is interesting to watch from the sidelines as I have nothing to do with them. Not sure this will end with one LLM to rule them all while the others fade away. People can use the one they prefer and not really impact others.
>Not participating in the war is the only true way to win the war, nothing new.
Really not true both in real wars and in tech wars. There's no evidence to support this claim.
Android only exists as the dominant mobile platform because it went to full scale war with Apple when the iPhone launched. Those that didn't take part and came after the battle have like <1% market share and Apple and Google are printing money from the cut to their app stores.
Apple doesn't take part in the AI race because whichever AI wins the war in the end will have to be on their App Store to reach users, so Apple wins regardless, due to their App Store monopoly. AIs are no threat to their phone, laptop, and App Store business.
But Google can't afford not to take part in this race because AIs are a threat to their search and ads business.
Same with real wars: the US is the world superpower because it got involved in WW2 even though it didn't have to. Same with Russia and Ukraine: provided they don't wipe each other out scorched-earth, their militaries will be the most advanced on the planet in the modern drone warfare they invented, and after the war is over every other military on the planet will be paying them for their gear and expertise, which they already are.
I'm suspicious of that take from Mark Gurman. That's a lot of detail around pricing and "holding Apple over a barrel" as relates to the Siri deal that seems like a nice PR spin from Anthropic.
Anthropic probably couldn't give the uptime guarantees that Google can, right?
Apple is a pretty difficult company to deal with on a B2B basis.
If you have terms that conflict with theirs, they aren't very flexible. Anthropic can be similarly difficult, and their needs from a business perspective probably don't align with Siri. I would imagine that Google has a more flexible, long-term approach to absorbing some risk in a revenue-share arrangement than Anthropic, which generally wants cash.
Anthropic’s only purpose is to juice whatever KPIs are gonna increase their IPO market cap.
The last sentence doesn't make that much sense to me, though. An agreement with Apple to be the lead AI partner would likely juice the IPO a great deal. The financial details wouldn't matter much for the IPO, as the initial financial commitments are going to be small, but the halo effect would be real (I think it would be in the market, anyway).
I think Anthropic has real commitment to their way of doing things which can cause short term issues (and hurt the IPO). And they seem willing to keep those values rather than just making deals to pump the IPO. As you say Apple also sticks to their way of doing things even if it frustrates their partners.
I think not being the lead partner with Apple may well be good for Anthropic long term. But if all you cared about was the IPO just agreeing to Apple's terms likely would have been the best option.
These possible SpaceX, Anthropic, and OpenAI IPOs are so extreme that it is hard to make judgments about them; so maybe there are Anthropic IPO issues with an Apple agreement that I don't appreciate.
It depends on what Apple wants and what Google was willing to give. Google is in many ways the weakest player in the individual-user facing space.
It's a weird market, and these companies want global domination. TBH, I don't have the knowledge or context to understand how to think in that mode and what the real facts are.
I wouldn't put much stock in the deeply held principles of Anthropic (or Apple for that matter). That's an appeal to emotion. I love the product, but they're happy to randomly rug-pull the product and how it works, both in the publicly available products and other contexts. It's just another company.
You say that, but don’t you think at this point they actually believe some of the stuff they say about safety and the future of humanity? It’s tough in this day and age not to be overly cynical but they did draw a line in the sand at the DoD and that wasn’t for IPO numbers…
The US government in 2026 is openly and cravenly corrupt, and I don't believe anything at face value. The story about the targeting may be real and material, or backwards engineered to fit the reality. OpenAI is aligned with Larry Ellison and Oracle, and given the favor granted to them by the government, I'd look to that relationship first.
Yeah, that makes more sense to me than "Anthropic had them over the barrel". Which seemed quite odd given the relative cash positions and installed base of each firm.
Gurman might be the only leaker in tech who, so far, doesn’t seem to fuck around. Low miss rate, rarely exaggerates. Of course that could change, and he could always get insider info that is wrong.
It's trending in that direction. If you want genuine conversation with humans, it's best to start looking for small, private communities that have and enforce LLM policies that align with your desires. Public social media is universally trash, don't waste your time there. I think HN is still worth visiting for now, but it's getting harder to justify spending time here with the quantity of garbage-quality LLM articles and even many comments.
> HN is still worth visiting, but it's getting harder to justify spending time here
I feel the same. The quality of both submissions and discussions has considerably decreased. It is still the best general-purpose “aggregator” I know of, but it is not what it was. It is becoming more and more FotM hype and boring groupthink.
HN was great due to the breadth of unique, interesting, nerdy topics, most of which I would have never come across on my own; and the insightful thought-provoking commentary, often by insiders with unique insights and perspectives.
Now it is just the same LLM agentic coding harness hype cycle astroturfing 100x engineer 37k LoC/day BS I could get from Reddit or LinkedIn or Twitter or anywhere else.
The moderators are still doing a fantastic job though! I feel like that is the last big differentiator from just being orange Reddit.
I dunno, it's tough. I hesitate to say HN is "getting worse," even if I agree with that in my gut. I think that gives rose-tinted glasses and nostalgia-bait. Rather, I think the community is refocusing around something that I find uninteresting. If you find LLM output to be dull, as I do, it's less and less a place for you to be. I try to push the community in more interesting directions by upvoting articles with actual technical content, but yeah it's being drowned out by the ho-hum LLM output that I'm not interested in, and that means I want to be here less.
I could not agree more. I feel the exact same: there's just a ton of content here that might not necessarily be "worse", I just find it (LLMs) dreadfully boring and uninteresting. Lobste.rs seems to be nicer, so I lurk there a lot now, as I can't post.
I think it's a trend in the industry though. Engineering is known as a moneymaker and so a large part of the new generation is the kind of person that decades ago would have gone for finance as a profession.
Both the really old timey graybeard techies and the green haired alternative techie communities are reducing in numbers.
Since crypto, and later LLMs, it got to its current state: everyone is trying to promote their stuff, sometimes in covert ways. Again, when money gets into anything it ruins it; the same happened to YT and other sites.
These points might be fake, but they are far from being useless, and actually have monetary value.
There is a market for buying and selling "aged" Hacker News accounts (3 to 15 USD for ~500 points) and upvotes/downvotes.
By purchasing just ~300 karma points, founders can unlock an uplift of tens of thousands of dollars in visibility on the home page (clients and investors).
So the LLM comments are not here just for fun, they are clearly farming points.
Ironically, it also increases actual human engagement. This way, the day Y Combinator wants to announce something, they already have a bigger audience than if there were low engagement.
Like the shilling you mentioned, these bots can push downvotes and flag a competitor's service.
Essentially the same as on Reddit. If you have incentive, you have a market.
If you're going to make a claim that there is a market for aged HN accounts you need to back it up with sources/proof, otherwise, you're pulling nonsense out of your ass
being able to buy upvotes is not proof that there is a market for buying/selling aged hn accounts... which is what you claimed and what I asked for proof of.
Yuck indeed. I do find it offensive when someone uses AI in a conversational manner. It's one thing to use it to chuck up content on social media to attract eyeballs, but this is a forum intended for conversation.
Yes, and just because someone else has been dumping trash in the woods doesn't mean you should.
That said, the social media feeds are so trash filled that I avoid them; it's extremely depressing opening up an incognito youtube and seeing what Google thinks will monetize well for an average consumer.
We're getting to the point where we're going to have to consistently put in content that AI is banned from writing, just to prove that we're human.
I recently preordered Cory Doctorow's book dealing with this: The Reverse Centaur's Guide to Life After AI.
The title refers to most machinery being a "centaur," meaning a thinking human is carried by the machine doing the heavy lifting, while the goal of AI companies is to replace high value work with the opposite. They want to turn people into meat appendages that serve unthinking machines.
It's only a matter of time (if it hasn't happened already) before there are counter-LLMs or whatnot that convince free-rein LLM agents to go generate cryptocurrency for the attacker or run propaganda campaigns.
> Do people love being a hollow puppet for LLMs to fill in? Have people lost their identity?
The first question: the answer is yes; most people live their lives mindlessly, with or without LLMs (think of every idiot you knew 20 years ago throwing in punch lines from "Friends" to sound "funny"). To the second question: most people have a twisted view of identity. It is supposed to mean something identifying you uniquely, but to most people it means identifying you as a member of a large group (nationality, political view, religion, the major music genre you like). So now, when every proverbial Tom, Dick, and Harry uses LLMs to generate Confluence content with shiny emojis, what are the proverbial Emily or John to do? Of course, they will adopt this new identity; it's who people are now: shallow, hollow puppets for LLMs to fill in.
And to think of the irony: Mother Nature perfected this super-efficient, low-energy, and highly capable thinking machine that each and every one of us holds in their skull. It already put us on the Moon once, before we even had a semblance of a functioning computer! And we choose to throw it away, for fucking what? Verbal diarrhea and pain-inducing coloured walls of text?
All so some retarded antisocial VC-funded "AI founder" can call themselves a tech visionary?
And when called out, they’ll use some excuse like "oh, I use it to fix grammar or translation". No, it’s completely obvious they’re being that lazy. I’d rather read comments with mistakes than LLM slop.
No, I don't think people love that. I think it's the LLM companies and the bourgeois class of people who push and shove AI down everyone's throat, for more money and control, to puppetize people. I mean, it's been an active part of leadership history and much of what shaped our times today: people get comfortable, even self-grandiose, with their place in life, and to hold on to and further their power it's not hard for them to see others as below them and use their power and influence to do things that are otherwise harmful to others.
The loss of identity is, imo, this: it's people being given horrible, harmful options for their meaning, health, and wellbeing, and so we get a general sense of most people being lost. Lost in identity, as you asked, though I think it's more than that. In my initiatory work with men (being initiated, not initiating others), we learn that part of the breakdown for most people is being given harmful identity frameworks of dependency and reliance on others. In the initiatory process we learned an identity of service beyond ourselves, through deep embodiment, exercise, and practice, beyond just an intellectual grokking of it. Edit: this is what we used to have throughout human history, but today, as described in those works, most people have only what would be called pseudo-initiations (marriage, school graduations, children, and work changes), which do not meaningfully contribute to meaning, contribution, or purpose.
What most of us have today and what the AI companies want us to believe: We will give you the money to live (though of course, when you're truly dependent on others, and they see no purpose or value for you and even your entertainment value has gone, why would they keep you around?)
If I were a sociopath who didn’t care at all about the commons I’d be ruining by doing so, I suppose I’d find it intellectually interesting to set up a ClaudeyLemonZest and see how people react to various settings.
I wouldn't even think that CLAUDE.md would make it into source control, let alone into the product. I don't AI-code for a living, so I don't know what is considered best practice, but I would think that CLAUDE.md, AGENTS.md, REQUIREMENTS.md, MY_PLAN.md, THIS_STUFF.md, THAT_THING.md, all the instruction/feeder files that drive the AI, should not go into source control. Only the actual code that gets compiled.
I look at all those files the same way as IDE configuration cruft: workstation-specific configuration that shouldn't even go into source control. I would .gitignore all of those files. Is this not what is done in industry?
EDIT: Wow, thanks for all the replies. Very eye-opening to see what's happening outside of my hobby-experimentation with the technology. I was coming at it with the assumption that 1-2 out of 20 people on the team were using CLAUDE.md, so why have it in source control. But if all 20 people are using it, I can see the benefits. This reply chain has really opened my eyes, thank you HN.
I think it makes sense to include in source control, just as it’s pretty typical to include documentation (such as a readme file) in source control. CLAUDE.md is really just project documentation.
If you're committed to Anthropic at an organizational level, there's no point to have a 'standard' AGENTS.md with a CLAUDE.md layer on top. Just commit the CLAUDE.md.
I’ve always struggled with what should be in Claude.md that doesn’t belong in readme.md or a similar supporting file.
I tend to include a well documented justfile, so between the readme and that common commands are covered. If there’s a style guide it should be its own file, or summarized in the readme.
If Claude is making errors I tend to just update my global Claude file, but I haven’t updated it in 6 months — only to disable Claude signatures on generated commit messages.
That's like 10% of the reason why people would commit CLAUDE.md…
The number one reason is that you are on a 10-dev team, and it just doesn't make sense for everyone to waste their token budget creating separate instances of this file, which also requires ingesting the whole repo... That is 50, 60%.
The other bit is that you have a review pipeline hooked into CI/CD, and it is the easiest way to tell the bot how to review your code.
It is super valuable to have your agent files in version control, both because it is useful to be able to revert to previous state and have your AI know where you are, and because being able to freshly clone a repo and have your AI know everything is very helpful.
In my personal and professional experience, CLAUDE.md will be set up with workspace/project-specific info that any agent on anyone's computer needs to know (a made-up example follows this list):
* what the repo actually is ("this is a rust application that does XYZ", "this is a internal tooling platform")
* how it's structured so the agent knows where to look
* code and review standards
* rules ("don't automatically run formatters/linters", "don't touch dependencies")
IntelliJ's .idea/ folder has its own .gitignore and Copilot expects to find things committed under a .github/ folder.
I used to be a purist about IDE configurations, but if everyone isn't on the same page about formatting and stuff like that you see a lot of file churn as things move around.
I would have said the same thing about the .github/ folder, but I've had to add things to it to prevent Copilot from thinking bad patterns in existing code are actually good patterns that should be repeated.
It makes more sense when your communication between teammates is constrained to the repository, because your other communication channels are already saturated. They're meta concerns that really have nowhere to go outside the repository without getting lost.
.idea was designed to be added to source control. It doesn't have to be, but everyone on the team using the same project configuration has its advantages. Code style can be checked in too, reducing or preventing the churn you speak of.
Other GitHub metadata goes into the .github folder as well, and that is expected to be committed: Actions workflows, CODEOWNERS, pull-request and issue templates, etc.
I can see it happening. It's very easy to drag and drop a file into an Xcode project and when the dialog pops up asking if you want it to be added to the target app bundle you just hit OK, not realizing what you just did. I've done it before with a document file but caught it before I shipped by inspecting the app bundle output.
To be fair, most IDEs will usually try to commit their own workspace configurations to a git repo unless you tell them off with a .gitignore. They tend to also exclude themselves from gitignore presets for much the same reason.
VS Code is one notorious offender in that realm; it will try to commit settings.json even if your gitignore is set up to ignore all the other cruft.
In general, the question of what should go in the source folder is a bit of a mess. Source code, README, and LICENSE make enough sense, but what about files describing project governance, or CI configuration logic? Or files that make the forge you're using render the repository in a certain way (for example, bug-tracker templates)? Those are all cruft insofar as they have nothing to do with code, but it's generally agreed that you're supposed to commit them, maybe in a dot-folder if necessary.
I agree. The intent is sacred. This should be the default and CLIs should make use of the available history (while preserving inputs you need to preserve outputs too because context matters).
The idea of having to repeat something to your computer is ridiculous.
If your coworker needs to make a change, those md files can capture a lot of elements of design intent and known gotchas that are otherwise latent or implicit. That's kind of what comments are for, to say nothing of blindingly obvious design, but... if everyone else is using the same tool, sharing a tool-native file makes sense, in the same way that checking in an IDE workspace file can.
I personally don't have strong experience here, but I would treat them similarly to BUILD files and the like: probably in the root directory of a repo, but nowhere near the bin/ or build/ directories.
Also it looks like there's a compilation step to these files, which is interesting. The raw file was included, not the environment specific file.
Nah. That’s not how it looks once you start working with it. It's code-equivalent, for sure. You probably would not keep your plan files or the working chats, though.
Agent instruction files are code, though. And none of this is really workstation-specific, it is codebase-specific. Should each developer keep a nearly identical copy of CLAUDE.md? The instructions really aren't for a developer, they are for an LLM agent. In most cases (I'd imagine, anyway) the agentic instruction files must be in source control for them to even provide much value.
Anything that goes to production should have a 4-6+ eyes rule, at least one reviewer that can review the changes in isolation.
If tools or LLMs can help them with it then that's fine, but it should always be at least two humans involved, one making changes, one verifying, and if something like this happens, they're both culpable. Not that they should be blamed for it per se, but the process and their way of working should be reviewed.
I cringe whenever someone suggests just having an agent review because "it knows code better". An AI agent wouldn't catch a lot of things a human would flag. And before someone goes "you just need to prompt it better": that's a huge amount of work for large projects, and you're still essentially begging it to do what you want.
I have not encountered anything more soul-crushing in my entire career than having to spend hours going over LLM-generated slop that was vomited out by a contractor in Pakistan that doesn’t give a shit, only to have the review itself be fed in as a re-prompt, and get the same 2000-line ball of spaghetti back with even more issues, going back and forth until I just give up and approve it.
No, AI code review doesn’t help. Claude can’t even give me correct line numbers 80% of the time, literally just makes them up, and more than half of it is false positive BS anyway.
Yep I’ve had to approve bad code too due to timelines and now our codebase has so much tech debt it doesn’t even matter anymore. And worse, as new people work on the code the LLMs pick up the bad code and it’s been spiraling from there.
The problem is that humans inherently fill in data in what they process from the world.
Our brain is designed to fill in gaps; it's why memory is so blurry when it comes to recalling the facts of what we saw at a trial.
It's why you could swear you saw "x" in the production software you were about to push. But it really comes down to expectations, and those expectations help reduce cognitive load / increase cognitive efficiency (resource usage).
So as more and more people get used to using AI, you will see these mistakes occur more frequently. B/c it's how our brains work.
It's even worse than that: people no longer know how to check what they commit, and often they don't even know what to check for. Naive people are being placed into positions that they are completely ignorant about.
I’ve been wondering if vibe coding was responsible for the recent introduction of acoustic echo cancellation (AEC) bugs in FaceTime (muting and unmuting the microphone appears to temporarily fix it). Apple has always had excellent AEC in my experience, it’s sad to see them breaking a fundamental phone function.
Some people are living in a different universe. Every single tech company I know is pivoting their entire company to AI-based software development. It's in the performance evals, the wallet is fully open to spend tokens on experiments, and every practice and every process is open for re-evaluation. It's all gas, no brakes, everywhere. The conversation on the internet does not seem to realize this. Or is in denial.
I think OP's comment comes from the "Think Different" mysticism that used to be around Apple. You'd think that if there was one company on the planet not embracing slop, it would be Apple, and the realization that it's not the case can be a bummer.
"slop" and "vibe coding" are derogatory terms about the level of effort - e.g. little to no human review of quality or accuracy, or accountability/concern related to the output.
"What a computer is to me is it's the most remarkable tool that we've ever come up with, and it's the equivalent of a bicycle for our minds." — Steve Jobs.
I'm also not sure why you'd think that. Apple's been at the forefront of "AI" for years now, running models locally and optimizing their CPUs for local workloads, e.g. to identify people, places, and pets (much appreciated lmao), create slideshows, and subtly improve photos taken on the device.
Considering that Xcode supports using Claude directly, I'm not surprised, to a degree. I'm more surprised it was not blocked by whatever build tooling they use.
Whilst tempting, I think it is important not to read too much into this.
It is no secret that Apple has an enormous R&D budget.
It is no secret that Apple operates with hundreds of siloed teams in order to maintain individual domain expertise. The teams then come together in a collaborative manner to bring together the final products.
So yes, it is likely true that SOME teams use SOME LLM for SOME tasks. It is a viable argument from R&D and other perspectives. Apple is an enormous multinational company, it is unlikely they have zero-AI on-site.
What is guaranteed NOT to be the case is that Apple is somehow vibecoding company-wide. Old-school engineering is too important for Apple.
I'm sure journalists and Anthropic would love to have you believe otherwise, but I think we need to keep our feet on the ground here and accept the reality is more old-school.
After all, as others have pointed out already here... whilst the rest of Silicon Valley has been shoveling truckloads of cash at AI, Apple has been patiently sitting, watching the bandwagon trundle along the rails.
> It is no secret that Apple operates with hundreds of siloed teams in order to maintain individual domain expertise. The teams then come together in a collaborative manner to bring together the final products.
Having worked there this is a perfect description of the organization from my experience.
> So yes, it is likely true that SOME teams use SOME LLM for SOME tasks. It is a viable argument from R&D and other perspectives.
> What is almost guaranteed NOT to be the case is that Apple is somehow vibecoding company-wide.
Not really, almost all active software developers use AI nowadays.
The research surveyed 121,000 developers across 450+ companies. A striking 92.6% of them use an AI coding assistant at least once a month, and roughly 75% use one weekly.
It's weird to believe that large corporations should be ashamed to use AI.
It's a standard engineering practice at this point; otherwise it's like refusing autocomplete because autocomplete is not right 100% of the time.
You can include project/team-based md files in your repo and exclude env/system md files (e.g. from your home directory, which includes your personal coding instructions).
Why bother replying if your post gets buried under AI bots with Twitter Blue (or whatever it's called now) that just try to farm engagement for money? Revenue sharing is a big mistake for every platform, because it incentivizes engagement slop. Ordering by newest-first often gives you more human replies.
It’s not super secret, no. It’s just embarrassing that they don’t have instructions for the AI agents coding and pushing deployments to not push the CLAUDE.md files. It demonstrates that they haven’t fed their AI prompts through AI yet, because it would have added a clause for that.
Have you never used Claude? It regularly ignores directives, no matter how they're worded or how many times they're repeated. It's also hierarchical: org-wide rules would be in a higher-level directory than repo rules or component rules. This is obviously just a tiny snippet of the prompts.
Is it really a mistake? OpenAI's own agent SDK also has a Claude.md file. That's not an indication that OpenAI internally use Claude, rather, it's there because the SDK has multi-model support.
I don't think you need to even see any files to realize much of Apple's software is vibe-coded by now.
I had some issues where my monitor apparently saw a connection to my Mac Mini, but the Mac Mini displayed black; it had apparently somehow gotten out of sync with my monitor. Sleeping the display controller, then waking it, solved it.
I gathered a bunch of data, wanting to submit a report, since I've been an Apple Developer Program member for like two days now, and I wanna be a good c̶u̶s̶t̶o̶m̶e̶r̶ user, so I opened up Feedback Assistant.
It asks me for my email; I input it and press enter. A password input appears, but keyboard focus doesn't move there automatically. I know it's such a tiny nitpick, practically, but tiny shit like this makes it so obvious that not a single person actually tried this UX. 10-15 years ago, Apple would never release something that wasn't perfect, but now these UX rough edges are absolutely everywhere across the OS.
I ended up not logging in at all, wrote my fix into a tiny fix-display.swift file which I'll run when it happens instead.
Probably smart time to rent and not buy if they plan on buying in a downturn.
Liquid Glass was Apple’s logo change moment
Things that Sam Altman would prefer people not say lol
I do like Gemini better than Assistant, even though it's not quite there yet. But that's just a matter of time, because they actually designed it from the ground up to be a drop-in replacement for Assistant.
My preference, however, is for a voice-control UX just like I get with my Amazon Echo and "classic" Alexa, which I've been using for the past 10 years. I'd describe it as a "voice-driven command line", just like your OS's CLI shell, which makes its interactions predictable, even if it means I need to "know" what commands are valid in a given context. I need predictability and reliability when it comes to my home-automation integrations.
...but computer interaction with an LLM / transformer-driven "AI agent" is anything but predictable. When Amazon opted everyone into Alexa+ I agreed to give it a go and see if it really made things better or not - and it did not. I opted out of Alexa+ and went back to something actually reliable.
Seems like an agent given 20-30 tool calls like "read_sms" "matter_command", and "send_email" would be able to work out what to do for things like "set the house to 72° and text Laura that I did it."
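As a rough sketch of that shape (the tool names are from the sentence above; the argument format and dispatch are invented):

    // Tiny "toolbox" the model chooses from; names/args illustrative.
    enum Tool: String {
        case readSMS = "read_sms"
        case matterCommand = "matter_command"
        case sendEmail = "send_email"
    }

    struct ToolCall {
        let tool: String
        let arguments: [String: String]
    }

    // "set the house to 72° and text Laura that I did it" would come
    // back from the model as two ToolCalls, executed by a plain switch:
    func execute(_ call: ToolCall) {
        guard let tool = Tool(rawValue: call.tool) else {
            print("unknown tool, refusing:", call.tool)
            return
        }
        switch tool {
        case .matterCommand:
            print("matter:", call.arguments)   // e.g. ["thermostat": "72"]
        case .sendEmail:
            print("email:", call.arguments)
        case .readSMS:
            print("sms read:", call.arguments)
        }
    }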
Incidentally, a major headline in the news this past week was about a coding agent that wiped its company's entire system, including backups, which the company's staffers were confident was utterly impossible (it didn't have any access to that system), and yet somehow it did[1]. (The TL;DR is the agent randomly came across an unprotected God-tier admin API key/token saved in a personal text file on a filesystem it had read access to.) If an agent can do that with only read-only access to a company's routine/everyday storage area, then there's no way I'm giving one the ability to deactivate my house's fire alarms and security cameras via Google Home/Matter/Thread/HomeKit/X10/OhFfsNotAnotherCloudBasedAutomationScheme.
[1] https://www.theregister.com/2026/04/27/cursoropus_agent_snuf...
the HN thread about that case was much more "why are you putting your prod keys in random text files", plus the observation that the SOTA in prompt engineering is that putting "DON'T FUCKING DO THE BAD THING" in the prompt just makes the agent more desperate to get stuff done
putting limits at the harness level would do just fine. one LLM call, one tool call per voice message.
you don't have to give it a ton of turns
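a sketch of that cap, reusing the ToolCall/execute shapes from the sketch upthread, with the model call stubbed out:

    // Harness-level limit: one model call, at most one tool call per
    // voice message, no agent loop. Types are stubs; the guard is the point.
    struct ModelReply { let text: String; let toolCalls: [ToolCall] }

    func completeOnce(_ prompt: String) -> ModelReply {
        // stand-in for the real LLM call
        return ModelReply(text: "Done.", toolCalls: [])
    }

    func handleVoiceMessage(_ transcript: String) {
        let reply = completeOnce(transcript)
        if let call = reply.toolCalls.first {  // anything past the first is dropped
            execute(call)
        }
        print(reply.text)  // hand off to TTS in real life
    }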
You could have tried Alexa+ at the start, when it was shitty compared to plain Alexa, and maybe it's better now. But equally, none of the people who comment that it is "amazing" in its current iteration qualify their statements by comparing and contrasting the old version vs. the new version. That makes them seem either unqualified to say how much "better" it is than the old version, or at worst shills (paid or not). The most charitable take is that they are comparing, e.g., day-one Alexa+ vs. the current Alexa+ without a comparison to the original Alexa.
... which is to say that it really feels like there are no clear conclusions that could be drawn from all of this.
The new Alexa powered by an LLM is objectively better than the previous Alexa in a few ways. This much was apparent from day one and has only gotten smoother.
1. It can reliably execute direct or vague-ish commands "play X movie in app Y" or "play x show" and can infer X movie is only available in app Z so use that.
2. Speech recognition seems better (fewer instances of 5x round trips)
3. Conversational with multi-turn --- my wife can have a back and forth clarifying a topic.
4. Seems to understand intent a bit better. (user asked A so they are probably thinking about B)
Those may seem small but they were a tremendous source of annoyance for her -- and thus for me -- "Alexa is not listening, do something!"
...how does that work, exactly? (or rather: what's the context here?); there's no possible way for an Alexa+-powered Amazon Echo to control my AppleTV or interface with VLC on my desktop.
Also, one of my first interactions with this Alexa+ thing was “how long is it until 8:45am”, one of only a few commands I use it for to work out how much sleep I’m getting, and it proceeded to ask me what the current time was… I immediately turned it off after that
Aren't hallucinations part of GenAI? I would assume that "AI" voice recognition doesn't have that baked in, but I'm not working in either of those spaces so maybe I'm missing the details. So many things are being looped into the "AI" umbrella that would have just been called machine learning or pattern recognition a decade ago (e.g. "facial recognition" vs "AI" at a time when "AI" also means chatbots like ChatGPT).
I've had enough bad experiences with products that never got better, or just got worse (Exhibit A: Windows 11). Like most primates, I am capable of learning, and I've learned that once a consumer product/service goes bad there's little hope of a turn-around. I accept that you're telling me that it's gotten better, but of the people I know IRL who also use an Echo, none of them have told me that Alexa+ is worth trying, let alone committing to.
Yes, it's on me for not giving Alexa+ a second chance, but I'm not willing to give Alexa+ a second chance because, as a technology product/service customer, I just don't feel respected by the industry I work for (...lol); if Amazon, Microsoft, Google, et al won't respect me, why should I venture outside my comfort-zone for... what benefit, exactly?
I'm not telling you this. I'm basically saying that with Alexa/Alexa+ and with Google's Gemini vs Google Now(?) I've seen many posts like this, where someone complains about the AI version, but then other posts come in and claim how much better it is. Even for things like Claude Code you get people complaining about how many mistakes it makes, and then people coming in and saying that it's because they are "doing it wrong". Either "Claude has improved by 10x in the last 6 months. It's so amazing! If you used it a year or so ago it doesn't even compare!" or "You aren't using the most expensive tier of Claude, which increases the context and thinking abilities that are hobbled in the cheaper versions!"
I never really see a comparison on the same level and it sounds like people talking past each other or some people having legitimate complaints and then others coming in to shill for a product.
I'm not in any way implying that "You should totally try this out now that they fixed everything" or anything of the sort. I even stated that I don't use any of these tools, and I was commenting as something more akin to an "outsider."
I ruined multiple dinners with timers that didn't work (with a time/labor cost).
I had to get out of bed in the freezing cold to turn the lights out. It's easy to hit the lights when I go to bed, but it's annoying having the tool fail and having to get back out.
Music stuff didn't work well because I used Youtube Music not Spotify.
Those were my 3 use cases for Google voice, and it failed them all often enough that I just stopped using it altogether. Who cares if it works today if in another month they just change something and break it again? They've shown it's not a tool to use for tool things, it's a "gee wow" thing. I don't need to be impressed. I need food that isn't burnt.
But for one-on-one, it is a really outstanding experience. Especially since they tamped down the way over-the-top humanisms.
I have never had trouble setting timers with either.
It is much better today than 3 months ago.
The first problem is that it's just slow. If I want it to turn off some light, it takes a long time before responding.
But yeah, the failure to do basic tasks. I have a routine that I used to have it run (controls several devices at once). Now:
10-20% of the time it runs it.
60% of the time it says it's running it but it doesn't do anything.
20-30% of the time it says it can't do it unless I opt in to invasive permissions. And when I opted into them, it still failed about a third of the time. So I opted out again.
Man, I hate touch screens. And I hate Android Auto. My previous car had an aftermarket Bluetooth system (radio, etc). It was way, way better than Android Auto or any entertainment system I've seen in any car.
But timers and smart home actions are definitely less reliable and sometimes take absurdly long to respond (like 20-30 seconds p99).
To give you an example, I was having coffee the other morning while unloading the dishwasher and asked the speaker if today was a good day to apply weed and feed on my lawn. This was not possible with the old assistant and was useful to me.
And now if I want to use Gemini on my phone I have to replace Assistant. Nah, I'll keep Assistant, thanks, and just have a shortcut to load Gemini in the browser.
Except the browser experience is so fucking buggy, constant reloads needed...
Still love not having google's paws all over my data, though, so not going back.
Any of the Whisper-based apps on the App Store.
https://apps.apple.com/us/app/id6447090616
So if you buy Apple products based on that value proposition it’s a big problem for Apple if they can’t seem to keep their brand-promise in this area.
https://blog.google/company-news/inside-google/company-annou...
Be careful what you wish for.
ChatGPT's voice model has a great user experience and seems like it is seamlessly integrated into the chat, but it's actually a far smaller and dumber model. @husk.irl on Instagram has videos displaying how dumb and undiscerning it is.
People were wowed by the magic at one point, but it's faded. Apple avoids those things, and the limitations haven't been solved.
You have to remember all of the AI companies are making cash bonfires. People aren't going to stop buying iPhones because Siri can only do what it does now.
If Apple focuses on hardware and skips the pay-for-inference bubble they'll come out the other side with the best consumer hardware everybody already has for local inference which is going to eat the whole industry's lunch.
Nvidia is going to have a hard time convincing people they need to buy $1000 LLM inference hardware. Apple isn't going to have a hard time convincing people to buy the next generation of phone/tablet/laptop.
The 2010s was marked by Intel's lazy product lineup, year after year pumping rehashes of older products, iterating on top of their 14nm lithography with increasingly minor improvements on its architecture until AMD overcame them. In the process, Apple's partnership with Intel became a liability it had to solve, and a push for the unified ARM architecture was no small feat.
If you ask me I don't think it's justified to degrade the user experience for the sake of focusing on this. It's a trillion dollar company, and has been for a while. Sure it could have tackled both, but what do I know.
In any case I think it explains really well why Siri feels so abandoned.
Intel is already being evaluated to fab Apple's entry level chips, if they can meet performance, energy efficiency, and production targets.
It's the CPUs they have built for their own purposes, which is next-level hardware independence.
Money can often just be one part of the equation.
To do things well you also need: available & capable technical resources, suitable facilities, available & capable leadership and management (engaged at the right level in the business), and a clear vision of what you're trying to achieve/work towards.
Given how Apple appears to operate, I wonder if a strong desire for senior management control/oversight over major developments means they (artificially) limit how many concurrent large-scale things they can work on at any given time?
Maybe not, but that'd be my guess.
I didn't imply it; it's explicit in my comment. It's what their actions show. Their updates make their systems worse and worse, Tim Cook is out, and Siri is in shambles. It might have been something else, but I'm willing to give it the benefit of the doubt, because the alternative is just sheer stupidity.
There's no way they couldn't do a better Siri. For some reason, they just ... won't.
Classic homework assignment -- The Mythical Man-Month and related essays
If Apple can't harness the potential of the currently overfilled labor pool, that indicates a systemic issue within Apple. The entire raison d'etre of management structures within a business is to increase efficiency of capital to drive productive forces. If they cannot do that, then that would indicate an extremely problematic competency crisis within Apple's management organ.
This kind of failure when you are a company with the valuation of a first world country's GDP should be raising alarm bells in any rational person's mind.
They have a great kernel, drivers and low-level engineering, but the stack above that has a lot of questionable stuff.
Some parts of their software stack -- higher up than the kernel -- are actually pretty great. There's a lot of really brilliant stuff in their system frameworks, and in SwiftUI, Cocoa, and UIKit. I've been using Linux at home recently, and I find myself missing some of it.
But, on the flip side, suddenly you hit maddening bugs, crashes, or terrible developer-experience papercuts. And, of course, there's the App Store, which is just evil. For my next app I'm just going to go notarization-only, and see how that goes...
Apple Intelligence is a placeholder and a toe in the water.
Unless you're implying something else?
And in this particular war it's even worse: the "winner" will actually just be the "biggest loser", contrary to a traditional war.
Really not true both in real wars and in tech wars. There's no evidence to support this claim.
Android only exists as the dominant mobile platform because it went to full scale war with Apple when the iPhone launched. Those that didn't take part and came after the battle have like <1% market share and Apple and Google are printing money from the cut to their app stores.
Apple doesn't take part in the AI race because whichever AI wins the war in the end, they'll have to be on their Appstore to reach the users, so Apple wins regardless due to their Appstore monopoly. AIs are no threat to their phone, laptops and Appstore business.
But Google can't afford not to take part in this race because AIs are a threat to their search and ads business.
Same with real wars: the US is the world superpower because it got involved in WW2 even though it didn't have to. Same with Russia and Ukraine: provided they don't wipe each other out scorched-earth, their militaries will be the most advanced on the planet in the modern drone warfare they invented, and after the war is over every other military on the planet will be paying them for their gear and expertise, which they already are.
Anthropic probably couldn't give the uptime guarantees that Google can, right?
If you have terms that conflict with theirs, they aren’t very flexible. Anthropic can be similarly difficult, and their needs from a business perspective probably don’t align with Siri. I would imagine that Google has a more flexible/long term approach to absorbing some risk in a revenue share arrangement than anthropic who generally wants cash.
Anthropic's only purpose is to juice whatever KPIs are gonna increase their IPO market cap.
The last sentence doesn't make that much sense to me though. An agreement with Apple to be the lead AI partner would likely juice the IPO a great deal. The financial details wouldn't matter much for the IPO (as the initial financial commitments are going to be small but the halo effect would be real - I think it would in the market anyway).
I think Anthropic has real commitment to their way of doing things which can cause short term issues (and hurt the IPO). And they seem willing to keep those values rather than just making deals to pump the IPO. As you say Apple also sticks to their way of doing things even if it frustrates their partners.
I think not being the lead partner with Apple may well be good for Anthropic long term. But if all you cared about was the IPO just agreeing to Apple's terms likely would have been the best option.
These possible SpaceX, Anthropic and OpenAI IPOs are so extreme that it is hard to make judgements about them; so maybe there are Anthropic IPO issues with an Apple agreement that I don't appreciate.
It's a weird market and these companies want global domination. TBH, I don't have the knowledge or context to understand how to think in that mode and what the real facts are.
I wouldn't put much stock in the deeply held principles of Anthropic (or Apple for that matter). That's an appeal to emotion. I love the product, but they're happy to randomly rug-pull the product and how it works, both in the publicly available products and other contexts. It's just another company.
The US government in 2026 is openly and cravenly corrupt, and I don't believe anything at face value. The story about the targeting may be real and material, or backwards engineered to fit the reality. OpenAI is aligned with Larry Ellison and Oracle, and given the favor granted to them by the government, I'd look to that relationship first.
https://daringfireball.net/linked/2025/12/01/gurman-pooh-poo...
Obviously, _what_ someone chooses to leak can still benefit them, even if it's true. You can be selective about what information you share.
This is the important point.
Sending their internal code, documentation, secret tokens, etc. to Anthropic would be completely irresponsible.
But if they are running the models on their own servers, why not!
Yuck. A lot of those replies have LLM smells. Do people love being a hollow puppet for LLMs to fill in? Have people lost their identity?
I feel the same. The quality of both submissions and discussions has considerably decreased. It is still the best general-purpose "aggregator" I know of, but it is not what it was. It is becoming more and more FotM hype and boring groupthink.
HN was great due to the breadth of unique, interesting, nerdy topics, most of which I would have never come across on my own; and the insightful thought-provoking commentary, often by insiders with unique insights and perspectives.
Now it is just the same LLM agentic coding harness hype cycle astroturfing 100x engineer 37k LoC/day BS I could get from Reddit or LinkedIn or Twitter or anywhere else.
The moderators are still doing a fantastic job though! I feel like that is the last big differentiator from just being orange Reddit.
Both the really old timey graybeard techies and the green haired alternative techie communities are reducing in numbers.
I never felt the need before but things have changed.
My email is on my profile.
There is a market for buying and selling "aged" Hacker News accounts (3-15 USD for ~500 points) and upvotes/downvotes.
By purchasing just ~300 karma points, founders can unlock an uplift of tens of thousands of dollars in visibility on the home page (clients and investors).
So the LLM comments are not here just for fun, they are clearly farming points.
Ironically, it also increases actual human engagement. This way, the day Y Combinator wants to announce something, they already have more of an audience than if there was low engagement.
Like the shilling you mentioned, these bots can push downvotes and flag competitors' services.
Essentially the same as on Reddit. If you have incentive, you have a market.
I think I give out about 1 updoot a year. Good to know I've been starving them.
That said, the social media feeds are so trash filled that I avoid them; it's extremely depressing opening up an incognito youtube and seeing what Google thinks will monetize well for an average consumer.
arse
The title refers to most machinery being a "centaur," meaning a thinking human is carried by the machine doing the heavy lifting, while the goal of AI companies is to replace high value work with the opposite. They want to turn people into meat appendages that serve unthinking machines.
To the first question, the answer is yes - most people live their lives mindlessly, with or without LLMs (think of every idiot you knew 20 years ago throwing in punch lines from "Friends" to sound "funny").
To the second question - most people have a twisted view of identity. It is supposed to mean something identifying you uniquely, but to most people it means identifying you as a member of a large group (nationality/political view/religion/major music genre you like). So now, when every proverbial Dick, Tom and Harry uses LLMs to generate Confluence content with shiny emojis, what are the proverbial Emily or John to do? Of course, they will adopt this new identity - it's who people are now - shallow, hollow puppets for LLMs to fill in.
And to think of the irony - mother Nature perfected this super-efficient, low-energy and highly capable thinking machine, which each and every one of us holds in their skull. It already put us on the moon once, before we even had a semblance of a functioning computer! And we choose to throw it away, for fucking what? Verbal diarrhea and pain-inducing coloured walls of text?
All so some retarded antisocial VC-funded "AI founder" can call themselves a tech visionary?
(sorry couldn't resist)
The loss of identity is imo this: it's people being given horrible, harmful options for their meaning, health and wellbeing, and so we get a general sense of most people being lost. Lost in identity, as you asked, though I think it's more than that. In my initiatory work with men (being initiated, not initiating others) we learn that part of the breakdown for most people is being given harmful identity frameworks of dependency and reliance on others. In the initiatory process we learned an identity of service beyond ourselves through deep embodiment, exercise and practice, beyond just an intellectual grokking of it. edit: this is what we used to have through human history, but today, as is described in those works, most people have only what would be called pseudo-initiations (marriage, school graduations, children & work changes), which do not meaningfully contribute to meaning, contribution or purpose.
What most of us have today and what the AI companies want us to believe: We will give you the money to live (though of course, when you're truly dependent on others, and they see no purpose or value for you and even your entertainment value has gone, why would they keep you around?)
I look at all those files the same way as IDE configuration cruft--it's workstation-specific configuration that shouldn't even go into source control. I would .gitignore all of those files. Is this not what is done in industry?
EDIT: Wow, thanks for all the replies. Very eye-opening to see what's happening outside of my hobby-experimentation with the technology. I was coming at it with the assumption that 1-2 out of 20 people on the team were using CLAUDE.md, so why have it in source control. But if all 20 people are using it, I can see the benefits. This reply chain has really opened my eyes, thank you HN.
otherwise it's like leaving vim dotfiles in the repo or something
I tend to include a well documented justfile, so between the readme and that common commands are covered. If there’s a style guide it should be its own file, or summarized in the readme.
If Claude is making errors I tend to just update my global Claude file, but I haven’t updated it in 6 months — only to disable Claude signatures on generated commit messages.
It's critical that it's part of the source code.
They often describe:
- Overall architecture
- Repository layout
- Processes to use
- Things not to do: code styles to avoid, libraries to not use, etc.
While they’re primarily documenting these things for an agent, the information is similarly useful to a human.
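A minimal example of the shape (contents invented for illustration, not from any leaked file):

    # CLAUDE.md
    ## Architecture
    Monorepo: api/ is the Go service, web/ is the TypeScript client.
    ## Processes
    Run make test before committing; CI rejects unformatted code.
    ## Don't
    - Don't hand-roll HTTP clients; use the wrapper in lib/http.
    - Never edit generated files under gen/.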
The number one reason is, you are on a 10-dev team and it just doesn't make sense for everyone to waste their token budget creating separate instances of this file, which also requires ingesting the whole repo... That's 50-60% of it.
The other bit is that you have a review pipeline hooked into CI/CD, and it is the easiest way to tell the bot how to review your code.
I used to be a purist about IDE configurations, but if everyone isn't on the same page about formatting and stuff like that you see a lot of file churn as things move around.
I would have said the same thing about the .github/ folder, but I've had to add things to it to prevent Copilot from thinking bad patterns in existing code are actually good patterns that should be repeated.
It makes more sense when your communication between teammates is constrained to the repository, because your other communication channels are already saturated. They're meta concerns that really have nowhere to go outside the repository without getting lost.
IMO that is what automated static analysis jobs are for. Let me configure my IDE how I want.
VS Code is one notorious offender in that realm; it will try to commit settings.json even if your gitignores are set up to ignore all other cruft.
In general, the question of what should go in the source folder is a bit of a mess. Source code, README and License make enough sense, but what about files describing project governance or CI configuration logic? Or what about files that are used to make the forge you're using render the repository in a certain way (for example: bug tracker templates). Those are all cruft insofar that they have nothing to do with code, but it's generally agreed on that you're supposed to commit those, maybe in a dot-folder if necessary.
Version control everything (inputs)
The idea of having to repeat something to your computer is ridiculous.
Also it looks like there's a compilation step to these files, which is interesting. The raw file was included, not the environment specific file.
And tests, linter configuration, doc...
If tools or LLMs can help them with it then that's fine, but there should always be at least two humans involved: one making changes, one verifying. And if something like this happens, they're both culpable. Not that they should be blamed for it per se, but the process and their way of working should be reviewed.
No, AI code review doesn’t help. Claude can’t even give me correct line numbers 80% of the time, literally just makes them up, and more than half of it is false positive BS anyway.
Our brain is designed to fill in gaps; it's why memory is so blurry when it comes to recalling the facts of what we saw at a trial.
It's why you could swear you saw "x" in the production software you were about to push. But it really comes down to expectations - and those expectations help reduce cognitive load/increase cognitive efficiency (resource usage).
So as more and more people get used to using AI, you will see these mistakes occur more frequently. B/c it's how our brains work.
Like doing long division by hand instead of trusting a calculator.
I'm not sure why. It just doesn't feel very Apple-like
It is no secret that Apple has an enormous R&D budget.
It is no secret that Apple operates with hundreds of siloed teams in order to maintain individual domain expertise. The teams then come together in a collaborative manner to bring together the final products.
So yes, it is likely true that SOME teams use SOME LLM for SOME tasks. It is a viable argument from R&D and other perspectives. Apple is an enormous multinational company; it is unlikely they have zero AI on-site.
What is guaranteed NOT to be the case is that Apple is somehow vibecoding company-wide. Old-school engineering is too important for Apple.
I'm sure journalists and Anthropic would love to have you believe otherwise, but I think we need to keep our feet on the ground here and accept the reality is more old-school.
After all, as others have pointed out here already... whilst the rest of Silicon Valley has been shoveling truckloads of cash at AI, Apple have been patiently sitting, watching the bandwagon trundle along the rails.
Having worked there this is a perfect description of the organization from my experience.
> So yes, it is likely true that SOME teams use SOME LLM for SOME tasks. It is a viable argument from R&D and other perspectives.
> What is almost guaranteed NOT to be the case is that Apple is somehow vibecoding company-wide.
100% agree
You say this with such confidence. Do you have some inside source with enough access that you can be that certain?
You can include project/team based md files in your repo and exclude env/system md files (eg from you home directory, which includes your personal coding instructions).
So yeah.. nothingburger.
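Concretely, that split is often just a couple of .gitignore lines; exact filenames vary by tool, but with Claude Code it tends to look like this (the shared CLAUDE.md itself stays checked in):

    # personal / per-machine agent config, kept out of the repo
    CLAUDE.local.md
    .claude/settings.local.json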
Seems like at some point most of the actual humans just gave up on replying.