All remote AI is a massive security risk for individuals, companies, and governments that may be targeted by the US government.
It is likely that the US will get a live feed from each AI provider and inspect it in real time to identify things of interest: terrorist attacks, foreign government planning, or even foreign companies that compete with key US companies.
It will give them access to the thought process in those companies as well as much of their text-based IP (source code, docs, meeting transcripts, etc.).
Also, if you are using local AI that you didn’t train yourself, you can never be sure it doesn’t have purposeful biases in its reasoning that may disadvantage you, such as directing you away from certain plans, ideas, or patents.
1. LOL I've just downloaded literally the whole internet and copyrighted books and put them through a neural network. Now I have this whole knowledge in my LLM.
2. Hey? Are you using my NN for training your NN? You're a thief!
I got curious and asked my Chinese friends, and they gave me a Reddit link[1]. It looks like it's about location data collection, and they suggested that might be the reason for the issue.
The issue here is not whether Anthropic used Common Crawl; Alibaba also does that.
The issue is that by distilling Claude, Alibaba reuses the IP Anthropic used to train the model, which is more akin to classical Chinese reverse engineering methods and disrespect for IP.
Regardless of whether this specific claim is true, enterprises are becoming much more cautious about developer tools that can read large portions of proprietary codebases.
Not to mention they are capable of executing code and susceptible to injections, which practically makes them backdoors if you're not super careful about how you use the tooling.
Wasn't one of the big promises the AI labs made "uncopyrighting"? I.e., the ability to reconstruct large works, including source code, without actual access to the source code? Everything from movies to operating systems.
Becoming? We've moved entirely in the opposite direction.
When these tools first appeared the overwhelming conversation was about the risk of letting a remote tool siphon your code and intellectual property (where eventually they're going to add that to their training). Now everyone is using them, and that fear seems to have dissolved. Every corporation is sprinkled with Claude Code, Antigravity, Copilot, Codex, and so on. Even the long fear-mongered Chinese providers are being heavily used in many spaces.
In this case this is a PR battle between two firms, and not much more. And Alibaba isn't worried about the "proprietary code" (the truth is that there is incredibly little interest in most orgs' code), but that the tool is a backdoor, or at least that is the claim.
> there is incredibly little interest in most orgs' code
I think from a commercial perspective yes, but access to source code is very good for finding exploits which could be very valuable for governments. I could also see a future where companies are directly cyber-attacking competitors in hostile markets too...
Can't say they are wrong after the latest backdoor or, let's say, undocumented functionality that leaks some data, which was pushed to Claude Code a few days ago.
When I was in Hong Kong, ChatGPT and Gemini were disabled. Maybe this has changed, though. When I was in China, the corporate VPN (Zscaler) routed traffic through HK.
It's pretty much the same as when "installing programs on your computer" is called "sideloading". Deliberately deceptive, weaponized language to make it seem like a bad thing.
I can see why they want to stop it, but:
1. You have to pay for the "attack".
2. These AI companies trained on copyrighted content without permission from, or attribution to, anyone whose data was used for training.
Considering their massive distillation, if US companies stop publishing new models to the public, would China still be able to develop new open weight models?
I don't think China would struggle to scrape the internet for fresh data.
And they constantly publish state-of-the-art LLM research (see DS4 context compaction and cache tech).
They have very capable tech giants. So while not being able to distill Western models would probably have some impact, that impact is probably shrinking as time passes.
We might even see Western LLMs distilling Chinese models soon. If they aren't already to some extent.
China has most probably already achieved "escape velocity" on the software side. Now if they achieve parity, to some degree at least, on the hardware side with Nvidia it is very possible they'll overtake the US.
More than a year ago, when Anthropic and OpenAI started to hide the reasoning bits from the output, a lot of people here on HN predicted that Chinese models' days were numbered.
Fast forward to today, and models such as DeepSeek and MiMo are nothing short of excellent. I haven't used GLM or Qwen but heard very good things about them as well.
This "massive distillation" sounds a lot like anxiety about how companies from outside the US can develop very good models themselves.
[1]https://www.reddit.com/r/ClaudeAI/comments/1ujila1/anthropic...
The timezone fetch was to alter program behaviour at runtime, not to send arbitrary timezones for tracking reasons.
It was one way of detecting whether a Chinese user was running the program and then behaving differently.
Malware behaves this way. Stuxnet, for example, was wired to do nothing except propagate unless the environment met the right conditions.
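For anyone unsure what "alter program behaviour at runtime" based on timezone means in general, here is a minimal sketch of environment-gated branching. It is not taken from Claude Code or any real tool; `GATED_UTC_OFFSETS`, `utc_offset_hours`, and `choose_behaviour` are hypothetical names made up for illustration, and the offset value is an assumption. The point is only that the check happens locally and changes the code path, without sending the timezone anywhere.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical: UTC offsets (in hours) that select the alternate code path.
GATED_UTC_OFFSETS = {8}


def utc_offset_hours(now: datetime) -> float:
    """Return the UTC offset of a timezone-aware datetime, in hours."""
    offset = now.utcoffset()
    if offset is None:
        raise ValueError("expected a timezone-aware datetime")
    return offset.total_seconds() / 3600


def choose_behaviour(now: datetime) -> str:
    """Pick a code path from the local environment; nothing is transmitted."""
    if utc_offset_hours(now) in GATED_UTC_OFFSETS:
        return "alternate"
    return "default"


if __name__ == "__main__":
    # astimezone() with no argument attaches the system's local timezone.
    local_now = datetime.now().astimezone()
    print(choose_behaviour(local_now))
```

A UTC offset is a crude signal (many regions share the same offset), which is why this kind of gating tends to produce false positives.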
Until the first big incident, yes.
https://news.ycombinator.com/item?id=48759754
Workarounds aside, it says Claude Code, not Claude.
I.e., they are using the CLI, which can run any model. You can, for instance, run GLM with it.
Claude Code is neither and it is literally info stealing malware.
What's a "distillation attack"? How is it different from simply distillation?
Unlike the vast majority of people Anthropic stole from.