Knowledge Distillation of Black-Box Large Language Models (2024)

(arxiv.org)

98 points | by babelfish 11 hours ago

9 comments

phantompeace 44 minutes ago
Considering the very small difference between just SFT on the student model as compared to SFT + DPO on a proxy, doesn't it make sense to concentrate on ensuring the SFT dataset is perfect rather than worry about DPO etc.? And just train directly on the student model?
Alifatisk 10 hours ago
Why is this published again? Is this a reference to recent events?
- babelfish 9 hours ago
  I just saw some post about it on Threads and found it interesting so decided to share!
  - tough 5 hours ago
    My best guess is this is a reference to the recent accusations from Anthropic of Chinese labs "distilling" on their models
dmezzetti 9 hours ago
Well-Read Students Learn Better: On the Importance of Pre-training Compact Models
Related paper that's a good read: https://arxiv.org/abs/1908.08962
linolevan 10 hours ago
Can we note that this is a 2024 paper in the title?
TimXare 6 hours ago
[dead]
duendefm 11 hours ago
The Chinese are really going strong on destroying the American AI economy bubble. Honestly, despite the fact that I'm totally pro USA and anti China, I think we should help them crash the American AI bubble. They are controlling everything and we can't even buy a new computer nowadays while getting no benefit from this. I wish some influential programmers stimulated coders everywhere to skip Claude and Chatgpt subscriptions for Chinese ones, at scale. If we programmers united we could help this bubble burst, I'm sure.
- anax32 1 hour ago
  The US "product machine" is so strong. They really know how to do frictionless signup and vendor lock-in on the corporate side.
- nozzlegear 10 hours ago
  > skip Claude and Chatgpt subscriptions for Chinese ones, at scale. If we programmers united we could help this bubble burst, I'm sure.
  I'm doing my part!
- cynicalsecurity 10 hours ago
  [flagged]
  - anon373839 9 hours ago
    Dario, is that you? Is Anthropic’s next ploy to seek support via the culture wars?
  - girvo 6 hours ago
    Why would I care about Christian morals? In fact from what I can see of the US, you don’t have them either.
  - duendefm 10 hours ago
    Nvidia, Anthropic and OpenAI are controlling everything, and nothing is improving for everyone, quite the opposite. So I just hope they crash to the ground.
  - gmerc 10 hours ago
    lol Christian Morals. Epstein and his best buddy running the show tells you all about this
    - big-and-small 9 hours ago
      What Epstein and buddies were doing was very... Christian...
      The Virgin Mary was very young in those events, you know.
  - LNSY 10 hours ago
    "They don't have Christian morals" -- does that mean they don't commit genocide and fuck kids? Because that sounds like a point for them
modgate 2 hours ago
test comment from modgate