Self-Harness: Harnesses That Improve Themselves

(arxiv.org)

64 points | by jonnonz 2 days ago

5 comments

drdeca 20 minutes ago
I was surprised and somewhat disappointed that the article doesn’t appear to evaluate how well the models work when running in the harnesses optimized for the other models. Do they still do better than with the baseline harness? Does each model do worse with a harness optimized (by this process) for another model than it does with the harness optimized for itself?
behnamoh 2 hours ago
What else is new? Put it in Emacs and let the model improve the harness over time.
7e 1 hour ago
Pretty obvious stuff; see Terminator for the conclusion (Skynet). Or The Matrix. We really need more work on model alignment, trustworthiness, and control.
tlarkworthy 1 hour ago
[flagged]
mncharity 1 day ago
[dead]