4 comments

  • kazinator 1 hour ago
    I feel that this work, and the other one it references, which proposes modeling first- and second-class values with types, are simply ignorant of the massive amount of research that has been done in this area in decades past.

    I'm actually astonished you can still publish anything on this stuff: papers that talk about upward funargs and show trivial Lisp lambda code, like it's 1973.

    The stack/heap dichotomy dictating second class vs first class is wrongheaded; we already have results which show that the distinction between stack and heap is artificial and phony.

    For instance, Richard Stallman's 1980 "Phantom Stacks" paper. The following two paragraphs are an excerpt:

    > The key to reducing the impact of a stack on the complexity of the system is to realize that a stack really is not a sort of object that lives in memory, but a way of allocating memory among other objects. It need not have any existence, except in the "mind" of some part of the system which is allocating memory, for some length of time. If the data which is stored in the stack looks like ordinary Lisp objects, then most of the Lisp system will be able to deal with it properly without having to be aware that there is a stack.

    > One consequence of making the stack not recognizable as a stack is that the garbage collector won't know it is a stack. It will compactify, reorder or reallocate it willy nilly. But the stack's contents, valid accessible Lisp objects, will be preserved in meaning by the garbage collection, just like all such Lisp objects. So no harm is done, as long as we realize that the stack isn't a stack any more. The area of memory which used to be a stack is now part of the heap.

    Chicken Scheme, famously, implements an idea from Henry Baker ("Cheney on the MTA") of bump-allocating all objects on the stack. All function calls create new frames, and so do function returns: function returns are actually compiled into calls to the continuation and are therefore new function calls with new frames on the stack. When the stack reaches a certain limit, it is rolled back, and all objects on it that are still reachable (including lambda environments) are moved to the heap at that moment.

    So according to the present paper, objects must be transparently switching from the second class to the first class --- and that is incompatible with the idea of treating them as different types.

    • xorvoid 46 minutes ago
      This makes a lot of sense. It makes me think of Go's approach of blurring the heap/stack distinction by treating it as an escape-analysis problem leading to an allocation choice: if a value provably doesn't escape, optimize it by using the stack; otherwise, fall back to the heap.

      The stack-vs-heap distinction is an old one that is deeply encoded in the semantics of C. It's not obvious that it's the right choice.

      It's worth pointing out, however, that you sometimes do want control. When you're coding for performance, it can be very important to control exactly where objects live (e.g. this must be on the stack with a certain layout). I feel it's sometimes underappreciated in modern PL design that low-level coding needs this kind of control.

      I think there exists a happy medium solution ultimately though.

    • skybrian 52 minutes ago
      Other languages (such as Go) try to keep objects on the stack rather than the heap to avoid putting pressure on the garbage collector, increasing performance. (This seems more important from a practical perspective than supporting continuations?)

      But Go doesn't distinguish them using the type system either. New versions of Go sometimes do better escape analysis, keeping more objects on the stack, and that doesn't break any programs.

  • stmw 58 minutes ago
    Some may argue that the real problem here is the unstated assumption of wanting to have a garbage collector in the first place.
    • Lt_Riza_Hawkeye 19 minutes ago
      Even if you use C++, and your lambdas allocate on the heap when created and deallocate when going out of scope, this paper could still help reduce the need for heap allocation for capturing lambdas in the first place, improving performance.
  • moring 1 hour ago
    Sorry for being ignorant of the basics, but how does the following work?

    > However, segments of code that are provably free of garbage collection have deterministic timing and can satisfy hard timing constraints, as they are certainly not interrupted by garbage collection.

    You are only ever "certainly not interrupted" if you turn off interrupts, which requires a high level of privileges. And not being interrupted still does not mean you have uncontended access to main memory or shared caches, which is a relevant factor for hard real-time. Nor do you have uncontended access to execution facilities (e.g. multipliers or floating-point ALUs), but at least for those you might be able to find an upper bound even for contended access.

    • wk_end 49 minutes ago
      Well, you're guaranteed not to be interrupted by garbage collection, which is the point here. Segments of code that are provably free of garbage collection can satisfy hard timing constraints, but that doesn't mean that they just will. It's a paper about garbage collection; addressing all the other things that need to be done to ensure compliance with hard real-time requirements is clearly outside the scope of the work.

      I'm speaking a little outside my wheelhouse here (experts, please jump in), but my understanding is that if you have actual hard real-time requirements, you very well might turn off interrupts, multithreading, etc. and work with uncontested access to CPU/memory resources during time-sensitive parts of your code, as needed.

    • kazinator 39 minutes ago
      It's a curious sentence and one of only two to mention timing, the other being the introduction's mention of the "unpredictable timing" of garbage collection.

      Obviously it is true in a single threaded system. If the one and only thread executes a basic block of instructions, none of which call into memory management, or call any functions which might do that, then it holds that it's not interrupted by garbage collection.

      If there are threads, and a stop-the-world discipline for GC, then obviously not.

      Note that garbage collection itself can have a switch, not involving disabling interrupts. Of course, you wouldn't want to insert that around every block that doesn't need GC; I mention it only to note that we don't need a heavy hammer like disabling interrupts in order to momentarily disable garbage collection. When GC is disabled, it is still possible to allocate objects, but no garbage collection passes are invoked to reclaim storage. This means that if the allocator exhausts the available heap, it cannot run a garbage collection pass and instead has to ask the system for more memory for a new heap.

      I've used a switch like this around calls to a Yacc-generated parser, due to the Yacc stack being invisible to the garbage collector; newly allocated objects could be prematurely reclaimed as they are being "laundered" through the Yacc stack before being nailed into the abstract syntax tree.