> Kolmogorov complexity (wikipedia.org) is: "… the length of a shortest computer program (in a predetermined programming language) that produces the object as output."
> Small programs are interesting. But I’m also interested in small programming languages. Generally speaking, the smaller the language, the less expressive it is.
It was the study of Kolmogorov complexity that motivated me to come up with the smallest and most expressive language for writing minimal size programs, and define the various flavors of Kolmogorov complexity in terms of it [1].
[1] https://gist.github.com/tromp/86b3184f852f65bfb814e3ab0987d8...
Hi, I like your work. Do you think there could be a universal combinator (like iota) and some binary encoding that would produce even smaller interpreters than BLC?
I read your paper, and you show that a simple encoding of the S and K combinators seems to result in longer interpreters. However..
What I don't fully understand is that the complexity measure seems to depend on the binary encoding. So in theory, you could have a primitive combinator such that the interpreter is really short to express, but the combinators for pair, true and false are rather long (when expressed in that primitive), thus hiding complexity in them.
Makes me wonder whether you should measure only the complexity of the interpreter combinator, or also the complexity of the "quoting" combinator, which would transform an already encoded combinator into a doubly encoded one, forcing you to represent the pair, true and false combinators as well.
> Do you think there could be a universal combinator (like iota) and some binary encoding that would produce even smaller interpreters than BLC?
No, I don't think so. An iota-based code (1 for application, 0 for iota) loses out badly to the SK-based code (1 for application, 00 for K, 01 for S). It would instead be better to pick a slightly larger basis, picking up some combinators like I, B, C, or W.
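To make that concrete, here is a rough sketch (my own, in Python) of the size comparison, assuming the usual iota translations K = ι(ι(ιι)) and S = ι(ι(ι(ιι))):

    # SK code: '1' = application, '00' = K, '01' = S.
    # Iota code: '1' = application, '0' = iota; K and S must be rebuilt from iota,
    # costing 7 and 9 bits respectively under the translations above.

    def sk_bits(term):
        if term in ('S', 'K'):
            return 2
        _, f, x = term                     # ('app', f, x)
        return 1 + sk_bits(f) + sk_bits(x)

    def iota_bits(term):
        if term == 'K':
            return 7                       # i(i(i i))
        if term == 'S':
            return 9                       # i(i(i(i i)))
        _, f, x = term
        return 1 + iota_bits(f) + iota_bits(x)

    identity = ('app', ('app', 'S', 'K'), 'K')     # I = S K K
    print(sk_bits(identity), iota_bits(identity))  # 8 25

Applications cost one bit in both codes, but S and K cost 9 and 7 bits in the iota code versus 2 bits each in the SK code, so even the identity combinator S K K comes out much longer.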
> So in theory, you could have a primitive combinator such that the interpreter is really short to express but the combinators for pair, true and false are rather long (when expressed in that primitive), thus hiding complexity in them.
Of course, we're not just optimizing for self-interpreter size, but for overall size across a large set of tasks, such as the important theorems of AIT, or such as the milestones in a busy beaver (surpassing Graham's Number / TREE(3) / Loader's Number [1]).
The terms you mention, namely pair, true and false, are not tasks, but the tiniest of data structures that you commonly use in programming tasks. Optimizing just for those is not as effective as optimizing for larger tasks that need to balance lots of data types and control structures.
[1] https://oeis.org/A333479
[1] https://rosettacode.org/wiki/Universal_Lambda_Machine
Fun read. I grew up with small everything by default, writing machine code/assembly on 8-bit machines.
One thing about small capable languages is that they're sort of proto-languages: they're used to create a specific flavour/dialect with programming abstractions, which is then used to implement a program. This is true of all languages, but especially the tiny ones like assembly language or Lisp. This is the reason I believe Lisps never got more popular: each project is its own dialect and not applicable elsewhere. It's why Clojure, a larger, more opinionated language with 'batteries included', is more useful elsewhere.
(Author of Antilisp, a Lisp for metaprogramming in non-Lisp languages)
Not as small a project, but writing a feature-complete implementation of a small language is a very good learning experience that I would recommend to everyone.
There are only so many language implementers, and many opportunities to innovate in language design become obvious when it's your turn to decide what to implement.
For example, I made macros and axioms first-class citizens of the language as special cases of the function type, because using an interpreter allowed me to decide how things should be evaluated at runtime.
I accidentally found a trivial way to bootstrap (quote) in my language.
I also found a way to make a much more general reader macro system by simply writing a modular value reader that can be extended by adding new elements to a list of parsers. This makes non-Lisp syntaxes like JSON much easier to implement than with the CL reader macro system, where reader macros are bound to some special parameters.
Note for context:
In my case, the Lisp interpreter is under 3 KLoC, and most of the abstractions are bootstrapped from macros and reflection primitives, or from primitives that are shadowed by higher-level builtins.
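As a rough sketch of the "list of parsers" idea (this is not Antilisp's actual reader, just an illustration): each parser takes the text and a position and either returns a value or declines, and extending the syntax means appending another parser.

    READERS = []  # each entry: function(text, pos) -> (value, new_pos) or None

    def read_value(text, pos=0):
        while pos < len(text) and text[pos].isspace():
            pos += 1
        for reader in READERS:
            result = reader(text, pos)
            if result is not None:
                return result
        raise SyntaxError("no reader matched at position %d" % pos)

    def read_int(text, pos):
        end = pos
        while end < len(text) and text[end].isdigit():
            end += 1
        return (int(text[pos:end]), end) if end > pos else None

    def read_list(text, pos):
        if pos >= len(text) or text[pos] != '(':
            return None
        items, pos = [], pos + 1
        while text[pos] != ')':
            value, pos = read_value(text, pos)
            items.append(value)
            while text[pos].isspace():
                pos += 1
        return items, pos + 1

    READERS.extend([read_int, read_list])

    # Extending the reader is just appending another parser:
    def read_string(text, pos):
        if pos >= len(text) or text[pos] != '"':
            return None
        end = text.index('"', pos + 1)
        return text[pos + 1:end], end + 1

    READERS.append(read_string)

    print(read_value('(1 2 "three" (4 5))')[0])   # [1, 2, 'three', [4, 5]]

Because the reader is just a list, a JSON-ish syntax becomes one more entry rather than a change to the core.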
Off topic, but this is the first I've heard of Antilisp and I love the idea - seems aimed at the sort of problems that I've definitely solved with a tangle of unmaintainable elisp before. Now I just need to forget about the whole thing before I talk myself into writing a Helm module for it... or an HCL one... or... NO.
Thanks!
Antilisp is still WIP, so I would not recommend spending time writing modules for complex languages yet.
I am (slowly) fixing things and writing more/better wrappers so that users don't have to dedicate their own time before deriving value from the language.
That's what made computer magazines so much fun in the 1980s.
Every month there would be small projects (fractals, colliding galaxies, small games) that would teach you a new subject and be small enough that you could type in.
Also the Beagle Bros software and books for the Apple // series. They even had a semi-regular "contest" where readers would submit "two-liners", which they'd then judge and distribute via floppy. That culture grew and cemented my love of the hobby; I loved it and miss it.
Haha yeah! When I was a kid, before we could afford a computer, I would type magazine programs out on a typewriter. I remember I had to keep starting over because I would make a mistake, but that was okay because the program was short enough that I was confident I could get through without any mistakes eventually.
Something I was looking for recently: is there a tiny, "just for fun" language in the ML line, or something similarly based on typed lambda calculus? Or something smaller, something more Lisp-like but curried?
Although they are implemented in OCaml, which feels like more magic than, say, a Forth or Lisp written in C. In C you can still "see" how it runs on the hardware.
I think Standard ML is actually the "usable mini ML"
These types of languages will always be bigger than Lisp/Forth/Tcl, because they require a lexer / parser / type checker / GC runtime
---
Although I would like to see someone do the smallest possible C version of an ML
Something like c4 - C in 4 Functions (https://github.com/rswier/c4) - which actually has a type checker! (which is hard to see on first reading)
Yes I’ve played with Gleam. A nice small language but bigger than I had in mind. That’s mainly to look under the covers, which in Gleam’s case involves other language platforms, so less interesting in this context.
I tried to quantify some of these smallness vs expressivity tradeoffs in visualization libraries here, with some interesting conclusions IMO: https://notascope.io
I really dislike the tiny lisp implementation mentioned in the article. Part of the beauty of a real "lisp in lisp" interpreter is that it is meta-circular -- that is, the interpreter is able to interpret itself.
Using pmatch and some other tricks makes the interpreter shorter but prevents meta-circular evaluation, in which case you might as well have just defined the lisp interpreter to be eval.
I find the notion of Kolmogorov complexity both interesting and potentially a bit misleading. The reason it is misleading is the extra implicit context that is not present in the artifact but is necessary for interpretation. Sure, it is possible to write a ray tracer on a business card, but you can't write the C compiler or runtime on a business card, so I don't agree that it is a business-card-sized concept when stripped to its essence.
Now, you could take the Guy Steele "Growing a Language" route and try to define a self-consistent program that defines every word that is used, but even then you will have to rely on some axioms that cannot be defined solely within the artifact. So then the question becomes: what is the size of the axioms?
In other words, Kolmogorov complexity does point to a real thing -- and I strongly agree that smaller generally corresponds to lower complexity given the same set of axioms -- but the deeper sense of complexity seems inherently unmeasurable to me.
Or, put yet another way, complexity can only be measured relationally.
When Kolmogorov complexity is expressed directly in terms of the pure untyped lambda calculus, the oldest and simplest model of computation (except perhaps for SK combinatory logic, which is less expressive bit-for-bit), you cannot really complain of hidden complexity in the axioms. Even numbers have to be built from scratch.
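For example, numbers come in as Church numerals built purely from abstraction and application; here is that construction sketched with Python lambdas (just an illustration, not the binary encoding itself):

    # Church numerals: numbers as plain lambda terms, no primitive arithmetic.
    zero = lambda f: lambda x: x
    succ = lambda n: lambda f: lambda x: f(n(f)(x))
    plus = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))
    to_int = lambda n: n(lambda k: k + 1)(0)   # only used to inspect the result

    three = succ(succ(succ(zero)))
    print(to_int(plus(three)(three)))          # 6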
Yeah, examples of 'small' interpreters (including 'small' self-interpreters) have always made me uneasy when they depend on a powerful runtime environment. E.g., in the limit, you can create a language where the 0-byte program is a self-interpreter, and all other programs are interpreted as Python or whatever. You can then point to "look how small the description is!" when it's really just sleight of hand. To put it as a hot take, bootstrapping is very overrated.
Of course, every human definition will similarly hide complexity to some extent, so there's no real way to win.
Really interesting reflection on small languages and the relationship between simplicity and expressiveness. Also interesting is how the theme is broadened by the beautiful, philosophical closing quotations about the habits of people who like to withdraw and build their own little world, one that does not exclude the real world, since for them the real world depends on their own little one.
FORTH, Lisp and Tcl are all languages that you can implement without understanding how parsers or code generators work. For most other languages you need the dragon book.
Really? The Lisp reader is much simpler than the parser of a language like C. You can get away with basically a tree-walking interpreter, though every time I write something like that I have to nail down details involving quote, eval, and all that.
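For what it's worth, here is a rough Python sketch (my own toy, not any of the interpreters mentioned) of where quote and eval fit in a tree-walking interpreter:

    def lisp_eval(expr, env):
        if isinstance(expr, str):                 # symbol -> variable lookup
            return env[expr]
        if not isinstance(expr, list):            # numbers etc. are self-evaluating
            return expr
        head = expr[0]
        if head == 'quote':                       # (quote x) returns x unevaluated
            return expr[1]
        if head == 'if':
            _, test, then, alt = expr
            return lisp_eval(then if lisp_eval(test, env) else alt, env)
        if head == 'lambda':                      # (lambda (params...) body)
            _, params, body = expr
            return lambda *args: lisp_eval(body, {**env, **dict(zip(params, args))})
        fn = lisp_eval(head, env)                 # application
        return fn(*[lisp_eval(a, env) for a in expr[1:]])

    env = {'+': lambda a, b: a + b}
    print(lisp_eval(['+', 1, ['quote', 2]], env))                    # 3
    print(lisp_eval([['lambda', ['x'], ['+', 'x', 'x']], 20], env))  # 40

Quote is the only form whose argument comes back unevaluated; everything else recurses.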
I really enjoyed the beginning of Graham's On Lisp but walked away disappointed from the end, in that he avoided any really difficult metaprogramming problems; most of the time when he's using macros it is for performance, and usually he has an example that uses plain function pointers. Except for the continuation example, you can work his examples in Python [1], including the query language and the mini-Prolog [2], but Python already has yield, async, await, etc.
If you like S-expressions you can go to town building out classes full of static methods in Java that let you write things like
    // build an expression tree with static factory methods, then evaluate it
    Variable<Integer> v = variable("v");
    Expression<Integer> x = add(literal(75), v);
    v.set(110);
    eval(x);   // 185
Sure Lisp doesn't make you write all the functions and classes to represent that, but the Java version is typed and can make the autocomplete in your IDE sing [3]
Forth is something special. The idea that words have a compile-time and a run-time meaning is brilliant and lets you make a programming language without a lot of CS knowledge. I wrote one in high school for my TRS-80 Color Computer.
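A toy sketch of that compile-time/run-time split (in Python, nothing like a real Forth inner interpreter): ordinary words are compiled into a definition to run later, while immediate words like IF and THEN run during compilation and patch the code being built.

    stack = []
    words = {'.': lambda: print(stack.pop())}
    immediate = {}   # words that run at compile time

    def compile_def(tokens):
        code = []    # compiled definition: callables plus ['0branch', target] entries
        for tok in tokens:
            if tok in immediate:
                immediate[tok](code)              # compile-time meaning: runs NOW
            elif tok in words:
                code.append(words[tok])           # run-time meaning: runs LATER
            else:
                code.append(lambda n=int(tok): stack.append(n))
        def run():
            ip = 0
            while ip < len(code):
                step = code[ip]
                if isinstance(step, list):        # conditional branch
                    ip = step[1] if stack.pop() == 0 else ip + 1
                else:
                    step()
                    ip += 1
        return run

    # IF compiles a branch with an unknown target; THEN patches it in later.
    immediate['if'] = lambda code: code.append(['0branch', None])
    def then(code):
        hole = next(s for s in reversed(code) if isinstance(s, list) and s[1] is None)
        hole[1] = len(code)
    immediate['then'] = then

    # : maybe-print  if 42 . then ;
    maybe_print = compile_def('if 42 . then'.split())
    stack.append(1); maybe_print()    # prints 42
    stack.append(0); maybe_print()    # prints nothing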
I'd say though the metaprogramming techniques in On Lisp and other Lisp books are about the continental shelf but the dragon book is about the ocean.
[1] It's not quite the same as the mini-CLOS he works through, but I've developed so many metaobject facilities for Python using metaclasses, annotations, magic methods, etc.
[2] See the Jena rules engine for a really beautiful (would be pedagogical if it had a few more comments) implementation in Java of something Prolog-adjacent which is capable of both forwards and backwards chaining. The genius of Prolog is that it's simple to implement and with WAM faster than you would think possible even if it looks like it fell off a UFO. Try to write a lot of it though and the various methods for doing imperative programming in it look less clever than awkward, so the interesting developments are pure like Datalog.
[3] With some annoyance: thanks to type erasure, you can't overload a function to take an Expression<String> versus an Expression<Integer>; you have to unerase the types by putting them in the function names or something like that.
To save folks a search:
https://beagle.applearchives.com/
[1] https://gleam.run/
https://www.cs.kent.ac.uk/people/staff/dat/miranda/
I wrote a pure, lazy, functional language and self-hosted compiler based upon Miranda:
https://github.com/taolson/Admiran
http://t3x.org/mlite/index.html
wikipedia: https://en.wikipedia.org/wiki/META_II
paper: https://hcs64.com/files/pd1-3-schorre.pdf
> Hui can list the entire interpreter in the appendix because it’s just that single page, 122 lines of incredibly cryptic C, which you can view here:
I count 41 lines, but more interestingly and by pure luck, I just happened to make a post about this less than an hour ago:
https://blog.wilsonb.com/posts/2025-06-06-readable-code-is-u...
The frontend has a self-imposed line budget of 10 kLOC.
[1] http://cwerg.org
[2] https://github.com/robertmuth/Cwerg/blob/master/FE/Docs/tuto...
I got nerd sniped by a conversation with a colleague and am, in my spare time, working on the smallest event store database I can write.
It’s a nice way to take a break from the enterprise software world.
On Tcl: jimtcl is small, but it can be really powerful for building CLI/TUI tools. It has TLS support and, of course, SQLite3 support.
https://github.com/codr7/shi
That's for Scheme; under Common Lisp, loop is hell, and, well, in PAIP you implement a Scheme interpreter.
Heck, in Scheme you need to understand eval and apply and how they operate recursively.
More? Both Scheme and CL books have exercises on implementing a mini-Prolog.
On Forth: immediate words. You have a raw Forth core; implement do...loop and if/then/else. Have fun.