> Kolmogorov complexity (wikipedia.org) is: "… the length of a shortest computer program (in a predetermined programming language) that produces the object as output."
> Small programs are interesting. But I’m also interested in small programming languages. Generally speaking, the smaller the language, the less expressive it is.
It was the study of Kolmogorov complexity that motivated me to come up with the smallest and most expressive language for writing minimal size programs, and define the various flavors of Kolmogorov complexity in terms of it [1].
[1] https://gist.github.com/tromp/86b3184f852f65bfb814e3ab0987d8...
Hi, I like your work. Do you think there could be a universal combinator (like iota) and some binary encoding that would produce even smaller interpreters than BLC?
I read your paper, and you show that a simple encoding of the S and K combinators seems to result in longer interpreters. However..
What I don't fully understand is that the complexity measure seems to depend on the binary encoding. So in theory, you could have a primitive combinator such that the interpreter is really short to express, but the combinators for pair, true and false are rather long (when expressed in that primitive), thus hiding complexity in them.
Makes me wonder whether you should measure only the complexity of the interpreter combinator, or also the complexity of the "quoting" combinator, which would transform an already encoded combinator into a doubly encoded one, forcing you to represent the pair, true and false combinators as well.
> Do you think there could be a universal combinator (like iota) and some binary encoding that would produce even smaller interpreters than BLC?
No, I don't think so. An iota-based code (1 for application, 0 for iota) loses out badly to the SK-based code (1 for application, 00 for K, 01 for S). It would instead be better to pick a slightly larger basis, picking up some combinators like I, B, C, or W.
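To make that concrete, here is a rough sketch (my own, in Python) of the size comparison, assuming the usual iota translations K = ι(ι(ιι)) and S = ι(ι(ι(ιι))):

    # SK code: '1' = application, '00' = K, '01' = S.
    # Iota code: '1' = application, '0' = iota; K and S must be rebuilt from iota,
    # costing 7 and 9 bits respectively under the translations above.

    def sk_bits(term):
        if term in ('S', 'K'):
            return 2
        _, f, x = term                     # ('app', f, x)
        return 1 + sk_bits(f) + sk_bits(x)

    def iota_bits(term):
        if term == 'K':
            return 7                       # i(i(i i))
        if term == 'S':
            return 9                       # i(i(i(i i)))
        _, f, x = term
        return 1 + iota_bits(f) + iota_bits(x)

    identity = ('app', ('app', 'S', 'K'), 'K')     # I = S K K
    print(sk_bits(identity), iota_bits(identity))  # 8 25

Applications cost one bit in both codes, but S and K cost 9 and 7 bits in the iota code versus 2 bits each in the SK code, so even the identity combinator S K K comes out much longer.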
> So in theory, you could have a primitive combinator such that the interpreter is really short to express but the combinators for pair, true and false are rather long (when expressed in that primitive), thus hiding complexity in them.
Of course, we're not just optimizing for self-interpreter size, but for overall size across a large set of tasks, such as the important theorems of AIT, or such as the milestones in a busy beaver (surpassing Graham's Number / TREE(3) / Loader's Number [1]).
The terms you mention, namely pair, true and false, are not tasks, but the tiniest of data structures that you commonly use in programming tasks. Optimizing just for those is not as effective as optimizing for larger tasks that need to balance lots of data types and control structures.
[1] https://oeis.org/A333479
[1] https://rosettacode.org/wiki/Universal_Lambda_Machine
Fun read. I grew up with small everything by default, writing machine code/assembly on 8-bit machines.
One thing about small capable languages is that they're sort of proto-languages: they're used to create a specific flavour/dialect with programming abstractions, which is then used to implement a program. This is true of all languages, but especially the tiny ones like assembly language or Lisp. This is the reason I believe Lisps never got more popular: each project is its own dialect and not applicable elsewhere. It's why Clojure, a larger, more opinionated language with 'batteries included', is more useful elsewhere.
(Author of Antilisp, a Lisp for metaprogramming in non-Lisp languages)
Not as small a project, but writing a feature-complete implementation of a small language is a very good learning experience that I would recommend to everyone.
There are only so many language implementers, and many opportunities to innovate in language design become obvious when it's your turn to decide what to implement.
For example, I made macros and axioms first-class citizens of the language as special cases of the function type, because using an interpreter allowed me to decide how things should be evaluated at runtime.
I accidentally found a trivial way to bootstrap (quote) in my language.
I also found a way to make a much more general reader macro system by simply writing a modular value reader that can be extended by adding new elements to a list of parsers. This makes non-Lisp syntaxes like JSON much easier to implement than with the CL reader macro system, where reader macros are bound to some special parameters.
Note for context:
In my case, the Lisp interpreter is under 3 KLoC, and most of the abstractions are bootstrapped from macros and reflection primitives, or from primitives that are shadowed by higher-level builtins.
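As a rough sketch of the "list of parsers" idea (this is not Antilisp's actual reader, just an illustration): each parser takes the text and a position and either returns a value or declines, and extending the syntax means appending another parser.

    READERS = []  # each entry: function(text, pos) -> (value, new_pos) or None

    def read_value(text, pos=0):
        while pos < len(text) and text[pos].isspace():
            pos += 1
        for reader in READERS:
            result = reader(text, pos)
            if result is not None:
                return result
        raise SyntaxError("no reader matched at position %d" % pos)

    def read_int(text, pos):
        end = pos
        while end < len(text) and text[end].isdigit():
            end += 1
        return (int(text[pos:end]), end) if end > pos else None

    def read_list(text, pos):
        if pos >= len(text) or text[pos] != '(':
            return None
        items, pos = [], pos + 1
        while text[pos] != ')':
            value, pos = read_value(text, pos)
            items.append(value)
            while text[pos].isspace():
                pos += 1
        return items, pos + 1

    READERS.extend([read_int, read_list])

    # Extending the reader is just appending another parser:
    def read_string(text, pos):
        if pos >= len(text) or text[pos] != '"':
            return None
        end = text.index('"', pos + 1)
        return text[pos + 1:end], end + 1

    READERS.append(read_string)

    print(read_value('(1 2 "three" (4 5))')[0])   # [1, 2, 'three', [4, 5]]

Because the reader is just a list, a JSON-ish syntax becomes one more entry rather than a change to the core.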
Off topic, but this is the first I've heard of Antilisp and I love the idea - seems aimed at the sort of problems that I've definitely solved with a tangle of unmaintainable elisp before. Now I just need to forget about the whole thing before I talk myself into writing a Helm module for it... or an HCL one... or... NO.
Thanks!
Antilisp is still WIP, so I would not recommend spending time writing modules for complex languages yet.
I am (slowly) fixing things and writing more/better wrappers so that users don't have to dedicate their own time before deriving value from the language.
That's what made computer magazines so much fun in the 1980s.
Every month there would be small projects (fractals, colliding galaxies, small games) that would teach you a new subject and be small enough that you could type in.
Also the Beagle Bros software and books for the Apple // series. They even had a semi-regular "contest" where readers would submit "two-liners", which they'd then judge and distribute via floppy. That culture grew and cemented my love of the hobby; I loved it and miss it.
Haha yeah! When I was a kid, before we could afford a computer, I would type magazine programs out on a typewriter. I remember I had to keep starting over because I would make a mistake, but that was okay because the program was short enough that I was confident I could get through without any mistakes eventually.
Something I was looking for recently: is there a tiny, "just for fun" language in the ML line, or something similarly based on typed lambda calculus? Or something smaller, something more Lisp-like but curried?
Although they are implemented in OCaml, which feels like more magic than, say, a Forth or Lisp written in C. In C you can still "see" how it runs on the hardware.
I think Standard ML is actually the "usable mini ML"
These types of languages will always be bigger than Lisp/Forth/Tcl, because they require a lexer / parser / type checker / GC runtime
---
Although I would like to see someone do the smallest possible C version of an ML
Something like c4 - C in 4 Functions (https://github.com/rswier/c4) - which actually has a type checker! (which is hard to see on first reading)
Yes I’ve played with Gleam. A nice small language but bigger than I had in mind. That’s mainly to look under the covers, which in Gleam’s case involves other language platforms, so less interesting in this context.
I tried to quantify some of these smallness vs expressivity tradeoffs in visualization libraries here, with some interesting conclusions IMO: https://notascope.io
I really dislike the tiny lisp implementation mentioned in the article. Part of the beauty of a real "lisp in lisp" interpreter is that it is meta-circular -- that is, the interpreter is able to interpret itself.
Using pmatch and some other tricks makes the interpreter shorter but prevents meta-circular evaluation, in which case you might as well have just defined the lisp interpreter to be eval.
I find the notion of Kolmogorov complexity both interesting and potentially a bit misleading. The reason it is misleading is the extra implicit context that is not present in the artifact but is necessary for interpretation. Sure, it is possible to write a ray tracer on a business card, but you can't write the C compiler or runtime on a business card, so I don't agree that it is a business-card-sized concept when stripped to its essence.
Now, you could take the Guy Steele "Growing a Language" route and try to define a self-consistent program that defines every word that is used, but even then you will have to rely on some axioms that cannot be defined solely within the artifact. So then the question becomes: what is the size of the axioms?
In other words, Kolmogorov complexity does point to a real thing -- and I strongly agree that smaller generally corresponds to lower complexity given the same set of axioms -- but the deeper sense of complexity seems inherently unmeasurable to me.
Or, put yet another way, complexity can only be measured relationally.
When Kolmogorov complexity is expressed directly in terms of the pure untyped lambda calculus, the oldest and simplest model of computation (except perhaps for SK combinatory logic, which is less expressive bit-for-bit), you cannot really complain of hidden complexity in the axioms. Even numbers have to be built from scratch.
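For example, numbers come in as Church numerals built purely from abstraction and application; here is that construction sketched with Python lambdas (just an illustration, not the binary encoding itself):

    # Church numerals: numbers as plain lambda terms, no primitive arithmetic.
    zero = lambda f: lambda x: x
    succ = lambda n: lambda f: lambda x: f(n(f)(x))
    plus = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))
    to_int = lambda n: n(lambda k: k + 1)(0)   # only used to inspect the result

    three = succ(succ(succ(zero)))
    print(to_int(plus(three)(three)))          # 6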
Yeah, examples of 'small' interpreters (including 'small' self-interpreters) have always made me uneasy when they depend on a powerful runtime environment. E.g., in the limit, you can create a language where the 0-byte program is a self-interpreter, and all other programs are interpreted as Python or whatever. You can then point to "look how small the description is!" when it's really just sleight of hand. To put it as a hot take, bootstrapping is very overrated.
Of course, every human definition will similarly hide complexity to some extent, so there's no real way to win.
Really interesting reflection on small languages and the relationship between simplicity and expressiveness. Also interesting is how the theme is broadened by the beautiful, philosophical closing quotations about the habits of people who like to withdraw and build their own little world, one that does not exclude the real world, since for them the real world depends on their own little one.
FORTH, Lisp and Tcl are all languages that you can implement without understanding how parsers or code generators work. For most other languages you need the dragon book.
Really? The Lisp reader is much simpler than the parser of a language like C. You can get away with basically a tree-walking interpreter, though every time I write something like that I have to nail down details involving quote, eval, and all that.
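For what it's worth, here is a rough Python sketch (my own toy, not any of the interpreters mentioned) of where quote and eval fit in a tree-walking interpreter:

    def lisp_eval(expr, env):
        if isinstance(expr, str):                 # symbol -> variable lookup
            return env[expr]
        if not isinstance(expr, list):            # numbers etc. are self-evaluating
            return expr
        head = expr[0]
        if head == 'quote':                       # (quote x) returns x unevaluated
            return expr[1]
        if head == 'if':
            _, test, then, alt = expr
            return lisp_eval(then if lisp_eval(test, env) else alt, env)
        if head == 'lambda':                      # (lambda (params...) body)
            _, params, body = expr
            return lambda *args: lisp_eval(body, {**env, **dict(zip(params, args))})
        fn = lisp_eval(head, env)                 # application
        return fn(*[lisp_eval(a, env) for a in expr[1:]])

    env = {'+': lambda a, b: a + b}
    print(lisp_eval(['+', 1, ['quote', 2]], env))                    # 3
    print(lisp_eval([['lambda', ['x'], ['+', 'x', 'x']], 20], env))  # 40

Quote is the only form whose argument comes back unevaluated; everything else recurses.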
I really enjoyed the beginning of Graham's On Lisp but walked away disappointed from the end, in that he avoided any really difficult metaprogramming problems; most of the time when he's using macros it is for performance, and usually he has an example that uses plain function pointers. Except for the continuation example, you can work his examples in Python [1], including the query language and the mini-Prolog [2], but Python already has yield, async, await, etc.
If you like S-expressions you can go to town building out classes full of static methods in Java that let you write things like
    // build an expression tree with static factory methods, then evaluate it
    Variable<Integer> v = variable("v");
    Expression<Integer> x = add(literal(75), v);
    v.set(110);
    eval(x);   // 185
Sure Lisp doesn't make you write all the functions and classes to represent that, but the Java version is typed and can make the autocomplete in your IDE sing [3]
Forth is something special. The idea that words have a compile-time and a run-time meaning is brilliant and lets you make a programming language without a lot of CS knowledge. I wrote one in high school for my TRS-80 Color Computer.
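A toy sketch of that compile-time/run-time split (in Python, nothing like a real Forth inner interpreter): ordinary words are compiled into a definition to run later, while immediate words like IF and THEN run during compilation and patch the code being built.

    stack = []
    words = {'.': lambda: print(stack.pop())}
    immediate = {}   # words that run at compile time

    def compile_def(tokens):
        code = []    # compiled definition: callables plus ['0branch', target] entries
        for tok in tokens:
            if tok in immediate:
                immediate[tok](code)              # compile-time meaning: runs NOW
            elif tok in words:
                code.append(words[tok])           # run-time meaning: runs LATER
            else:
                code.append(lambda n=int(tok): stack.append(n))
        def run():
            ip = 0
            while ip < len(code):
                step = code[ip]
                if isinstance(step, list):        # conditional branch
                    ip = step[1] if stack.pop() == 0 else ip + 1
                else:
                    step()
                    ip += 1
        return run

    # IF compiles a branch with an unknown target; THEN patches it in later.
    immediate['if'] = lambda code: code.append(['0branch', None])
    def then(code):
        hole = next(s for s in reversed(code) if isinstance(s, list) and s[1] is None)
        hole[1] = len(code)
    immediate['then'] = then

    # : maybe-print  if 42 . then ;
    maybe_print = compile_def('if 42 . then'.split())
    stack.append(1); maybe_print()    # prints 42
    stack.append(0); maybe_print()    # prints nothing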
I'd say though the metaprogramming techniques in On Lisp and other Lisp books are about the continental shelf but the dragon book is about the ocean.
[1] It's not quite the same as the mini-CLOS he works through, but I've developed so many metaobject facilities for Python using metaclasses, annotations, magic methods, etc.
[2] See the Jena rules engine for a really beautiful (would be pedagogical if it had a few more comments) implementation in Java of something Prolog-adjacent which is capable of both forwards and backwards chaining. The genius of Prolog is that it's simple to implement and with WAM faster than you would think possible even if it looks like it fell off a UFO. Try to write a lot of it though and the various methods for doing imperative programming in it look less clever than awkward, so the interesting developments are pure like Datalog.
[3] With some annoyance: thanks to type erasure, you can't overload a function to take an Expression<String> versus an Expression<Integer>; you have to unerase the types by putting them in the function names or something like that.
To save folks a search:
https://beagle.applearchives.com/
[1] https://gleam.run/
https://www.cs.kent.ac.uk/people/staff/dat/miranda/
I wrote a pure, lazy, functional language and self-hosted compiler based upon Miranda:
https://github.com/taolson/Admiran
http://t3x.org/mlite/index.html
wikipedia: https://en.wikipedia.org/wiki/META_II
paper: https://hcs64.com/files/pd1-3-schorre.pdf
> Hui can list the entire interpreter in the appendix because it’s just that single page, 122 lines of incredibly cryptic C, which you can view here:
I count 41 lines, but more interestingly and by pure luck, I just happened to make a post about this less than an hour ago:
https://blog.wilsonb.com/posts/2025-06-06-readable-code-is-u...
The frontend has a self-imposed line budget of 10 kLOC.
[1] http://cwerg.org
[2] https://github.com/robertmuth/Cwerg/blob/master/FE/Docs/tuto...
I got nerd sniped by a conversation with a colleague and am, in my spare time, working on the smallest event store database I can write.
It’s a nice way to take a break from the enterprise software world.
On Tcl: jimtcl is small, but it can be really powerful for building CLI/TUI tools. It has TLS support and, of course, SQLite3 support.
https://github.com/codr7/shi
That's for Scheme; under Common Lisp, loop is hell, and, well, in PAIP you implement a Scheme interpreter.
Heck, in Scheme you need to understand eval and apply and how they operate recursively.
More? Both Scheme and CL books have exercises on implementing a mini-Prolog.
On Forth: immediate words. You have a raw Forth core; implement do...loop and if/then/else. Have fun.