The 185-Microsecond Type Hint

(blog.sturdystatistics.com)

54 points | by kianN 6 hours ago

3 comments

  • zahlman 1 hour ago
    > the compiler had enough static information to emit a single arraylength bytecode instruction.

    I'm skeptical.

    If it can prove that the input actually matches the hint, then why does it need the hint?

    If it can't, what happens at runtime if the input is something else?

    > We replaced a complex chain of method calls with one CPU instruction.

    JVM bytecodes and CPU instructions really shouldn't be conflated like that, although I assume this was just being a bit casual with the prose.

    • mkmccjr 1 hour ago
      Thank you for this! On the second point, you are absolutely correct; that was sloppy writing on my part. I will correct that in the post.

      I'm not certain I understand your first point. When I add the type hint, it's me asserting the type, not the compiler proving anything. If the value at runtime isn't actually a byte array, I would expect a ClassCastException.

      But I am new to Clojure, and I may well be mistaken about what the compiler is doing.

      • zahlman 13 minutes ago
        I mean, I think that probably is what happens. But then, while it's sped up a lot, the generated bytecode presumably also includes instructions to try the cast and raise that exception when it fails.

        (And the possible "JIT hotpath optimization" could be something like, at the bottom level, branch-predicting that cast.)
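[Editor's note: in Java terms, the two code paths this subthread is debating look roughly like the sketch below. This is illustrative only; the bytecode Clojure actually emits differs in detail, but the shape is the same: the unhinted path goes through a reflective lookup, while the hinted path is a checkcast (which throws ClassCastException on a mismatch, as mkmccjr expects) followed by a single arraylength instruction.]

```java
import java.lang.reflect.Array;

public class HintSketch {
    // Without a type hint, Clojure roughly falls back to reflection:
    // a method call plus runtime type dispatch on every invocation.
    static int reflectiveLength(Object x) {
        return Array.getLength(x);
    }

    // With a ^bytes hint, it boils down to a cast plus a field-style read:
    static int hintedLength(Object x) {
        byte[] arr = (byte[]) x;  // compiles to checkcast; throws ClassCastException on mismatch
        return arr.length;        // compiles to a single arraylength bytecode
    }

    public static void main(String[] args) {
        System.out.println(reflectiveLength(new byte[5]));
        System.out.println(hintedLength(new byte[5]));
        try {
            hintedLength("not an array");
        } catch (ClassCastException e) {
            System.out.println("ClassCastException, as the thread predicts");
        }
    }
}
```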

  • EdNutting 6 hours ago
    Wild speculation: Could the extra speedup be due to some kind of JIT hotpath optimisation that the previous reflective non-inlinable call prevented, and which the new use of the single `arrayLength` bytecode enabled? E.g. in production maybe you're seeing the hotpath hit a JIT threshold for more aggressive inlining of the parent function, or loop unrolling, or similar, which might not be triggered in your test environment (and which is impossible when inlining is prevented)?
    • mkmccjr 4 hours ago
      Author of the blog post here. That explanation sounds very plausible to me!

      If the whole enclosing function became inlinable after the reflective call path disappeared, that could explain why the end-to-end speedup under load was even larger than the isolated microbench.

      I admit that I don't understand the JIT optimization deeply enough to say that confidently... as I mentioned in the blog post, I was quite flummoxed by the results. I’d genuinely love to learn more.

  • TacticalCoder 55 minutes ago
    > When a client asks for the time, it sends a random nonce. The server replies with a signed certificate containing both the nonce and a timestamp, proving the response happened after the request.

    Oh that's cool. Apparently one of the protocol's goals is to catch lying parties and to prove they were lying about the (rough) time.
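[Editor's note: the nonce-then-signed-timestamp exchange quoted above can be sketched as follows. An HMAC with a shared key stands in for the real public-key certificate signature such a protocol would use, and all names here are illustrative. The point of the construction: the server's signature covers the client's fresh nonce, so it could not have been produced before the request was made.]

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class TimeProofSketch {
    // Server side: sign (nonce || timestamp). A real protocol would use a
    // public-key signature; HmacSHA256 keeps this sketch self-contained.
    static byte[] signResponse(byte[] key, byte[] nonce, long timestamp) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        mac.update(nonce);
        mac.update(ByteBuffer.allocate(8).putLong(timestamp).array());
        return mac.doFinal();
    }

    // Client side: recompute and compare. A match proves the signature was
    // made after the client generated its nonce.
    static boolean verify(byte[] key, byte[] nonce, long timestamp, byte[] sig) throws Exception {
        return MessageDigest.isEqual(sig, signResponse(key, nonce, timestamp));
    }

    public static void main(String[] args) throws Exception {
        byte[] key = "shared-demo-key".getBytes();
        byte[] nonce = new byte[32];
        new SecureRandom().nextBytes(nonce);  // fresh random nonce per request

        long ts = System.currentTimeMillis();
        byte[] sig = signResponse(key, nonce, ts);

        // Genuine response: signature binds our nonce to the timestamp.
        System.out.println(verify(key, nonce, ts, sig));
        // Replayed response: fails against a different client's nonce.
        System.out.println(verify(key, new byte[32], ts, sig));
    }
}
```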