Clojure: Transducers

(clojure.org)

76 points | by tosh 2 days ago

9 comments

  • drob518 41 minutes ago
    Transducers work even better with a Clojure library called Injest. It has macros similar to the standard Clojure threading macros except Injest’s macros will recognize when you’re using transducers and automatically compose them correctly. You can even mix and match transducers and non-transducer functions and Injest will do its best to optimize the sequence of operations. And wait, there’s more! Injest has a parallelizing macro that will use transducers with the Clojure reducers library for simple and easy use of all your cores. Get it here: https://github.com/johnmn3/injest

    Note: I’m not the author of Injest, just a satisfied programmer.

  • adityaathalye 1 hour ago
    May I offer a little code riff slicing FizzBuzz using transducers, as one would do in practice, in real code (as in not a screening interview round).

    Demo One: Computation and Output format pulled apart

      (def natural-nums (rest (range)))
    
      (def fizz-buzz-xform
        (comp (map basic-buzz)
              (take 100))) ;; early termination
    
      (transduce fizz-buzz-xform ;; calculate each step
                 conj ;; and use this output method
                 []   ;; to pour output into this data structure
                 natural-nums)
    
      (transduce fizz-buzz-xform ;; calculate each step
                 str ;; and use this output method
                 ""  ;; to catenate output into this string
                 natural-nums) ;; given this input
    
      (defn suffix-comma  [s]  (str s ","))
    
      (transduce (comp fizz-buzz-xform
                       (map suffix-comma)) ;; calculate each step
                 str ;; and use this output method
                 ""  ;; to catenate output into this string
                 natural-nums) ;; given this input
    
    Demos two and three for your further entertainment are here: https://www.evalapply.org/posts/n-ways-to-fizzbuzz-in-clojur...

    (edit: fix formatting, and kill dangling paren)
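
    Demo One calls basic-buzz without defining it here. A minimal sketch to make the demo runnable, assuming basic-buzz is a plain number-to-FizzBuzz function (the definition below is a hypothetical stand-in, not necessarily the one from the linked post):

    ```clojure
    ;; Hypothetical stand-in for the basic-buzz that Demo One assumes.
    (defn basic-buzz [n]
      (cond
        (zero? (mod n 15)) "FizzBuzz"
        (zero? (mod n 3))  "Fizz"
        (zero? (mod n 5))  "Buzz"
        :else              n))

    ;; Same definitions as in Demo One.
    (def natural-nums (rest (range)))

    (def fizz-buzz-xform
      (comp (map basic-buzz)
            (take 100))) ;; early termination on an infinite input

    ;; Pour the first hundred results into a vector.
    (def first-hundred
      (transduce fizz-buzz-xform conj [] natural-nums))

    (subvec first-hundred 0 5) ;; => [1 2 "Fizz" 4 "Buzz"]
    ```

    Because (take 100) stops the reduction early, the infinite natural-nums is never fully realized.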

  • bjoli 1 hour ago
    I made SRFI-171 [0], transducers for Scheme. If you have any questions about them in general I can probably answer them. My version is pretty similar to the Clojure version, judging by the talks Rich Hickey gave on them.

    I know a lot of people find them confusing.

    0: https://srfi.schemers.org/srfi-171/srfi-171.html

  • talkingtab 21 minutes ago
    When I first read about transducers I was wowed. For example, if I want to walk all the files on my computer and find the duplicate photos in the whole file system, transducers provide a conveyor-belt approach. Whether there are savings in memory or anything else, maybe. But the big win for me was thinking about the problem as pipes instead of loops. And if you can add conditionals and branches, it is even easier to think about. At least I find it so.

    I tried to implement transducers in JavaScript using yield and generators and that worked. That was before async/await, but now you can just `await readdir("/")`. I'm unclear as to whether transducers offer significant advantages over async/await.

    [[Note: I have a personal grudge against Java and since Clojure requires Java I just find myself unable to go down that road]]

  • pjmlp 37 minutes ago
    Nowadays you can make use of some transducer ideas via gatherers in Java; however, it isn't as straightforward as in plain Clojure.
  • thih9 1 hour ago
    • adityaathalye 1 hour ago
      I'd reckon most of Clojure is from ten years ago. Excellent backward compatibility, you see :) cf. https://hopl4.sigplan.org/details/hopl-4-papers/9/A-History-...
    • whalesalad 45 minutes ago
      It's a blessing and a curse that zero innovation has occurred in the Clojure space since 2016. Pretty sure the only big things have been clojure.spec becoming more mainstream and the introduction of deps.edn to supplant lein, although I am still partial to lein.
  • eduction 1 hour ago
    The key insight behind transducers is that a ton of performance is lost not to bad algorithms or slow interpreters but to copying things around needlessly in memory, specifically through intermediate collections.

    While the mechanics of transducers are interesting the bottom line is they allow you to fuse functions and basic conditional logic together in such a way that you transform a collection exactly once instead of n times, meaning new allocation happens only once. Once you start using them you begin to see intermediate collections everywhere.

    Of course, in any language you can theoretically do everything in one hyperoptimized loop; transducers get you this loop without much of a compromise on keeping your program broken into simple, composable parts where intent is very clear. In fact your code ends up looking nearly identical (especially once you learn about eductions… cough).
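
    For the eductions aside, a small sketch of what clojure.core/eduction does, assuming a toy pipeline over numbers (the names xf, ed, total and as-vec are just for illustration):

    ```clojure
    ;; A composed transducer: keep odd numbers, then square them.
    (def xf (comp (filter odd?)
                  (map #(* % %))))

    ;; eduction pairs the transducer with its source. No work happens here
    ;; and no intermediate collection is built.
    (def ed (eduction xf (range 1 6)))

    ;; Each reduction runs the whole pipeline in a single pass over the source.
    (def total  (reduce + 0 ed)) ;; => 35
    (def as-vec (into [] ed))    ;; => [1 9 25]
    ```

    Note that an eduction re-runs the transformation every time it is reduced, so it suits one-shot consumption more than caching.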

    • fud101 52 minutes ago
      These sound wild in terms of promise but I never understood them in a practical way.
      • moomin 47 minutes ago
        They're not really that interesting. They're "reduce transformers". So, take a reduction operation, turn it into an object, define a way to convert one reduction operation into another and you're basically done. 99% of the time they're basically mapcat.

        The real thing to learn is how to express things in terms of reduce. Once you've understood that, just take a look at e.g. the map and filter transducers and it should be pretty obvious. But it doesn't work until you've grasped the fundamentals.
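
        To make the "reduce transformer" idea concrete, here is a rough sketch of map and filter transducers written by hand. The names my-map and my-filter are made up for illustration; the real clojure.core versions handle more cases (e.g. multiple input collections for map).

        ```clojure
        ;; A transducer takes a reducing function rf and returns a new one.
        (defn my-map [f]
          (fn [rf]
            (fn
              ([] (rf))                       ; init
              ([acc] (rf acc))                ; completion
              ([acc x] (rf acc (f x))))))     ; step: transform x, then pass it on

        (defn my-filter [pred]
          (fn [rf]
            (fn
              ([] (rf))
              ([acc] (rf acc))
              ([acc x] (if (pred x)
                         (rf acc x)           ; keep x
                         acc)))))             ; drop x, leave acc untouched

        ;; comp applies the steps left to right: filter first, then map.
        (def xf (comp (my-filter odd?) (my-map inc)))

        ;; (xf conj) is just another reducing function, so plain reduce works.
        (def by-hand  (reduce (xf conj) [] (range 10)))
        (def built-in (transduce (comp (filter odd?) (map inc)) conj [] (range 10)))

        by-hand ;; => [2 4 6 8 10]
        ```

        Once reduce with a step function of shape (acc, x) -> acc is familiar, each transducer is just a wrapper around that step.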

      • eduction 12 minutes ago
        The canonical example is rewriting a non-transducing set of collection transformations like

            (->> posts
                 (map with-user)
                 (filter authorized?)
                 (map with-friends)
                 (into []))
        
        That’s five collections (the input, three intermediate sequences, and the output vector). This is two, using transducers:

            (into []
                  (comp
                    (map with-user)
                    (filter authorized?)
                    (map with-friends))
                  posts)
        
        A transducer is returned by comp, and each item within comp is itself a transducer. You can see that the steps run in the same top-to-bottom order as with the thread-last macro (->>).

        map, for example, is called with one argument here, which means it returns a transducer. In the first example it also gets a second argument, the coll posts, so it returns a new lazy sequence over that coll instead.

        The composed transducer returned by comp is passed to into as the second of three arguments. In its three-argument form, into runs each item of coll (the third argument) through the transducer and adds the results to the first argument. In its two-argument form, as in the first example, it just puts coll into the first argument (also a coll).
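
        If you want to try both versions at the REPL, here is a sketch with hypothetical stand-ins for posts, with-user, authorized? and with-friends (the real ones would presumably hit a database or similar):

        ```clojure
        ;; Hypothetical sample data and helpers, only for illustration.
        (def posts [{:id 1 :author "ann"}
                    {:id 2 :author "bob"}
                    {:id 3 :author "cy"}])

        (defn with-user    [post] (assoc post :user (:author post)))
        (defn authorized?  [post] (not= "bob" (:user post)))
        (defn with-friends [post] (assoc post :friends []))

        ;; Thread-last version: each step returns a lazy sequence.
        (def threaded
          (->> posts
               (map with-user)
               (filter authorized?)
               (map with-friends)
               (into [])))

        ;; Transducer version: one pass, no intermediate sequences.
        (def xform
          (comp (map with-user)
                (filter authorized?)
                (map with-friends)))

        (def transduced (into [] xform posts))

        (= threaded transduced) ;; => true
        (mapv :id transduced)   ;; => [1 3]
        ```

        The same xform can be reused with sequence, transduce or eduction without rewriting the steps.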

  • mannycalavera42 2 hours ago
    transducers and async flow are :chefkiss
  • faraway9911 1 hour ago
    [dead]