Early termination of transducers and reducing functions

Early termination of transducers and reducing functions


In the previous post about transducers, I did not discuss early termination of reducing functions and transducers. For the examples given in that post, early termination was irrelevant. It is, however, an important and tricky aspect of reducing functions and transducers.



Early termination of reducing functions

The following reducing function is a variant on the function first and the transducer (take 1), created specially for people with nostalgia for the 80s.

(defn highlander-rf
  "There can be only one"
  [_ input] (reduced input))
Enter fullscreen mode

Exit fullscreen mode

Although this function is largely useless in practice, it provides the most minimalist example of early termination. It takes an intermediate result and an input value, and returns this input value wrapped with a special object. Functions that use this reducing function will recognize this special object and know that they shouldn’t provide any more input to this function.

The function reduce is one of those functions that recognize the wrapper object:

(reduce highlander-rf nil [1 2 3 4]) ;; Evaluates to 1
Enter fullscreen mode

Exit fullscreen mode

The function reduce will call highlander-rf with nil and 1 as arguments, and then won’t call it any more.

This mechanism makes it possible to efficiently reduce large or even infinite collections. Once all relevant input has been processed, all irrelevant input that follows can be ignored.



Using ensure-reduced inside transducers

The following transducer has the same behavior as the reducing function above:

(defn highlander-tr
  "There can be only one"
  [rf]
  (fn
    ([] (rf))
    ([result] (rf result))
    ([result input] (ensure-reduced (rf result input)))))
Enter fullscreen mode

Exit fullscreen mode

The two-arity variant of this transducer takes a value as input and returns this value wrapped in the same special object again. It uses ensure-reduced to do this, instead of reduced. When implementing highlander-tr, it’s not known what reducing function rf it will be used with. It could be that some of these functions will return a result that is already wrapped. The function ensure-reduced will ensure that an already wrapped result is not wrapped again. If the return value of (rf result input) is already wrapped using reduced, the result will be returned as-is. If it is not wrapped, ensure-reduced will wrap it.

The following example demonstrates the behavior of highlander-tr:

(into []
      highlander-tr
      [1 2 3 4]) ;; Evaluates to [1]
Enter fullscreen mode

Exit fullscreen mode



Using reduced? inside transducers

The following reducer does a little more than the one we just saw:

(defn broken-stutter [rf]
  (fn
    ([] (rf))
    ([result] (rf result))
    ([result input]
     (rf (rf result input) input))))
Enter fullscreen mode

Exit fullscreen mode

This transducer will output each value it receives as input twice. As its name suggests, it is broken. Unfortunately, that is not that easy to notice.

Evaluating the following example leads to the result you’d expect:

(transduce
 broken-stutter
 conj
 []
 [1 2 3 4 5 6]) ;; Evaluates to [1 1 2 2 3 3 4 4 5 5 6 6]
Enter fullscreen mode

Exit fullscreen mode

When used together with the following reducing function that terminates early, however, the transducer will not always behave as expected:

(defn limited-conj [n]
  (fn 
    ([result] result)
    ([result input]
     (if (>= (count result) n)
       (reduced result)
       (conj result input)))))
Enter fullscreen mode

Exit fullscreen mode

The reducing function above behaves the same as conj, until the collection passed as intermediate result contains at least n values. Once that threshold has been reached, the intermediate result is returned as final result.

Combining the reducing function limited-conj with the transducer broken-stutter can bring the issue with broken-stutter to the surface. The following expression cannot be evaluated, for example:

(transduce
 broken-stutter
 (limited-conj 6)
 []
 [1 2 3 4 5 6])
Enter fullscreen mode

Exit fullscreen mode

The following expression can be evaluated, however, which highlights why dealing with early termination can be tricky:

(transduce
 broken-stutter
 (limited-conj 5)
 []
 [1 2 3 4 5 6]) ;; Evaluates to [1 1 2 2 3]
Enter fullscreen mode

Exit fullscreen mode

The problem with broken-stutter is that it may apply the reducing function rf to a result that is already wrapped using reduced. The following variant fixes that issue:

(defn stutter [rf]
  (fn
    ([] (rf))
    ([result] (rf result))
    ([result input]
     (let [intermediate (rf result input)]
       (if (reduced? intermediate)
         intermediate
         (rf intermediate input))))))
Enter fullscreen mode

Exit fullscreen mode

This variant checks whether the intermediate result is already reduced and does not apply rf if that is the case.

(transduce
 stutter
 (limited-conj 6)
 []
 [1 2 3 4 5 6]) ;; Evaluates to [1 1 2 2 3 3]
Enter fullscreen mode

Exit fullscreen mode



Using unreduced inside transducers

There’s one more scenario regarding early termination that I’d like to describe. Because this scenario is also tricky, we need another reducing function that terminates early to demonstrate it. The following function is taken from ClojureDocs:

(defn conj-till-odd
  ([coll] coll)
  ([coll x] (cond-> (conj coll x)
              (odd? x) reduced)))
Enter fullscreen mode

Exit fullscreen mode

It behaves as conj until an odd value is received as input.

The following stateful transducer outputs each input value it receives and stores the last one. Once all input has been processed, the last input value is added to the output again.

(defn repeat-last [rf]
  (let [pv (volatile! nil)]
    (fn
      ([] (rf))
      ([result]
       (if-let [p @pv]
         (unreduced (rf result p))
         (rf result)))
      ([result input]
       (let [result (rf result input)]
         (vreset! pv input)
         result)))))
Enter fullscreen mode

Exit fullscreen mode

This transducer uses unreduced to ensure that the final result returned by the single-arity variant of this function is not wrapped with reduced. It could be that the result of (rf result p) is wrapped with reduced. If that is the case, this wrapper should be removed because it has no place in the output of the transducer.

If a reducing function or the two-arity variant of a transducer returns a value wrapped with reduced, this wrapper is only intended to signal that the reducing function or transducer will not process any more input values. Whoever is using the reducing function or transducer should stop feeding it more input values and remove the wrapper from the final result. The transducer stutter demonstrates the first action, and the transducer repeat-last demonstrates the second action.

The following example illustrates the use of repeat-last in combination with conj-till-odd:

(transduce
 repeat-last
 conj-till-odd
 []
 [2 4 3 2]) ;; Evaluates to [2 4 3 3]
Enter fullscreen mode

Exit fullscreen mode



Conclusion

See https://github.com/ljpengelen/transduce-shakespeare/tree/main/transduce-clj for a Clojure project containing some expressions that illustrate the concepts described in this post.



Source link
lol

By stp2y

Leave a Reply

Your email address will not be published. Required fields are marked *

No widgets found. Go to Widget page and add the widget in Offcanvas Sidebar Widget Area.