A Polyglot's Guide to Multiple Dispatch Part 4: Clojure
The conclusion of this series about multiple dispatch focuses on using the technique in Clojure.
Join the DZone community and get the full member experience.
Join For FreeThis is part 4 in the series of articles on multiple dispatch. In previous parts we've seen what multiple dispatch is and how it can be emulated in C++ and Python. We've also seen a programming language with first-class support for multiple dispatch: Common Lisp. In this part, I'd like to discuss another language where multiple dispatch is in the standard toolbox: Clojure.
Clojure is a relatively new language but it has ancient roots. It was designed in 2007 by Rich Hickey, and has been rapidly gaining popularity in the last few years as a modern, concurrent, and all-around exciting Lisp running on top of the JVM. Clojure has a large, friendly, vibrant user community and many talented developers hacking on it. I certainly harbor the hope that Clojure is what's going to finally bring Lisp mainstream... but I digress.
This article shows how multiple dispatch is done in Clojure; to prepare the ground, I'll start with single dispatch.
Single Class-Based Dispatch
Dispatching based on some property of a value in Clojure is done using the defmulti / defmethod macro duo. These macros are sort-of, but not exactly, the equivalents of defgeneric / defmethod in Common Lisp, and the difference will be apparent very soon.
defmulti declares a "multi-method" and sets its dispatch function, which is an arbitrary Clojure function. At runtime, whenever the multi-method is called, the dispatch function is first invoked on its arguments, and the value it returns is used to choose which method to call. A simple example should help clarify this:
; Single-dispatch multimethod, dispatching on the class of the argument.
; Here 'class' refers to the built-in function clojure.core/class
(defmulti describe-thing class)
; Define dispatcher methods for built-in Long and String values.
(defmethod describe-thing java.lang.Long
[thing] (println "a Long" (str thing)))
(defmethod describe-thing java.lang.String
[thing] (println "a String" (str thing)))
We define the multi-method describe-thing and then two implementations, one for the built-in Long type, another for the built-in String type. Here it is in action:
multi.core=> (describe-thing 12)
a Long 12
nil
multi.core=> (describe-thing "tim")
a String tim
nil
We can also define methods for custom types:
; Define a custom class and add a dispatcher for it.
(defrecord Person [name phone])
(defmethod describe-thing Person
[thing] (println "a Person with name" (:name thing)))
There are several ways to define new types in Clojure. Here I'm using defrecord, which is somewhat similar to a C++ struct (and a Common Lisp class). Now the right describe-thing is called when invoked with an instance of Person:
multi.core=> (describe-thing (Person. "joe" 32))
a Person with name joe
Single Value-Based Dispatch and Duck Typing
So far so good. Note, however, that I said the dispatch function can be arbitrary. Indeed, in the example above we've used class as the dispatch function, but we could use something else. In fact, we could use any custom function. As another example, here's dispatch based on a field value:
(defmulti promotion-due :position)
(defmethod promotion-due :engineer
[emp] (> (:lines-of-code emp) 100000))
(defmethod promotion-due :manager
[emp] (> (:num-reports emp) 10))
; Works with records
(defrecord Employee [name position num-reports lines-of-code])
Forgive my rather simplistic exposition of how merit-based promotion in the tech industry works. Rather, note how cleanly this code avoids a position-based switch that would be the ordinary solution to this problem, essentially delegating the switch to Clojure itself. Let's try with a couple different employees:
=> (promotion-due (Employee. "jim" :manager 12 0))
true
=> (promotion-due (Employee. "sue" :engineer 0 98000))
false
What's the dispatch function here? It's :position. In Clojure, keywords in the first position in a form are taken as functions that access the key named by the keyword from values that support it. It works fine for records. It also works for maps, though. Can promotion-due work on maps?
=> (def joe {:name "joe", :position :manager, :num-reports 9})
=> (promotion-due joe)
false
=> (def tim {:name "tim", :position :engineer, :lines-of-code 124000})
=> (promotion-due tim)
true
Wowza! So Clojure multi-methods don't actually care what the class of the value they dispatch upon is. Indeed, they don't even care whether it's a custom class at all. As long as the dispatch function can be successfully called on the value and returns something that can be dispatched upon, multi-methods work. This is a pretty cool form of duck typing - a programming technique recently popularized by Python but also available in C++ at compile-time with templates.
In fact, the "Clojure way" of programming is to do as much duck typing as possible, on built-in types. If entities can be represented as combinations of vectors and maps, that's great because it means that in addition to custom functions we write to operate on these "types", we can also use all the other functions in our arsenal that work on built-in types. The philosophy, according to the designers of Clojure is: rather than use many different classes with a few methods on each, use a small number of data structures with many different functions that act on them [1].
In any case, as we've just seen, Clojure's records have some map-like behaviors that make it easy to intermix methods operating on them with methods operating on regular maps.
The sample code shown here uses :position as the dispatch function. An alternative method of achieving the same would be more explicit:
(defmulti promotion-due
(fn [emp]
(:position emp)))
The previous form is preferred because it's more concise. However, this variant is highlighting the possibility of using any custom function as the dispatcher. For example, it could return a pair of values (position and department, say) and then we'd write methods to dispatch on both. Try it as an exercise.
Multiple Dispatch
The last paragraph serves as a segue to multiple dispatch. In the general case, Clojure methods accept multiple arguments (just as regular functions do). All the arguments are passed into the dispatch function, which can then return a tuple. Here's our shape intersection example in Clojure :
(deftype Shape [])
(deftype Rectangle [])
(deftype Ellipse [])
(deftype Triangle [])
(defmulti intersect
(fn [a b]
[(class a) (class b)]))
(defmethod intersect [Rectangle Ellipse]
[r e] (printf "Rectangle x Ellipse [names r=%s, e=%s]\n"
(class r) (class e)))
(defmethod intersect [Rectangle Rectangle]
[r1 r2] (printf "Rectangle x Rectangle [names r1=%s, r2=%s]\n"
(class r1) (class r2)))
(defmethod intersect [Rectangle Shape]
[r s] (printf "Rectangle x Shape [names r=%s, s=%s]\n"
(class r) (class s)))
The dispatch function here takes both arguments and returns a vector of their classes. We can then dispatch upon this vector to define methods for handling different pairs of shapes. Let's try it:
=> (intersect (Rectangle.) (Ellipse.))
Rectangle x Ellipse [names r=class multi.shapes_double_dispatch.Rectangle, e=class multi.shapes_double_dispatch.Ellipse]
nil
=> (intersect (Rectangle.) (Rectangle.))
Rectangle x Rectangle [names r1=class multi.shapes_double_dispatch.Rectangle, r2=class multi.shapes_double_dispatch.Rectangle]
nil
=> (intersect (Rectangle.) (Shape.))
Rectangle x Shape [names r=class multi.shapes_double_dispatch.Rectangle, s=class multi.shapes_double_dispatch.Shape]
So far so good. But now let's try to invoke a base-class default. We didn't define a method to intersect rectangles with triangles, so we'd like the [Rectangle Shape] variant to be invoked here:
=> (intersect (Rectangle.) (Triangle.))
IllegalArgumentException No method in multimethod 'intersect' for dispatch value: [...]
Oops... What did we expect, though? Nothing in our code even tells Clojure that Triangle and Shape are in any way related! How do we do that? deftype won't let us specify base classes. To understand how inheritance hierarchies work in Clojure, we have to take a short detour.
Clojure's Ad-Hoc Inheritance with derive and isa?
Clojure's approach to inheritance is very unusual. Some call it abstract, and some call it "ad-hoc". You'll soon understand why.
Typically, in most languages, inheritance relations are defined between classes at the point where these classes are defined. In Clojure, the concept of an inheritance hierarchy is completely detached from any particular class mechanism—it lives on its own. We define inheritance relationships between abstract "tags," which are customarily either (namespace-qualified) symbols or keywords.
Let's define a simple animal hierarchy to demonstrate [3]:
multi.core=> (derive ::dog ::mammal)
nil
multi.core=> (derive ::cat ::mammal)
nil
multi.core=> (derive ::husky ::dog)
nil
Now we can do some queries:
multi.core=> (parents ::husky)
#{:multi.core/dog}
multi.core=> (ancestors ::husky)
#{:multi.core/mammal :multi.core/dog}
multi.core=> (descendants ::mammal)
#{:multi.core/cat :multi.core/husky :multi.core/dog}
As well as use isa? to answer the customary is-a test of object relationships:
multi.core=> (isa? ::dog ::mammal)
true
multi.core=> (isa? ::husky ::dog)
true
multi.core=> (isa? ::husky ::cat)
false
Another nice feature isa? has is accepting sequences of types and combining a per-element answer with a kind-of every?:
; isa? ::husky ::dog AND isa? ::husky ::cat
multi.core=> (isa? [::husky ::husky] [::dog ::cat])
false
; isa? ::husky ::dog AND isa? ::husky ::mammal
multi.core=> (isa? [::husky ::husky] [::dog ::mammal])
true
A final piece of the puzzle we need to understand to get back to our shape intersections is that multi-methods use isa? rather than = to dispatch. In other words, when the dispatch function is called and returns a result, to find a method that should be invoked Clojure calls (isa? actual-value expected-value) for each method's expected value to find a match. (isa? x y) always returns true if (= x y), so exact matches work as expected.
Mutliple Dispatch With Base-Class Defaults
Now we have all we need to make that intersection between a rectangle and a triangle dispatch as expected. Just for kicks, in this sample I'll forego creating new types with deftype and defrecord and will use maps instead:
(derive ::rectangle ::shape)
(derive ::ellipse ::shape)
(derive ::triangle ::shape)
; Helper constructors to create new "instances" of shapes. In a real program
; they could take parameters that would be assigned to fields.
(defn make-shape [] {:kind ::shape})
(defn make-rectangle [] {:kind ::rectangle})
(defn make-ellipse [] {:kind ::ellipse})
(defn make-triangle [] {:kind ::triangle})
Shapes are represented by maps. A special key named :kind holds the kind of the shape, which is a symbol participating in an inheritance hierarchy we've defined. The intersect dispatch is now slightly different, as are the methods:
(defmulti intersect
(fn [a b]
[(:kind a) (:kind b)]))
(defmethod intersect [::rectangle ::ellipse]
[r e] (printf "Rectangle x Ellipse"))
(defmethod intersect [::rectangle ::rectangle]
[r1 r2] (printf "Rectangle x Rectangle"))
(defmethod intersect [::rectangle ::shape]
[r s] (printf "Rectangle x Shape"))
Let's try some intersections:
=> (intersect (make-rectangle) (make-ellipse))
Rectangle x Ellipse
=> (intersect (make-rectangle) (make-rectangle))
Rectangle x Rectangle
=> (intersect (make-rectangle) (make-shape))
Rectangle x Shape
=> (intersect (make-rectangle) (make-triangle))
Rectangle x Triangle
Now the last intersect call works as expected - picking up the ::rectangle ::shape method variant, since ::triangle derives from ::shape. When intersect is called on a rectangle and a triangle, the return value of the dispatch function is ::rectangle ::triangle, which passes the isa? test on ::rectangle ::shape.
The Flexibility of Clojure's Dispatch Mechanism
Clojure's dispatch mechanism is very general, since the dispatch function is arbitrary. In part 3 we've seen how multi-methods in CLOS work: they either directly take a class to dispatch on or have an eql comparison to dispatch by-value. Clojure generalizes that—the dispatch function can do whatever it wants, as far as Clojure is concerned—it's completely unconstrained.
With added flexibility come costs, though. First and foremost, this lets us write obscure code where dispatches are made on runtime-computed values, making the code harder to reason about. I'd strongly advise against these techniques, leaving them for the inevitable 1% of the cases. For 99%, simple isa-based dispatch should be what we're looking for.
The second cost is performance. Since every call to a method has to go through the dispatch mechanism first, these pathways better be well optimized. Deferring the decision so far into the run-time makes it more difficult for the compiler to optimize. The Clojure designers had this concern in mind when they exposed protocols—a sort-of Java-like virtual dispatch mechanism that is much faster, but has some limitations compared to general multi-methods. I'll cover Clojure protocols in a future post.
Thoughts On the Ad-Hoc Inheritance Hierarchy
On one hand, Clojure's way of defining inheritance hierarchies is markedly weird. Why can't we just tag the base classes on type definitions, the way it works in other languages (even in CLOS)? Why do we have to provide another set of definitions in parallel with the actual type definitions? Isn't this error prone and overly verbose?
Well, yes. It's a bit error prone (though I find it hard to imagine it can lead to deep bugs in the presence of even basic testing). It's also more verbose than strictly needed - just a list of bases in deftype would be shorter, keeping related things together.
But thinking about it another way, Clojure's choice actually makes sense. Unlike other languages, Clojure really isn't centered around objects. In fact, objects and custom types are probably the last tools you should be reaching for, after all other possibilities have been exhausted. As a couple of code samples in this article show, Clojure much prefers working with simple maps and compositions of built-in data types, since then the whole capabilities of the Clojure standard library are applicable to values, so we have to write less code all in all. In this light, the detached hierarchy-definition tools appear to be a very reasonable approach. In Clojure you can define new "types" with maps, with deftype, with defrecord—as well as a couple of other ways I don't cover here. Why implement the same base-class capabilities for each, when an orthogonal way of defining inheritance could be created which would then be applicable to all these approaches?
In summary, though I admit the Clojure way here is unusual, I think that overall it aligns itself well with the philosophy of the language and guides developers to think about more Clojure-y solutions to problems rather than reaching for the tools they're familiar with from traditional OO languages like Java.
Conclusion
This concludes the article, as well as the "Polyglot's Guide to Multiple Dispatch" series. It was a fun ride, and I hope it was interesting for readers to see a powerful programming technique explained and demonstrated in four different programming languages. At first—when looking at it from the point-of-view of C++ and Python—multiple dispatch appears to be an advanced feature of OOP. I think that the Common Lisp and Clojure samples demonstrate that it goes well beyond that. In fact, multiple dispatch is best categorized as a generalization of OOP, being more in the domain of generic programming. Kinda like the generic programming of C++, just at runtime instead of being limited to compile-time constructs and metaprogramming.
[1] When "plain" maps are used for data structures they should be still encapsulated behind functions to decouple their implementation from the interface exposed to clients.
[2] Note that deftype is used here, instead of defrecord. Choosing between the two is a matter of style; IMHO when the type has no fields, a deftype conveys the meaning in a clearer way.
[3] The Clojure syntax ::foo stands for namespace-qualified keyword. Also, derive and friends (as well as multi-methods) accept an optional hierarchy map to which to add relationships. By default a global map is used, and it's good enough for our purposes in this article.
[4] Another optimization of multi-methods in Clojure is that they're not implemented in pure Clojure as a set of macros (the way CLOS is done in Common Lisp). Rather, they dip into the Java innards of Clojure's implementation.
Published at DZone with permission of Eli Bendersky. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments