Refactoring and Type Errors in Clojure

DZone 's Guide to

Refactoring and Type Errors in Clojure

The author shares his experience with refactoring Clojure code and identifies potential time-consuming pitfalls to avoid.

· Java Zone ·
Free Resource

While refactoring relatively simple Clojure code to use a map instead of a vector, I have wasted perhaps a few hours due to essentially type errors. I want to share the experience and my thoughts about possible solutions since I encounter this problem quite often. I should mention that it is quite likely that it is more a problem (an opportunity? :-)) with me rather than the language, namely with the way I write and (not) test.

The core problem is that I write chains of transformations based on my sometimes flawed idea of what data I have at each stage. The challenge is that I cannot see what the data is and have to maintain a mental model while writing the code, and I suck at it. Evaluating the code in the REPL as I develop it helps somewhat but only when writing it – not when I decide to refactor it.

A typical example of error-prone code:

(->> data
     expr1  ; this is often a core function, a keyword, sometimes my own fn
     exprI  ; <-- suddenly I get here a nil, an empty seq, an exception, or just miss the expected data

I have struggled with the question of how to do a refactoring of data in Clojure in a safer way before and got some useful tips.)

Solutions for Preventing Type Errors

I see three possible solutions:

  1. Make it possible to see the data as it flows through the transformation, i.e. the last data produced by each expression during the last execution. LightTable's watches and insta-repl can do this and J.P. Posma called this "omniscient debugging" and demonstrated how it can be done in JavaScript in his StrangeLoop talk Visualising program execution. This is how programming should work anyway. (Preferably with having a control over what/how data is shown and being able to interact with it to explore it.) However I haven't managed to get this working in LT with my ClojureScript project (due to a weird "Can't dynamically bind non-dynamic var: clojure.tools.reader/resolve-symbol at line 1 :: {:file nil, :line 1, :column 61, :tag :cljs/analysis-error").
  2. Use core.typed to define the shape of the data and let the type checker verify that my code conforms to it at each stage. Core.typed has its own challenges but in this case, it would help me nicely. Sadly the ClojureScript integration is currently not up to date. Alternatively, I could use Prismatic Schema, which makes it possible to specify & check the expected shape of the data but can't, I believe, really help to check that each step of a transformation makes sense.
  3. Develop a little differently. David Nolen has recommended the following approach that I could try to apply more though it doesn't feel 100% suitable solution for the particular problem I have encountered: Try something at the REPL first, then and pre/post-conditions around that, but do not get crazy with :pre/post, f.ex. only add them at entry points. And of course, have a good dose of cljs.test. David wrote a short blog series about this: Life with Dynamic TypingFaster Validation Through Immutability, and Lazy Contracts in 30 lines.

Let's have a look at my data, code, and the flawed refactoring.

The Data

Input (simplified):

(def hardwareCatalog {
    :productId111 {
        :type "ACCESSORY"
        ; ...
        :variations {
            :111222 {
                :sku 111222
                :name "Awesome Watch"
    ; ...
(def availability {:111222 { :availableQuantity 0, #_(...) }})

Expected output:

([:7056418 "Mordor Design Flipcover Xperia Z3 Bloody" 0]
 [:7056419 "Mordor Design Flipcover Xperia Z3 Dark" 3]

The Original, Vector-based Code

We want to provide a list of variations with their SKU (an ID), name, and available quantity.

(def hardwareCatalog ...)
(def availability ...)
(def low-stock-treshold 10)

(defn- ->sku+name [hardwareCatalog]
  (->> hardwareCatalog
       (mapcat :variations)
       (map (fn [[sku {:keys [name]}]] [sku name]))))

(defn low-on-stock [hardwareCatalog availability]
  (->> hardwareCatalog
       (map (fn [[sku _ :as row]]
              (conj row
                    (->> availability
       (filter (fn [[_ _ stock]] (<= stock low-stock-treshold)))
       (sort-by second)))

(comment (low-on-stock hardwareCatalog availability))

The Bug-ridden (Partial) Refactoring

(defn ->vec
    "Helper to change a map back to a vector so we can refactor stepwise"
    [keys vec-or-map]
 (if (map? vec-or-map)
   ((apply juxt keys) vec-or-map)

(defn- ->sku+name [hardwareCatalog]
 (->> hardwareCatalog
      ; WAS: (mapcat :variations) (map (fn [[sku {:keys [name]}]] [sku name]))
          #(map (fn [variation] (merge variation (select-keys % [:type])))
                (:variations %))) ; <-- MISTAKE 1: I forgot that variations isn't
                                  ; sequential but a map; should be wrapped in (vals ...)
      (map #(->vec [:sku :name] %))
      (map (fn [[sku {:keys [name]}]] [sku name]))
      #_(map #(select-keys % [:sku :name :type])) ;; replace the prev 2 lines with this when :type fixed

(defn low-on-stock [hardwareCatalog availability]
 (->> hardwareCatalog
      (->vec [:sku :name]) ; <--- MISTAKE 2: It should be (map ->vec ..) but doesn't fail, leading to an error later
            ; it should have been (map ->vec)
      (map (fn [var] (assoc var
                       (->> availability
                            (:sku var) ; <-- MISTAKE: After fixing #1 this will stop working, yielding nil everywhere
                                       ; it should have become ((keyword (:sku var)))
      (filter #(<= (:availableQuantity %) low-stock-treshold))
      (sort-by :name)
      #_(map #(->vec [:sku :name :availableQuantity] %))))

The Problems:

  1. I forgot that variations were a map, not a sequence. The code "worked" but the :type wasn't added to the variations. Spotting why was made even more difficult by having limited print to max 10 items and depth 3 (to prevent timeouts) and thus not printing the keys and structures relevant for the mistake.
  2. Due to the inconsistency between ->sku+name taking a collection and ->vec taking a single item, I've mistakenly used (->vec) instead of (map ->vec). Again the code "worked" but did not do the expected thing so after fixing #1 and sending through a map, the code started to fail at another place and I couldn't figure out why.
  3. After fixing #1 and #2, the transformation again stopped doing the expected thing and all availableQuantities were nil. It wasn't easy to spot that it was because I have switched from using :<sku> to just <sku> (we could blame it on inconsistency in data, where SKUs used as keys have been turned into keywords but not those used as values).


Due to the use of nil as widely acceptable Null Object and the fact that many functions work on different types of inputs, mistakes in Clojure can travel far and manifest in unexpected ways and places. That is a problem for people like me who aren't good at maintaining a correct mental model of the data as it is being transformed.

The problem could be mitigated by providing more visibility and displaying the data produced by each expression and making it possible to explore it interactively (as you can with a console.log-ed object in a browser's Console). It could be also mitigated by expressing your expectations about the shape of data and letting the computer verify regularly that they hold, using core.typed or pre- and post-conditions, Prismatic Schema etc.

clojure, java, refactoring

Published at DZone with permission of Jakub Holý , DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}