Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

High-Performance Immutable Business Objects

DZone's Guide to

High-Performance Immutable Business Objects

Immutable objects make concurrent access much easier. Check out this rundown of performance on a multi-threaded system.

· Performance Zone
Free Resource

Download our Introduction to API Performance Testing and learn why testing your API is just as important as testing your website, and how to start today.

The advantages of immutable objects in the area of simplifying concurrent access are well-known.  However, it is often asserted that, due to other disadvantages, immutable business objects are not suitable for large-scale enterprise software systems (where, in the conceptual realm, objects are highly mutable).

In this paper we will examine performance of a multi-threaded system of nested business objects with two implementations, one using traditional mutable objects with critical sections protected by a locking mechanism, and another using business objects that are immutable, where updates are effected via copying.

We will make use of a modern programming language (Scala) that readily facilitates  immutable development.

As Roland Kuhn has noted, no real-world system that actually models changing state can be entirely immutable. The approach taken here is that the business-object holders, caches, are mutable, while the business objects themselves are not. This has proven itself to be a model with many advantages.

(In a simpler system with no caching the persistent store might be the only actually mutable representation).

The test consists of the following:

  • A simple object model of trucks that contain pallets that contain boxes that contain items is created. The four-level hierarchy was chosen to exacerbate perhaps the biggest weakness (in terms of both performance and code complexity) of immutability: The need to reconstruct an entire sub-hierarchy whenever there is a permutation to a lower level. (I.e, adding an item to a box results in not just a copy of the box but of its pallet and the pallet's truck as well).
  • Initialization of the system with a cache holding 100 trucks each with 50 pallets each with 50 boxes each with 50 Items - every pallet and box is at half capacity (as defined arbitrarily by the system). This is a total of 12,500,000 items.
  • 220,000 concurrent operations are then performed on the system: 200,000 reads and 20,000 writes. The latter operation consists of adding an item which results in the cascade of object creation in the immutable case noted above. This read/write ratio is probably typical of many types of enterprise systems.
  • The simple elapsed time to perform these operations is traced.

On my 2013 MacBook Pro, 2.7 Ghz Quad Core with 16 Gb of RAM, the immutable version of the code is about 35% faster, running with a mean of 2.6s vs. 4.1s for the mutable version (the mean of three runs for each). VM max heap size is 4 Gb in both cases; all other VM settings are standard.

Source code size is almost identical - the immutable version is three lines shorter. Both samples use immutable Scala collections (as opposed to business objects) and both use, primarily, Java's ReentrantReadWriteLock for locking.

This is not intended to be an exhaustive performance analysis as certainly optimizations are possible in both cases. Both implementations are, however, typical, and the results demonstrate that far from being unviable from a performance standpoint, using immutable business objects in enterprise software is quite realistic.

Now that we have established that using immutable business objects can be highly performant, let's take a closer look at their advantages, and then of the tools that Scala, a modern programming language that marries the object-oriented and functional idioms, provides to support this style of programming.

Analysis

These advantages of immutability have long been recognized:

  • Performance:
    • Because locking on business object accessors and mutators is not necessary, there is typically far less critical section contention in an immutable object system. Yes, locking is necessary at the cache level, but for writes only, and there are typically far fewer such operations than there are operations on business objects themselves.
    • Immutable objects can make very wide use of caching of computed values (in Scala, this generally means lazy vals). The benefits thereof will vary widely by domain but it is easy to see that this alone could result in orders of magnitude better performance in some examples. Analogous caching that might be possible with mutable objects at the least will generally require much more complex code - the type of code that is fertile ground for defects.
    • Object "snapshots" are entirely unnecessary since objects do not change state. In systems that cache large numbers of complex objects and have operations that require their remaining static for some computation, this can result in performance deltas of several orders of magnitude (especially when the garbage collection bill is priced-in).
  • Code Clarity/Correctness:
    • Object invariants need only be evaluated once, at construction time. In general, input validation and correctness checks are fewer and simpler; as it is often stated, "immutable objects are easier to reason about." This is because there are simply far fewer permutations of state possible.
    • Statelessness also means that temporal coupling is easier to avoid.

Let's enumerate the purported disadvantages of immutability:

  • The obvious one is the proliferation of object copying. At first blush, to the typical developer reared in the imperative world, the notion that you have to make an entirely new object every time you change any of its attributes in the slightest way seems about as sensible or palatable as, well - pick your favorite unpleasant analogy.
  • Furthermore, as noted above, things are even worse than they may seem at first, since a change to an object at level n in an ownership hierarchy requires copying of every level above it.  
  • Immutability is antithetical to the spirit of OO, since real-life objects do change (I've seen this objection in print).

To the last point above, the reply is that we needn't be concerned with perfect conceptual reality in the code.  Immutable objects do change - they simply do so by copying themselves. That this does not "reflect reality" is of no practical concern.

Concerning the copying inherent in immutable business object systems, it has already been demonstrated that any negative performance implications are typically far outweighed by concomitant advantages.  

That leaves potential disadvantages in code complexity and size.  Fortunately, in a modern language such as Scala built with the advantages of immutability in mind, there are really no such disadvantages. Scala case classes auto-generate a copy method that facilitates effortless and efficient copying of objects when a single value changes:

  case class Example(a: String, b: Int, c: Double)
  val one = Example("", 0, 1.0)
  val two = one.copy(b = 2)

The code is not only terse but high-performance: All unchanging parts of the object are reused in the new version (clearly this is important for non-primitive values). Thus, the amount of actual computational work is much less than might be assumed.

Scala's immutable collections are similarly highly efficient.

Listing One - Common.scala

package com.folbrecht

import java.util.concurrent.{TimeUnit, Executors}
import java.util.concurrent.locks.Lock

import scala.concurrent.duration._
import scala.concurrent.{ExecutionContext, Await, Future}
import scala.util.Random

trait Common {
  protected trait Cache[K, V] {
    protected var cache: Map[K, V] = Map.empty

    def size: Int = cache.size

    def values: Iterable[V] = cache.values

    def get(id: K): V = cache.getOrElse(id, throw new IllegalArgumentException(s"Bad id: $id"))

    def set(id: K, that: V): Unit = synchronized(cache += id -> that)
  }

  protected trait Factory {
    def makeBox(id: Int, items: Seq[Item]): Box

    def makePallet(id: Int, boxes: Map[Int, Box]): Pallet

    def makeTruck(id: Int, pallets: Map[Int, Pallet]): Truck
  }

  protected var cache: Cache[Int, Truck]
  protected val factory: Factory
  implicit private val context = ExecutionContext.fromExecutor(Executors.newFixedThreadPool(10))

  protected case class Item(desc: String)

  protected trait Box {
    def id: Int

    def items: Seq[Item]

    def size: Int

    def hasRoom: Boolean

    def add(item: String): Box
  }

  protected trait Pallet {
    def id: Int

    def size: Int

    def boxes: Map[Int, Box]

    def hasRoom: Boolean

    def add(item: String): Pallet
  }

  protected trait Truck {
    def id: Int

    def pallets: Map[Int, Pallet]

    def full: Boolean

    def palletWithRoom: Option[Pallet]

    def size: Int

    def add(item: String): Truck
  }

  protected def withLock[A](lock: Lock)(block: => A): A = {
    if (!lock.tryLock(30, TimeUnit.SECONDS)) throw new IllegalStateException("Could not obtain lock.")
    try block
    finally lock.unlock()
  }

  protected def anyTruckWithRoom: Truck = {
    var truck = anyTruck
    while (truck.full) truck = anyTruck
    truck
  }

  protected def anyTruck: Truck = cache.get(Random.nextInt(cache.size))

  protected def run(): Unit = {
    init()
    println(s"Item count: ${cache.values.toSeq.map(_.size).sum}")

    val time = System.currentTimeMillis()
    val reads = (0 until 200000).map(i => Future(anyTruckWithRoom.palletWithRoom.map(_.size)))
    val writes = (0 until 20000).map(i => Future(anyTruckWithRoom.add(i.toString)))
    val future = Future.sequence(reads ++ writes)
    Await.result(future, 300.seconds)
    println(s"Time is ${(System.currentTimeMillis().toDouble - time) / 1000}s")
    println(s"Item count: ${cache.values.toSeq.map(_.size).sum}")
    System.exit(1)
  }

  protected def init(): Unit = {
    for (i <- 0 until 100) {
      val pallets: Map[Int, Pallet] = (0 until 50).map { i =>
        val boxes: Map[Int, Box] = (0 until 50).map { i =>
          val items = (0 until 50).map(i => Item(i.toString))
          (i, factory.makeBox(i, items))
        }.toMap
        (i, factory.makePallet(i, boxes))
      }.toMap
      cache.set(i, factory.makeTruck(i, pallets))
      println(s"Added truck $i with ${pallets.size} pallets")
    }
  }
}


Listing Two - MutablePerformanceTest.scala

package com.folbrecht

import java.util.concurrent.locks.ReentrantReadWriteLock

object MutablePerformanceTest extends App with Common {

  protected class CacheImpl[K, V] extends Cache[K, V]

  protected class FactoryImpl extends Factory {
    def makeBox(id: Int, items: Seq[Item]): Box = BoxImpl(id, items)

    def makePallet(id: Int, boxes: Map[Int, Box]): Pallet = PalletImpl(id, boxes)

    def makeTruck(id: Int, pallets: Map[Int, Pallet]): Truck = TruckImpl(id, pallets)
  }

  protected var cache: Cache[Int, Truck] = new CacheImpl[Int, Truck]()
  protected val factory: Factory = new FactoryImpl

  case class BoxImpl(id: Int, var items: Seq[Item]) extends Box {
    private val lock = new ReentrantReadWriteLock

    def hasRoom: Boolean = withLock(lock.readLock())(items.size < 100)

    def size: Int = withLock(lock.readLock())(items.size)

    def add(item: String): Box = withLock(lock.writeLock()) {
      items = items :+ Item(item)
      this
    }
  }

  case class PalletImpl(id: Int, var boxes: Map[Int, Box]) extends Pallet {
    private val lock = new ReentrantReadWriteLock

    def hasRoom: Boolean = withLock(lock.readLock())(boxes.values.exists(_.hasRoom))

    def size: Int = withLock(lock.readLock())(boxes.values.toSeq.map(_.size).sum)

    def add(item: String): Pallet = withLock(lock.writeLock()) {
      boxes.values.find(_.hasRoom) match {
        case None => throw new RuntimeException("We're full")
        case Some(box) =>
          boxes += box.id -> box.add(item)
          this
      }
    }
  }

  case class TruckImpl(id: Int, pallets: Map[Int, Pallet]) extends Truck {
    private val lock = new ReentrantReadWriteLock

    def full: Boolean = withLock(lock.readLock())(palletWithRoom.isEmpty)

    def palletWithRoom: Option[Pallet] = withLock(lock.readLock())(pallets.values.find(_.hasRoom))

    def size: Int = withLock(lock.readLock())(pallets.values.toSeq.map(_.size).sum)

    def add(item: String): Truck = synchronized {
      palletWithRoom match {
        case None => throw new RuntimeException("Truck's full")
        case Some(pallet) =>
          //println(s"Adding item $item to truck ${truck.id} on pallet ${pallet.id} on thread ${Thread.currentThread.getName}")
          pallet.add(item)
          this
      }
    }
  }

  run()
}


Listing Three - Immutable PerformanceTests.scala

package com.folbrecht

import java.util.concurrent.locks.ReentrantReadWriteLock

object ImmutablePerformanceTest extends App with Common {

  protected class CacheImpl[K, V] extends Cache[K, V] {
    private var locks: Map[K, ReentrantReadWriteLock] = Map()

    def update(id: K)(block: V => V): V = withLock(lockFor(id).writeLock()) {
      val newValue = block(get(id))
      set(id, newValue)
      newValue
    }

    private def lockFor(id: K): ReentrantReadWriteLock = synchronized {
      locks.getOrElse(id, {
        locks += id -> new ReentrantReadWriteLock
        locks(id)
      })
    }
  }

  protected class FactoryImpl extends Factory {
    def makeBox(id: Int, items: Seq[Item]): Box = BoxImpl(id, items)

    def makePallet(id: Int, boxes: Map[Int, Box]): Pallet = PalletImpl(id, boxes)

    def makeTruck(id: Int, pallets: Map[Int, Pallet]): Truck = TruckImpl(id, pallets)
  }

  protected var cache: Cache[Int, Truck] = new CacheImpl[Int, Truck]()
  protected val factory = new FactoryImpl

  case class BoxImpl(id: Int, items: Seq[Item]) extends Box {
    lazy val hasRoom: Boolean = items.size < 100
    lazy val size = items.size

    def add(item: String): Box = copy(items = items :+ Item(item))
  }

  case class PalletImpl(id: Int, boxes: Map[Int, Box]) extends Pallet {
    lazy val hasRoom: Boolean = boxes.values.exists(_.hasRoom)
    lazy val size: Int = boxes.values.toSeq.map(_.size).sum

    def add(item: String): Pallet = boxes.values.find(_.hasRoom) match {
      case None => throw new RuntimeException("We're full")
      case Some(box) => copy(boxes = boxes + (box.id -> box.add(item)))
    }
  }

  case class TruckImpl(id: Int, pallets: Map[Int, Pallet]) extends Truck {
    lazy val full: Boolean = palletWithRoom.isEmpty
    lazy val palletWithRoom: Option[Pallet] = pallets.values.find(_.hasRoom)
    lazy val size: Int = pallets.values.toSeq.map(_.size).sum

    def add(item: String): Truck = cache.asInstanceOf[CacheImpl[Int, Truck]].update(id) { truck =>
      truck.palletWithRoom match {
        case None => throw new RuntimeException("Truck's full")
        case Some(pallet) =>
          val newPallet = pallet.add(item)
          //println(s"Adding item $item to truck ${truck.id} on pallet ${pallet.id} on thread ${Thread.currentThread.getName}")
          copy(pallets = truck.pallets + (pallet.id -> newPallet))
      }
    }
  }

  run()
}



Find scaling and performance issues before your customers do with our Introduction to High-Capacity Load Testing guide.

Topics:
scala ,perforance ,functional programming ,java

Opinions expressed by DZone contributors are their own.

THE DZONE NEWSLETTER

Dev Resources & Solutions Straight to Your Inbox

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

X

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}