Normalizing a Date String in the Scala Way

How to use functional programming to efficiently normalize the formatting of a date string in Scala.

I have a problem that is a bit challenging to solve efficiently in a functional way. Scala is nice because it allows you to use an imperative style if you want. Some people may not agree with this, and that’s totally okay. I just love having an alternative when I need it. However, my code ends up being 100% immutable most of the time. So, please don’t judge me for saying that it is good to have options. :P

Let’s look at the problem and see how we can solve it.

We have a date string, and we don’t know what format it is in. It can be in any format. So, to find out, we have to try parsing the date string against each pattern in a predefined list. If the parse function doesn’t throw an exception (java.time.format.DateTimeParseException), we get a date object back, which we then format using the desired date pattern.

I hope I didn’t confuse you. Let’s take a look at the function definition below:

/**
  * Convert a date string from any format to ISO8601 format
  * @param dateStr any non-null value
  * @return date string in ISO8601, or None if no pattern matches
  */
def normalizeDate(dateStr: String): Option[String]

Example: Let’s assume that the input is “January 23, 2012”. You have to convert this date string to ISO8601 format, like “2012-01-23”. The input can be any string in any format, but the output must be either a date string in ISO8601 format or nothing.

In order to do this, we need a list of date patterns that we use to test against the input string.

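import java.time.format.DateTimeFormatter

val dateFormats = List(
  "dd/MM/uuuu", // first pattern printed in the sample output later in this article
  "MMM dd, uuuu",
  "dd MMMM uuuu",
  "dd MMM uuuu"
).map(p => (p, DateTimeFormatter.ofPattern(p)))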

val iso8601DateFormatter = DateTimeFormatter.ISO_LOCAL_DATE
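
Before looping over the whole list, it may help to see a single attempt in isolation. The sketch below is only an illustration: the object name SingleAttempt and the helper attempt are hypothetical, and Locale.ENGLISH is an assumption added so that month names parse the same way on any machine (the list above relies on the default locale).

```scala
import java.time.format.DateTimeFormatter
import java.util.Locale
import scala.util.Try

object SingleAttempt {
  // Assumption: English month names, independent of the machine's default locale.
  val fmt: DateTimeFormatter = DateTimeFormatter.ofPattern("dd MMMM uuuu", Locale.ENGLISH)
  val iso: DateTimeFormatter = DateTimeFormatter.ISO_LOCAL_DATE

  // Hypothetical helper: one parse attempt against one formatter.
  // Try captures the DateTimeParseException; toOption turns a failure into None.
  def attempt(input: String): Option[String] =
    Try(fmt.parse(input)).toOption.map(d => iso.format(d))
}
```

Here SingleAttempt.attempt("12 January 2012") yields Some("2012-01-12"), while an input that does not fit the pattern yields None. The rest of the article is about repeating this attempt across many patterns without doing more work than necessary.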

Hmmm… after we have this list, what should we do next?

If you have a background in an imperative language, you may come up with a solution like this.

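import scala.util.Try

def normalizeDate(dateStr: String): Option[String] = {
  val trimmedDate = dateStr.trim
  if (trimmedDate.isEmpty) None
  else {
    for ((pattern, fmt) <- dateFormats) {
      val dateTry = Try(fmt.parse(trimmedDate))
      // Leave the loop (and the function) on the first pattern that parses
      if (dateTry.isSuccess) return Some(iso8601DateFormatter.format(dateTry.get))
    }
    None // no pattern matched
  }
}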

This function doesn’t try to parse the input string with every pattern we have. It stops as soon as the parse function does not throw an exception. There’s nothing wrong with this solution, and Scala lets you do it. It’s actually a very good solution. It’s fast, and we don’t use a mutable variable at all. But, for some reason, we still feel like we did something wrong. It feels like a sin to have a return keyword there just to break out of a loop. It’s not wrong, but it’s so imperative.

How can we improve it then? How can we rewrite it to make it more functional?

There should be a function that we can call, right? The Scala collections library never lets us down. There’s always a function in it that we can use. Let’s find it.

def find(p: (A) ⇒ Boolean): Option[A]

The find function looks promising. Let’s see what we can do with it.

val found: Option[(String, DateTimeFormatter)] = dateFormats.find { case (_, fmt) => Try(fmt.parse(trimmedDate)).isSuccess }

The find function returns the first element in a collection that matches the predicate. The problem is that it returns an Option[A], where A is a tuple of String and DateTimeFormatter in this case. It doesn’t return a date object. If we want the date object, we have to parse the date again, which is obviously not a good solution. (We could use a var to store the date object and set it inside the anonymous function. But that’s so imperative. It looks even worse than the first solution.)
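
To make the double parse concrete, here is a minimal sketch of what a find-based version would have to look like. The names FindTwice and normalizeWithFind are hypothetical, the pattern list is shortened for the example, and Locale.ENGLISH is again an assumption for reproducible month names.

```scala
import java.time.format.DateTimeFormatter
import java.util.Locale
import scala.util.Try

object FindTwice {
  // Shortened, illustrative pattern list (not the article's full list).
  val dateFormats: List[(String, DateTimeFormatter)] =
    List("dd/MM/uuuu", "dd MMMM uuuu")
      .map(p => (p, DateTimeFormatter.ofPattern(p, Locale.ENGLISH)))

  def normalizeWithFind(input: String): Option[String] =
    dateFormats
      // First parse: only used to decide whether the pattern matches; the result is discarded.
      .find { case (_, fmt) => Try(fmt.parse(input)).isSuccess }
      // Second parse of the same input with the same formatter, just to get the date back.
      .map { case (_, fmt) => DateTimeFormatter.ISO_LOCAL_DATE.format(fmt.parse(input)) }
}
```

The result is correct, but the matching pattern is always parsed twice, which is the waste described above.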

Then we start looking at other functions (reduce, fold, map, zip, …) and combinations of them.

Can you do it in your own way before looking at my solution below?

You may come up with an even better solution, and I would really appreciate it if you shared it.

Sharing is caring

Ok. Let’s see what I got:

def normalizeDate(dateStr: String): Option[String] = {
  val trimmedDate = dateStr.trim
  if (trimmedDate.isEmpty) None
  else {
    dateFormats.toStream.map { case (pattern, fmt) =>
      println(s"Pattern: $pattern")
      Try(fmt.parse(trimmedDate))
    }.find(_.isSuccess).map { t =>
      iso8601DateFormatter.format(t.get)
    }
  }
}

The essential element of this function is toStream (you can achieve the same thing with view). It turns a Scala collection into a stream. The elements are not evaluated in the map function right away; each one is evaluated only when the last function of the pipeline, find in this case, asks for it. So, when the predicate is true, find returns the date object wrapped in a Try and leaves the rest of the elements in the collection untouched (if there are any). Finally, we format the date object in ISO8601 format before returning the value to the caller.

To see how it works, let’s assume that we run the statement normalizeDate("12 January 2012"). This is what you will see on stdout:
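
The laziness can also be checked without reading stdout. The following is a sketch under a few assumptions: it uses view rather than toStream, it records the tried patterns in a ListBuffer named tried (a hypothetical addition that exists only to make the evaluation order observable), and it pins Locale.ENGLISH so month names parse the same everywhere.

```scala
import java.time.format.DateTimeFormatter
import java.util.Locale
import scala.collection.mutable.ListBuffer
import scala.util.Try

object LazyDemo {
  val dateFormats: List[(String, DateTimeFormatter)] =
    List("dd/MM/uuuu", "MMM dd, uuuu", "dd MMMM uuuu", "dd MMM uuuu")
      .map(p => (p, DateTimeFormatter.ofPattern(p, Locale.ENGLISH)))

  // Records which patterns were actually tried; only here to make the laziness visible.
  val tried: ListBuffer[String] = ListBuffer.empty[String]

  def normalizeDate(dateStr: String): Option[String] = {
    val trimmedDate = dateStr.trim
    dateFormats.view
      .map { case (pattern, fmt) =>
        tried += pattern            // runs only when find pulls this element
        Try(fmt.parse(trimmedDate))
      }
      .find(_.isSuccess)            // stops pulling at the first success
      .map(t => DateTimeFormatter.ISO_LOCAL_DATE.format(t.get))
  }
}
```

After LazyDemo.normalizeDate("12 January 2012"), the buffer holds only the first three patterns; "dd MMM uuuu" is never tried. On Scala 2.13 and later, Stream is deprecated in favor of LazyList, while view is available across versions, which is why this sketch uses it.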

Pattern: dd/MM/uuuu
Pattern: MMM dd, uuuu
Pattern: dd MMMM uuuu

Note that you see only the first 3 patterns rather than the whole list, and the last pattern shown is the one that matches our input string. Imagine that you have over 1,000 patterns in a list: you can skip a lot of unnecessary computation.

However, if you run a benchmark, you will find that the imperative version wins the race. But that’s not the point here, right?

Hold on… I can’t accept this. We use Scala because we believe in its power. With great power comes great expressiveness. This is not the ending that I expected. There must be something… something… you know?

Do not lose hope in the functional path. Period. 

Think back to Scala 101: what did we learn on the very first day of functional programming? Yes, it is recursion. In this problem, it doesn’t matter whether it is tail recursion or not. N is so small that we can ignore it. Anyway, we don’t have a reason not to use tail recursion when we can, do we? :D

Here you go, the even better solution that uses tail recursion:

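import java.time.temporal.TemporalAccessor
import scala.util.{Failure, Try}

def normalizeDate(dateStr: String): Option[String] = {
  val trimmedDate = dateStr.trim
  // Try the head pattern; on failure, recurse on the remaining patterns
  def normalize(patterns: List[(String, DateTimeFormatter)]): Try[TemporalAccessor] = patterns match {
    case head :: tail =>
      val resultTry = Try(head._2.parse(trimmedDate))
      if (resultTry.isSuccess) resultTry else normalize(tail)
    case _ => Failure(new RuntimeException("no match found"))
  }
  if (trimmedDate.isEmpty) None
  else {
    normalize(dateFormats).map(d => iso8601DateFormatter.format(d)).toOption
  }
}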
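
The recursive step does not depend on dates at all, so it can be sketched as a generic helper. This is an illustration rather than the code that was benchmarked: the names FirstSuccess and firstSuccess are hypothetical, and the @tailrec annotation is an addition that asks the compiler to reject the method if the recursive call is not in tail position.

```scala
import scala.annotation.tailrec
import scala.util.{Failure, Try}

object FirstSuccess {
  // Hypothetical generic helper: returns the first successful attempt,
  // or a Failure when the candidates are exhausted.
  @tailrec
  def firstSuccess[A, B](candidates: List[A])(attempt: A => Try[B]): Try[B] =
    candidates match {
      case head :: tail =>
        val result = attempt(head)
        // The recursive call is the last thing evaluated, so it compiles to a loop.
        if (result.isSuccess) result else firstSuccess(tail)(attempt)
      case Nil => Failure(new NoSuchElementException("no match found"))
    }
}
```

For example, FirstSuccess.firstSuccess(List("a", "2", "3"))(s => Try(s.toInt)) stops at "2" and never attempts "3". Passing the pattern list and a parsing function in the same way gives the behavior of the normalize function above.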

Ok. What about the performance then? Is it better than the imperative version? Heck yeah! It’s slightly better. :D

Here’s the benchmark configuration:

// Renamed from "config": a val cannot be initialized by calling a function of the same name
val benchConfig = config(
  Key.exec.minWarmupRuns -> 100,
  Key.exec.maxWarmupRuns -> 500,
  Key.exec.benchRuns -> 10000
) withWarmer(new Warmer.Default)

And the results:

Tail recursion version:        0.0057 ms (Winner)
Imperative version:            0.0072 ms
Non-strict collection version: 0.0213 ms

Thank you for reading.

If you are interested, you can check out my other articles about Scala.

Published at DZone with permission of Hussachai Puripunpinyo. See the original article here.
