DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Related

  • Thermometer Continuation in Scala
  • How Stalactite ORM Implements Its Fluent DSL
  • Java vs. Scala: Comparative Analysis for Backend Development in Fintech
  • A Practical Guide to Temporal Workflow Design Patterns

Trending

  • Persistent Memory for AI Agents Using LangChain's Deep Agents
  • Metal Default, a New Build Cloud, and a New Format
  • Is the Data Warehouse Dead? 3 Patterns From Enterprise Architecture That Answer This Question
  • How to Interpret the Number of Spring ApplicationContexts in Integration Tests
  1. DZone
  2. Coding
  3. Languages
  4. Getting Started with Scala Parser Combinators

Getting Started with Scala Parser Combinators

How to get started using the parser combinator library in Scala, which can be used to make your own programming language.

By 
Travis Dazell user avatar
Travis Dazell
·
Jul. 29, 15 · Tutorial
Likes (3)
Comment
Save
Tweet
Share
17.3K Views

Join the DZone community and get the full member experience.

Join For Free

Scala provides a very easy way to design your own programming language, using its parser library. This makes creating your own domain specific language (i.e. DSL) or interpreted language easier than you could ever imagine. As a primer, let's write a parser that parses simple mathematical expressions, such as "1+9*8" and "4*6/2-5".

For those of you who are familiar with language design, the EBNF grammar for this language would look something like this:


digit ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
number ::= digit | digit number
operator ::= "+" | "-" | "*" | "/"
expr ::= number (operator expr)?


To start writing a parser with the Scala parsing library, we write a class that extends the Parsers trait. Here's an example of a class that extends RegexParsers, which is a subtrait of Parsers.


class ExprParser extends RegexParsers {
    val number = "[1-9][0-9]+".r

    def expr: Parser[Any] = number ~ opt(operator ~ expr )

    def operator: Parser[Any] = "+" | "-" | "*" | "/"
}


The only differences between the Scala definition of the valid tokens and the definition within the EBNF grammar are the following:

  • Scala utilizes a "~" between each token
  • Instead of using a "?" like you would in an EBNF grammar, Scala uses the keyword "opt"

To execute our parser, we simply invoke the inherited parse method that's part of the Parsers trait.


def main(args : Array[String]) {

    val parser = new ExprParser

    val result = parser.parseAll(parser.expr, "9*8+21/7")

    println(result.get)
}


The result of this println will be:

(9~Some((*~(8~Some((+~(21~Some((/~(7~None))))))))))


We're done! Well, not really. The output right now is the way that Scala sees the result of our parser operations. To make our language more meaningful, let's add some Scala code to compute the arithmetic operation and print the result to the output.


Let's begin our quest to calculate the result by examining what "(9~Some((*~(8~Some((+~(21~Some((/~(7~None))))))))))" really means in the world of Scala. Let's look at a subset of this string, "(9~Some((*~(8~None))))". This is the result of parsing "9*8". The first part that looks interesting is the "9~Some(...)". In our parser, we defined the following rule:


def expr: Parser[Any] = number ~ opt(operator ~ expr)



It's clear that "number" is evaluating to "9" and "~" is being printed out verbatim, which you should recall is used in Scala parsers to join parts of the grammar. However, what's going on with "Some(...)"? Well, whenever Scala parses an opt(x) statement, it will evaluate it as either Some(...) or None, both of which are subclasses of Option. That makes sense... the opt(x) statement evaluates to an Option.

Instead of having our parser return a bunch of ~ and options, let's look at transforming the parser results into something more useful. Again, looking at our current parser rule:


def expr: Parser[Any] = number ~ opt(operator ~ expr)



We need to modify this parser definition to have it return an Int instead of Any. We also need to compute the result of the arithmetic operation. Our grammar rule allows for either a single number or a number followed by an arithmetic operator and another number. If we're dealing with a single number, we need to tell the parser to convert the result to an Int. To do this, we make the following modification to our parser rule:


def expr: Parser[Int] = (number ^^ { _.toInt }) { }


The ^^ just tells the parser to execute the code that follows it, contained in {...}. All we're doing is converting it to an Int.

Next, we need to tell the parser what to do when it encounters a number, or when it encounters a number followed by an operator and another number. For this, we need to define the integer operation for each situation (single integer value, addition of two values, subtraction of two values, division of two values, and multiplication of two values).


def expr: Parser[Int] = (number ^^ { _.toInt }) ~ opt(operator ~ expr ) ^^ {
    case a ~ None => a
    case a ~ Some("*" ~ b) => a * b
    case a ~ Some("/" ~ b) => a / b
    case a ~ Some("+" ~ b) => a + b
    case a ~ Some("-" ~ b) => a - b
}


There are five cases we're handling. The first is the situation where we have just a single integer (a ~ None). When we have an Int with None after it, we simply evaluate the integer value as-is. The second situation is when we have an integer being multiplied by another integer (a ~ Some("*" ~ b)). In this case, we simply perform a * b. We then proceed to define the rules for division, addition, and subtraction.


The key take-aways from this tutorial are:

  • You define the type that your parser rule is returning within the brackets of the Parser[ ] definition. In this example, it's an Int.
  • You can add custom Scala code to operate on the parser results with ^^ { ... }


Now that we've laid the groundwork for Scala parser combinators, we can build on these features to create a full-featured interpreted language that contains if-else conditions, loops, and even function calls.

Here's an article on how to create a full-featured interpreted language with this approach: https://dzone.com/articles/create-a-programming-language-with-scala-parser-co

Parser (programming language) Scala (programming language) Data Types Domain-Specific Language

Opinions expressed by DZone contributors are their own.

Related

  • Thermometer Continuation in Scala
  • How Stalactite ORM Implements Its Fluent DSL
  • Java vs. Scala: Comparative Analysis for Backend Development in Fintech
  • A Practical Guide to Temporal Workflow Design Patterns

Partner Resources

×

Comments

The likes didn't load as expected. Please refresh the page and try again.

  • RSS
  • X
  • Facebook

ABOUT US

  • About DZone
  • Support and feedback
  • Community research

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 215
  • Nashville, TN 37211
  • [email protected]

Let's be friends:

  • RSS
  • X
  • Facebook