ShuntingYard algorithm in Scala

Last week I came across an interesting "coding kata" by Brett Schuchert on the Object Mentor blog. The trick of a kata is that you grow the program step-by-step using tests, just like a kata in karate is tought to a student. The problem of this kata was the Shunting Yard algorithm of Dijkstra. I wanted to see if I could implement this kata in Scala.

Instead of writing the algorithm as described, the trick in a coding kata is that you first write a test, and then make the test pass. And subsequently each time adding a test and keeping it all in the green. Brett described the test-cases in his blog entry.

The Shunting Yard algorithm is used to convert a mathematical function in infix notation to a reverse polish or postfix notation. For instance, it can convert an expression like 3+4 to 3 4 +.

In order to implement this algorithm, one needs to do string parsing to break up the infix string. Scala has the parser combinators that can do just this. Using a parser I build an Abstract Syntax Tree (AST) which is a representation of the formula in a tree form. An AST for 3 + 4 and (3 + 4) * 4 looks like:
ast

The AST representation of the formula can then easily be printed in a reverse polish notation, as the code shows. The final Parser looks like this:

import scala.util.parsing.combinator.syntactical._

abstract class Expr {
  def rpn:String
}
case class BinaryOperator(lhs:Expr, op:String, rhs:Expr) extends Expr {
	def rpn:String = lhs.rpn + " " + rhs.rpn + " " + op
}
case class Number(v:String) extends Expr { def rpn:String = v }
case class Variable(v:String) extends Expr { def rpn:String = v }
case class Function(f:String, e:List[Expr]) extends Expr { def rpn:String = {
	var s = ""
	e.foreach { x => s += x.rpn + " " }
	s += f
	return s
  }
}
object ShuntingYard extends StandardTokenParsers {
    lexical.delimiters ++= List("+","-","*","/", "^","(",")",",")

    def value :Parser[Expr] = numericLit ^^ { s => Number(s) }
    def variable:Parser[Expr] =  ident ^^ { s => Variable(s) }
    def parens:Parser[Expr] = "(" ~> expr <~ ")"
    
    def argument:Parser[Expr] = expr <~ (","?)
    def func:Parser[Expr] = ( ident ~ "(" ~ (argument+) ~ ")" ^^ { case f ~ _ ~ e ~ _ => Function(f, e) })

    def term = (value | parens | func | variable)

    // Needed to define recursive because ^ is right-associative
    def pow :Parser[Expr] = ( term ~ "^" ~ pow ^^ {case left ~ _ ~ right => BinaryOperator(left, "^", right) }|
    			term)
    def factor = pow * ("*" ^^^ { (left:Expr, right:Expr) => BinaryOperator(left, "*", right) } |
                        "/" ^^^ { (left:Expr, right:Expr) => BinaryOperator(left, "/", right) } )
    def sum =  factor * ("+" ^^^ { (left:Expr, right:Expr) => BinaryOperator(left, "+", right) } |
    					"-" ^^^ { (left:Expr, right:Expr) => BinaryOperator(left, "-", right) } )
    def expr = ( sum | term )

    def parse(s:String) = {
        val tokens = new lexical.Scanner(s)
        phrase(expr)(tokens)
    }
    
    def shunt(exprstr: String) : String = exprstr match {
      case null => return ""
      case "" => return ""
      case _ =>
    	parse(exprstr) match {
            case Success(tree, _) =>
                println("Tree: "+tree)
                val v = tree.rpn
                println("RPN: "+v)
                return v
            case e: NoSuccess => Console.err.println(e)
            	return e.toString
        }
    }
}

Of course I tested this using the following unit test, of which I will show only a small part, one can easily complete this by looking at Brett's examples:

import org.junit.Test
import org.junit.Assert._

class ShuntingYardTest {
  @Test
  def test_11() {
    assertEquals("4 g f", ShuntingYard.shunt("f(g(4))"))
  }
  
  @Test
  def test_12() {
    assertEquals("3 4 19 f", ShuntingYard.shunt("f(3, 4, 19)"))
  }
  
  @Test
  def test_13() {
    assertEquals("3 4 2 * 1 5 - 2 3 ^ ^ / +", ShuntingYard.shunt("3 + 4 * 2 / ( 1 - 5 ) ^ 2 ^ 3"))
  }
  
  @Test
  def test_14 () {
    assertEquals("4 5 + 1 a 2 ^ + 8 b + 10 * f", ShuntingYard.shunt("f(4+5,1+a^2,(8+b)*10)"))
  }
}

Of all the tests that Brett defined, only one doesn't pass, can you spot which one? Of course as this is one of my first exercises in Scala, it probably isn't optimal yet, any improvements are welcome in the comments!

Comments (5)

  1. Mahsa - Reply

    June 15, 2015 at 12:28 am

    I am trying to test this code but I cannot compile it. I am having too many errors I was wondering if you can help me figure them out!

    • jvanerp@xebialabs.com - Reply

      June 25, 2015 at 11:56 am

      Commented below, clicked the wrong reply button:

      Hi Mahsa,

      If you're using Scala 2.11+ you should include the "org.scala-lang.modules:scala-parser-combinators_2.11:1.0.4" jar at least. This code was written way before that (2009). If you could send the problems, or the project you're working on, I might be able to help you out.

      - Jeroen

  2. Rubic - Reply

    June 24, 2015 at 9:42 pm

    Great Job, I was wondering how would the code look like to perform the calculation after making the postfix expression?

  3. jvanerp@xebialabs.com - Reply

    June 25, 2015 at 9:30 am

    Hi Mahsa,

    If you're using Scala 2.11+ you should include the "org.scala-lang.modules:scala-parser-combinators_2.11:1.0.4" jar at least. This code was written way before that (2009). If you could send the problems, or the project you're working on, I might be able to help you out.

    - Jeroen

  4. Mahsa - Reply

    June 25, 2015 at 5:26 pm

    Hi Jeroen,
    I finally was able to compile it using some changes. I am totally new in Scala thanks for helping me. in my program user will send an address to the program which I am reading it as a variable and I should create a RDD corresponding to that and make it run. What I am doing is that checking the input in my parser for syntax then with the help of your code I am turning the infix to postfix, in the Variable case class I am reading my xml file like this : " def rpn:String = "hdfs://172.30.50.2:8020/yzyan/xml/test3D.xml" " Note that I might have several inputs like this for instance the input can be :" ([path] +1) * [path] " the program will replace the paths with corresponding xml file I jut simplify it assuming I only have one xml, then I want to have an RDD type in the Variable segment, and do a calculation.

    /*** this trait is for identifying roots like : /hdfs://xxx.xx.xx.x:xxxx/path/file.a3d ***/
    trait pathIdentifier extends RegexParsers{

    def pathIdent: Parser[String] ={
    """/hdfs://([\d\.]+):(\d+)/([\w/]+/(\w+\.\w+))""".r
    }
    }

    class ParseExp extends JavaTokenParsers with pathIdentifier {

    def exprs: Parser[Any] = terms~rep("+"~terms | "-"~terms)
    def terms: Parser[Any] = factors~rep("*"~factors | "/"~factors)
    def factors: Parser[Any] = pathIdent | floatingPointNumber | "("~exprs~")"

    val delimiters = Set('+','-','*','/', '**','(',')',',')
    def delimiter: Parser[Char] = acceptIf(delimiters.contains)(e => "expected delimiter but was "+e)

    the rest of it right now is your code and I run it and get the RPN successfully , now I want to perform calculations on that, make it simple suppose that I have variables with any name for instance a+1 inside the program it knows if the cariable is a it will be equal to a number how would the function look like for the calculations?
    Thanks

Add a Comment