Compiler Frontend Phases

This case study collects several smaller case studies related to compiler frontend phases. In particular, we showcase a pull-based lexer, a parser with backtracking behavior, a pretty printer, and an ANF transformer.


Pull-based Lexing

In this case study, we show how to implement a pull-based lexer in terms of effect handlers.

import regex

Tokens and Positions

First we define the datatypes to represent lexemes (tokens) and positions in the input stream:

record Position(line: Int, col: Int, index: Int)

type TokenKind { Number(); Ident(); Punct(); Space() }

def show(t: TokenKind): String = t match {
  case Number() => "number"
  case Ident()  => "identifier"
  case Punct()  => "punctuation"
  case Space()  => "space"
}

def infixEq(t1: TokenKind, t2: TokenKind): Bool =
  (t1, t2) match {
    case (Number(), Number()) => true
    case (Ident(), Ident()) => true
    case (Punct(), Punct()) => true
    case (Space(), Space()) => true
    case _ => false
  }

record Token(kind: TokenKind, text: String, position: Position)

def show(t: Token): String = t.kind.show

Tokens are tagged with a token kind (distinguishing numbers, identifiers, punctuation, and whitespace), the original text of the token, and its position. The function infixEq defines equality on token kinds, which lets us compare them with == later in the parser.

The Lexer Effect

Next, we define the interface to the lexer as an effect signature.

interface Lexer {
  def peek(): Option[Token]
  def next(): Token
}

It consists of two effect operations: one to inspect the next token without consuming it (peek) and one to advance in the stream of tokens (next). This describes the interface of a pull-based lexer: tokens are only produced on demand. An example program using the Lexer effect is:

def example1() = {
  val t1 = do next();
  val t2 = do next();
  val t3 = do next();
  (t1, t2, t3)
}
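
Note that only next consumes a token, while peek merely inspects it. For instance, the following (hypothetical) consumer looks ahead before deciding whether to advance:

// hypothetical example, only for illustration
def examplePeek(): Option[Token] / Lexer = do peek() match {
  case Some(_) => Some(do next())
  case None()  => None()
}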

Handling the Lexer Effect with a Given List

A dummy lexer reading lexemes from a given list can be implemented as a handler for the Lexer effect. The definition uses the effect LexerError to signal the end of the input stream:

effect LexerError(msg: String, pos: Position): Nothing
val dummyPosition = Position(0, 0, 0)

def lexerFromList[R](l: List[Token]) { program: => R / Lexer }: R / LexerError = {
  var in = l;
  try { program() } with Lexer {
    def peek() = in match {
      case Nil() => resume(None())
      case Cons(tok, _) => resume(Some(tok))
    }
    def next() = in match {
      case Nil() => do LexerError("Unexpected end of input", dummyPosition)
      case Cons(tok, rest) => { in = rest; resume(tok) } // advance the stream before resuming
    }
  }
}

We define a separate handler to report lexer errors to the console:

def report { prog: => Unit / LexerError }: Unit =
  try { prog() } with LexerError { (msg, pos) =>
    println(pos.line.show ++ ":" ++ pos.col.show ++ " " ++ msg)
  }

Given a list of example tokens

val exampleTokens = [
  Token(Ident(), "foo", dummyPosition),
  Token(Punct(), "(", dummyPosition),
  Token(Punct(), ")", dummyPosition)
]

we can compose the two handlers to run our example consumer:

report {
  exampleTokens.lexerFromList {
    inspect(example1())
  }
}

Handling the Lexer Effect by Processing a String

Of course, we can also implement a handler for our Lexer effect that actually processes an input string and computes the tokens contained therein.

This time, we use regular expressions to recognize lexemes. First, we define the token descriptors as a list of pairs of token kinds and regular expressions.

record TokenRx(kind: TokenKind, rx: Regex)

val tokenDesriptors = [
  TokenRx(Number(), "^[0-9]+".regex),
  TokenRx(Ident(),  "^[a-zA-Z]+".regex),
  TokenRx(Punct(),  "^[=,.()\\[\\]{}:]".regex),
  TokenRx(Space(),  "^[ \t\n]+".regex)
]

Now, we can define the function lexer which receives a string as input as well as a computation prog for which it handles the Lexer effect:

def lexer[R](in: String) { prog: => R / Lexer } : R / LexerError = {
  // Additionally, we keep track of the current position in the input stream by maintaining
  // three mutable variables: the zero-based index, and the one-based column and line positions.
  var index = 0
  var col = 1
  var line = 1
  // A few local helper functions ease the handling of the input stream.
  // At the same time, we need to keep track of the line information.
  def position() = Position(line, col, index)
  def input() = in.substring(index)
  def consume(text: String): Unit = {
    with ignore[MissingValue]
    val lines = text.split("\n")
    val offset = lines.last.length
    // compute new positions
    index = index + text.length
    line = line + lines.size - 1
    if (lines.size == 1) { col = col + text.length } else { col = offset }
  }
  def eos(): Bool = index >= in.length
  // The function `tryMatch` applies a given token descriptor to the current position of
  // the input stream, without advancing it. Its companion `tryMatchAll` returns the first
  // token matched by any of the descriptors in the given list.
  def tryMatch(desc: TokenRx): Option[Token] =
      desc.rx.exec(input()).map { m => Token(desc.kind, m.matched, position()) }

  def tryMatchAll(descs: List[TokenRx]): Option[Token] = descs match {
    case Nil() => None()
    case Cons(desc, descs) => tryMatch(desc).orElse { tryMatchAll(descs) }
  }
  // Now defining the lexer is trivial. We just need to use `tryMatchAll` and either consume
  // the input, or not.
  try { prog() } with Lexer {
    def peek() = resume(tryMatchAll(tokenDesriptors))
    def next() =
      if (eos())
        do LexerError("Unexpected EOS", position())
      else {
        val tok = tryMatchAll(tokenDesriptors).getOrElse {
          do LexerError("Cannot tokenize input", position())
        }
        consume(tok.text)
        resume(tok)
      }
  }
}

Running our above consumer with the string "foo()":

  report {
    lexer("foo()") {
      inspect(example1())
    }
  }

Whitespace Skipping

Interestingly, a whitespace-skipping lexer can be implemented as an effect transformer, that is, a handler that (partially) re-raises effect operations.

def skipSpaces(): Unit / Lexer = do peek() match {
  case None() => ()
  case Some(Token(Space(), _, _)) => do next(); skipSpaces()
  case _ => ()
}

def skipWhitespace[R] { prog: => R / Lexer }: R / Lexer =
  try { prog() } with Lexer {
    def peek() = { skipSpaces(); resume(do peek()) }
    def next() = { skipSpaces(); resume(do next()) }
  }

The handler skipWhitespace simply skips all spaces by using the Lexer effect itself.

  report {
    lexer("foo (   \n  )") {
      skipWhitespace {
        inspect(example1())
      }
    }
  }

Parsing

In this case study, we show how to implement a parser, using the lexer from the Lexer case study.

Parsers can be expressed by using the Lexer effect to process the token stream. To model the different alternatives in a grammar, we use the following effect for non-determinism:

interface Nondet {
  def alt(): Bool
  def fail(msg: String): Nothing
}

effect Parser = { Nondet, Lexer }

Parser Combinators

Given these two effects, we can readily define a host of (imperative) parser combinators. We start with the simplest one, which applies a predicate to the next element in the input stream and fails if it does not match.

def accept { p: Token => Bool } : Token / Parser = {
  val got = do next();
  if (p(got)) got
  else do fail("Unexpected token " ++ got.show)
}

Using accept, we can define parsers for the different token types.

def any() = accept { t => true }
def accept(exp: TokenKind) = accept { t => t.kind == exp }
def ident() = accept(Ident()).text
def number() = accept(Number()).text
def punct(p: String) = {
  val tok = accept(Punct())
  if (tok.text == p) ()
  else do fail("Expected " ++ p ++ " but got " ++ tok.text)
}
def kw(exp: String): Unit / Parser = {
  val got = ident();
  if (got == exp) ()
  else do fail("Expected keyword " ++ exp ++ " but got " ++ got)
}

Using the effect for non-deterministic choice alt, we can model alternatives, optional matches and various repetitions:

def or[R] { p: => R } { q: => R } =
  if (do alt()) { p() } else { q() }

def opt[R] { p: => R }: Option[R] / Parser =
  or { Some(p()) } { None() }

def many { p: => Unit }: Unit / Parser =
  or { some { p() } } { () }

def some { p: => Unit }: Unit / Parser =
  { p(); many { p() } }
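
These combinators compose further. For instance, a (hypothetical) derived combinator for one or more occurrences of p separated by sep can be written directly in terms of many:

// hypothetical helper, defined only for illustration
def someSep { p: => Unit } { sep: => Unit }: Unit / Parser = {
  p(); many { sep(); p() }
}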

Example: A Simple Expression Language

We illustrate the usage of the above parser combinators by parsing a simple expression language.

type Tree {
  Lit(value: Int)
  Var(name: String)
  Let(name: String, binding: Tree, body: Tree)
  App(name: String, arg: Tree)
}

Let us start by defining the parser for numeric literals.

def parseNum(): Tree / Parser = {
  val numText = number();
  with default[WrongFormat, Tree] { do fail("Expected number, but cannot convert input to integer: " ++ numText) };
  Lit(numText.toInt)
}

We simply call the number() parser and try to convert the resulting string to an integer.

def parseVar(): Tree / Parser =
  Var(ident())

def parseAtom() = or { parseVar() } { parseNum() }

Parsing variables is simply a matter of reusing the ident parser and wrapping the result. Users of monadic parser combinator libraries might be delighted to see that we do not need map or anything similar to transform the result. The parser is written in direct style, while still offering great flexibility in modifying the semantics of effects.

Similarly, we can write parsers for let bindings, by sequentially composing our existing parsers:

def parseLet(): Tree / Parser = {
  kw("let");
  val name = ident();
  punct("=");
  val binding = parseExpr();
  kw("in");
  val body = parseExpr();
  Let(name, binding, body)
}

def parseGroup() = or { parseAtom() } {
  punct("(");
  val res = parseExpr();
  punct(")");
  res
}

def parseApp(): Tree / Parser = {
  val funName = ident();
  punct("(");
  val arg = parseExpr();
  punct(")");
  App(funName, arg)
}

def parseExpr(): Tree / Parser =
  or { parseLet() } { or { parseApp() } { parseGroup() } }

Again, note how naturally the result can be composed from the individual results, much like manually writing a recursive descent parser. Compared to handcrafted parsers, the imperative parser combinators presented here offer a similar flexibility. At the same time, the semantics of alt and fail is still left open, offering flexibility in the implementation of the actual underlying parsing algorithm.

Example: Combining Parsers and Local Mutable State

It is possible to combine the imperative parser combinators with local mutable state. The implementation of local variables in Effekt is designed to interact well with continuation capture and multiple resumptions. This is important for use cases like the parser example where the continuation is potentially called multiple times.

The following example implements a parser for the grammar

<EXPR> ::= <NUMBER> | <IDENT> `(` <EXPR> (`,` <EXPR>)*  `)`

It uses a local (mutable) variable to count the number of leaves as a semantic action.

def parseCalls(): Int / Parser =
  or { number(); 1 } {
    var count = 1;
    ident();
    punct("(");
    count = count + parseCalls();
    many {
        punct(",");
        count = count + parseCalls()
    };
    punct(")");
    count
  }

Notice how the user-defined combinator many feels like a built-in control operator such as while.

Backtracking Parsers

As mentioned above, so far we have used the non-determinism effect but have not specified what it means. We could implement depth-first parsing, corresponding to recursive descent. Alternatively, we could implement breadth-first parsing. Furthermore, in case of ambiguities, we can choose whether to recognize only the first result or to compute all possible ways to parse the input.
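
For instance, instead of stopping at the first result, a handler for Nondet could collect all successful results by resuming both branches of every choice. The following sketch (hypothetical, using a small helper concat for list concatenation) illustrates the idea:

// hypothetical helper for list concatenation
def concat[A](l1: List[A], l2: List[A]): List[A] = l1 match {
  case Nil() => l2
  case Cons(x, rest) => Cons(x, concat(rest, l2))
}

// hypothetical handler: collect all results instead of only the first one
def allResults[R] { p: => R / Nondet }: List[R] =
  try { Cons(p(), Nil()) } with Nondet {
    def alt() = concat(resume(true), resume(false))
    def fail(msg) = Nil()
  }

Like the parse handler below, such a handler would have to enclose the lexer so that the input position is restored on every resumption.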

For this case study, we implement a depth-first parser, computing the first successful result. Results are represented by the ParseResult datatype. The parsing algorithm is simply implemented as a handler for Parser.

type ParseResult[R] {
  ParseSuccess(t: R);
  ParseFailure(msg: String)
}

def parse[R](input: String) { p: => R / Parser }: ParseResult[R] = try {
  lexer(input) { skipWhitespace { ParseSuccess(p()) } }
} with Nondet {
  def alt() = resume(true) match {
    case ParseFailure(msg) => resume(false)
    case ParseSuccess(res) => ParseSuccess(res)
  }
  def fail(msg) = ParseFailure(msg)
} with LexerError { (msg, pos) =>
  ParseFailure(msg)
}

The handler reuses the lexer implementation from the lexer case study. The lexer raises a LexerError in case of an unexpected end of the input stream or if it cannot recognize a token. These lexer errors are simply converted into parse failures, from which the parser can backtrack. To establish the backtracking behavior, it is important that the lexer is executed under the parser handler and not the other way around: only then is the lexer position restored when the continuation is called a second time with resume(false).

Running the Examples

Having implemented a handler for the Parser effect, we can run our example “grammars” on some inputs.

inspect(parse("42") { parseCalls() })
inspect(parse("foo(1)") { parseCalls() })
inspect(parse("foo(1, 2)") { parseCalls() })
inspect(parse("foo(1, 2, 3, 4)") { parseCalls() })
inspect(parse("foo(1, 2, bar(4, 5))") { parseCalls() })
inspect(parse("foo(1, 2,\nbar(4, 5))") { parseCalls() })

inspect(parse("}42") { parseExpr() })
inspect(parse("42") { parseExpr() })
inspect(parse("let x = 4 in 42") { parseExpr() })
inspect(parse("let x = let y = 2 in 1 in 42") { parseExpr() })
inspect(parse("let x = (let y = 2 in 1) in 42") { parseExpr() })
inspect(parse("let x = (let y = f(42) in 1) in 42") { parseExpr() })
inspect(parse("let x = (let y = f(let z = 1 in z) in 1) in 42") { parseExpr() })

Pretty Printer

In this case study, we implement a (naive) pretty printer library that performs backtracking to find the first layout that matches the screen width. We build on the work by

S. Doaitse Swierstra and Olaf Chitil. Linear, bounded, functional pretty-printing. Journal of Functional Programming (JFP), 2009.

but adapt it to the setting of effect handlers. However, the library presented here is neither linear (it uses simple backtracking), nor bounded (backtracking is arbitrary), nor functional (the combinators are in imperative style).

Similar to the parser case study, the pretty printing library is based on a non-deterministic description of layouts. In particular, to fill the horizontal space as well as possible, we first try to pretty print horizontally and fall back to vertical mode if not enough space is available. For example, the following text overflows when laid out horizontally:

hello world

Treating the space as a potential line break, we can try again, this time vertically.

hello
world

Pretty printing also often includes ways to group parts of a document. If a group does not fit horizontally, all of its elements will be laid out vertically:

hello (this (is too) long)

Even though the first two words of the group would fit, the document will be pretty printed as

hello (this
is too
long)

Layout as Effects

The layout configuration is modeled by the following data types and effects:

type Direction { Horizontal(); Vertical() }

effect Indent(): Int
effect DefaultIndent(): Int
effect Flow(): Direction

The effect Flow represents the current layout direction. The indentation of the document also depends on the context and is therefore modeled as an effect.

Computing the layout of a document to be pretty printed uses the above three effects:

effect Layout = { Indent, DefaultIndent, Flow }

Output: A Stream of Layout Elements

Before we look at examples of how to use the Layout effect, we introduce yet another effect to emit the resulting document:

interface Emit {
  def emitText(content: String): Unit
  def emitNewline(): Unit
}
def text(content: String) = do emitText(content)
def newline() = do emitNewline()

This way, the resulting document is represented as a stream of text and newline events.

Pretty Printing Combinators

Using the text emitter and the layout configuration, we can define simple functions that express spacing between layout elements.

def space() =
  text(" ")

def spaces(n: Int) =
  if (n > 0) text(" ".repeat(n)) else ()

We can also express a function that, depending on the current flow direction, prints either the supplied text or a newline:

def lineOr(replace: String) = do Flow() match {
  case Horizontal() =>
    text(replace)
  case Vertical() =>
    newline()
    text(" ".repeat(do Indent()))
}

In the vertical case, it also outputs the current indentation.

def line() =
  lineOr(" ")

def linebreak() =
  lineOr("")

We purposefully distinguish between potential line breaks that are rendered horizontally as a space (line) and those that are not rendered at all in horizontal mode (linebreak).
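
To make the difference concrete, consider the following (hypothetical) document:

// hypothetical example document
def lineVsBreak() = {
  text("foo"); line(); text("bar"); linebreak(); text("baz")
}

Laid out horizontally it renders as foo barbaz, while vertically each of the three words starts on its own line.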

Indentation

Indentation can be configured by locally handling Layout and thereby changing the indentation:

// Uses `n` as the indentation in the given document
def in[R](n: Int) { doc: => R / Layout }: R / Layout =
  try { doc() }
  with Indent { () => resume(n) }

// Adds `j` to the indentation in the current document
def nest[R](j: Int) { doc: => R / Layout }: R / Layout =
  (do Indent() + j).in { doc() }

// Uses the default indentation to nest a document
def nested[R] { doc: => R / Layout }: R / Layout =
  nest(do DefaultIndent()) { doc() }

Fixing Local Layout Choices: Grouping as Handlers

Layout combinator libraries often include a way to group components, so that all components of a group are laid out either horizontally or vertically. Similarly, we can implement handlers that locally fix the direction:

def in[R](dir: Direction) { doc: => R / Layout }: R / Layout =
  try { doc() }
  with Flow { () => resume(dir) }

def horizontal { p: => Unit / Layout }: Unit / Layout =
  Horizontal().in { p() }

def vertical { p: => Unit / Layout }: Unit / Layout =
  Vertical().in { p() }

This way, we can locally determine whether all children of a group should be laid out horizontally or vertically. Often, however, we want to mark choice points and then search for a solution by trying the different choices until a layout fits the screen width. To express these choices, we add the following LayoutChoice effect:

interface LayoutChoice {
  def choice(): Direction
  def fail[A](): A
}

The LayoutChoice effect is very similar to the Nondet effect in the parser case study.

We define the following effect alias for pretty printing documents that depend on layout choices:

effect Pretty = { Emit, Layout, LayoutChoice }

Using layout choices, we can express what is maybe the most important pretty printing combinator:

def group { p: => Unit / Layout } =
  do choice().in { p() }

The group combinator expresses that, depending on the result of choice, all of its children are laid out either horizontally or vertically.

Using group, we can for instance express variants of line and linebreak:

def softline() =
  group { line() }

def softbreak() =
  group { linebreak() }

Regardless of the enclosing group, a softline inserts a line break if the remainder does not fit into the available space.
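
For example, the following (hypothetical) document breaks exactly when needed:

// hypothetical example: breaks only if "hello world" does not fit
def softlineExample() = {
  text("hello"); softline(); text("world")
}

At width 11 this renders as hello world on a single line; at width 5 the softline turns into a line break, yielding the two-line layout from the introduction.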

Examples

Using the combinators we can express examples from the paper by Swierstra and Chitil.

def example1(l: List[Int]) = {
  text("[");
  l.foreach { n =>
    text(show(n));
    text(",");
    line()
  };
  text("]")
}

This lays out a list by potentially adding a line break after each comma. Pretty printing the list [1, 2, 3, 4] at width 5 gives:

-----
[1,
2, 3,
4, ]

For simplicity, we keep the trailing comma.

The following example represents the document (Hi you)!!!

def example2() = {
  group { text("Hi"); line(); text("you") };
  text("!!!")
}

which pretty printed at width 6 yields:

------
Hi
you!!!

The next example illustrates that revising the layout decision for the parent group can also lead to a revision of the layout decision of the child group (due to indentation).

def example3() = {
  group {
    text("this");
    nest(9) {
      line();
      group { text("takes"); line(); text("four") }
    };
    line();
    text("lines")
  }
}

Pretty printing results in:

---------------
this
         takes
         four
lines

Implementing the Pretty Printer

So far, we have seen how the different effects can be used to describe combinators. We are now ready to actually implement pretty printing by providing the necessary handlers. We start by implementing a handler for LayoutChoice that searches for the first successful layout:

def searchLayout[R] { p : => R / LayoutChoice }: Option[R] =
  try { Some(p()) }
  with LayoutChoice {
    def fail[A]() = None()
    def choice() = resume(Horizontal()).orElse { resume(Vertical()) }
  }

The handler is essentially the same as the backtracking implementation of the parser case study, but it uses optional values to indicate success or failure of a layout attempt. If laying out horizontally fails for a particular choice point, we resume a second time with Vertical().

Handling the output emitter is straightforward. Here, we simply store all emitted elements in a string:

def writer { p: => Unit / Emit } = {
  var out = "";
  try { p(); out } with Emit {
    def emitText(t) = { out = out ++ t; resume(()) }
    def emitNewline() = { out = out ++ "\n"; resume(()) }
  }
}
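
For instance, running the writer handler on its own (a hypothetical usage) returns the accumulated string:

// hypothetical usage of the writer handler
def writerExample(): String =
  writer { text("hello"); newline(); text("world") }
// evaluates to "hello\nworld"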

We can implement a handler for Layout by intercepting emit effects and keeping track of the current column position.

def printer(width: Int, defaultIndent: Int) { prog: => Unit / { Emit, Layout } } : Unit / { Emit, LayoutChoice } = {
  // the position in the current line
  var pos: Int = 0;

  try { prog() }
  // we allow flow to be flexible on the top-level
  with Flow { () => resume(do choice()) }
  // indentation starts at 0
  with Indent { () => resume(0) }
  // simply handle the default indentation with a constant
  with DefaultIndent { () => resume(defaultIndent) }
  with Emit {
    def emitText(t) = {
      pos = pos + t.length;
      if (pos > width) { do fail() }
      else { text(t); resume(()) }
    }
    def emitNewline() = { pos = 0; newline(); resume(()) }
  }
}

Maybe most interestingly, we update the current position and invoke the effect operation fail if the document exceeds the width. This will potentially cause backtracking and revision of a preceding layout decision. If the current text still fits the line, we simply re-emit it.

Finally, we can compose the different handlers to a single pretty printing handler:

def pretty(width: Int) { doc: => Unit / Pretty }: String = {
  val result = searchLayout { writer { printer(width, 2) { doc() } } };
  result.getOrElse { "Cannot print document, since it would overflow." }
}

For simplicity, we output the string “Cannot print …” if the document does not fit the line width and we cannot find a layout. The order of handlers is important: the handler searchLayout, which performs the backtracking, needs to be the outermost one. This way, the mutable variable in writer, which contains the document printed so far, is reset to its previous state upon backtracking.

Using the Pretty Printing Library

We can of course define some additional combinators to describe documents:

def parens { p: => Unit }: Unit / Pretty = {
  text("("); p(); text(")")
}

def braces { p: => Unit }: Unit / Pretty = {
  text("{"); p(); text("}")
}

and write a pretty printer for the example Tree from the parser case study:

def toDoc(t: Tree): Unit / Pretty = t match {
  case Lit(value) => text(show(value))
  case Var(name) => text(name)
  case Let(name, binding, body) =>
    text("let"); space(); text(name); space(); text("=");
    group {
      nested { line(); toDoc(binding) };
      line();
      text("in")
    };
    group { nested { line(); toDoc(body) } }

  case App(name, arg) =>
    text(name); parens {
      group { nested {
        linebreak();
        toDoc(arg)
      }; linebreak() }
    }
}

We can first use the parser from the parser case study to obtain a parse tree, which we then pretty print:

def parseAndPrint(text: String, width: Int): String =
  parse(text) { parseExpr() } match {
    case ParseSuccess(tree) => pretty(width) { toDoc(tree) }
    case ParseFailure(text) => text
  }

For example, we obtain

def example4() = parseAndPrint("let x = (let y = 2 in 1) in 42", 10)
// ----------
// let x =
//   let y =
//     2
//   in 1
// in 42

Additional Examples

println("-----");
println(pretty(5) { example1([1,2,3,4]) });

println("----------");
println(pretty(10) { example1([1,2,3,4,5,6,7,8,9,1,2,3,4]) });

println("----------")
println(example4())

def example4b() = {
  text("def"); space(); text("foo"); parens {
    group {
      nest(2) {
        linebreak();
        group { text("x"); text(":"); space(); text("Int"); text(",") };
        line();
        group { text("y"); text(":"); space(); text("String") }
      };
      linebreak()
    }
  }
}
def example3b() = {
  example4b();
  space();
  braces {
    group {
      nest(2) {
        line();
        text("var"); space(); text("z"); space(); text("="); space(); text("42"); text(";")
      };
      line()
    }
  }
}

def example6() = {
  group {
    text("this");
    nest(9) {
      line();
      group { text("takes"); line(); text("many"); line(); text("f") }
    };
    line();
    text("l")
  }
}

def example7() = {
  group {
    text("this");
    line();
    text("will");
    nest(9) {
      line();
      group { text("take"); line(); text("many") }
    };
    line();
    text("lines")
  }
}

def helloWorld() = {
  text("hello")
  line()
  text("world")
}

println("------------------------------");
println(pretty(30) { example4b() });
println("--------------------");
println(pretty(20) { example4b() });

println("----------");
println(pretty(50) { example3b() });
println(pretty(15) { example3b() });

println("------");
println(pretty(6) { example2() });

println("---------------");
println(pretty(15) { example3() });

println("--------------");
println(pretty(14) { example6() });

println("--------------");
println(pretty(14) { example7() })
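
Finally, the helloWorld document corresponds to the motivating example from the beginning of this case study. Printing it at width 5 (a hypothetical addition to the runs above) forces the vertical layout:

// hypothetical run of the helloWorld document
println("-----")
println(pretty(5) { helloWorld() })

which renders hello and world on two separate lines.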

ANF Transformation

In this case study we implement a simple ANF transformation by using an effect to non-locally insert binders for non-trivial expressions.

Source Language

The source language of our transformation is the Tree data type from the parser case study.

To recall the Tree datatype, here is an example tree:

// let x = f(g(42)) in x
val exampleTree: Tree =
  Let("x", App("f", App("g", Lit(42))), Var("x"))

Target Language

The target language, on the other hand, is a language that distinguishes between effectful statements and pure expressions. We could of course also use Tree. However, using a separate language that makes this distinction between expressions and statements is more interesting, since the normal form is then encoded in the types.

type Expr {
  CLit(value: Int);
  CVar(name: String)
}
type Stmt {
  CLet(name: String, binding: Stmt, body: Stmt);
  CApp(name: String, arg: Expr);
  CRet(expr: Expr)
}

We prefix all constructors with C... to distinguish them from the source language (“C” for “Core”).

Utility Effects

Before we start with the definition of the transformation, we define a utility effect to generate fresh names.

effect Fresh(): String
def freshVars[R] { prog: => R / Fresh } : R = {
    var i = 0;
    try { prog() }
    with Fresh { () => i = i + 1; resume("x" ++ show(i)) }
}
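
For example (a hypothetical usage), each operation call yields the next name in sequence:

// hypothetical usage of freshVars
def freshExample() = freshVars {
  val a = do Fresh();
  val b = do Fresh();
  (a, b)
}
// evaluates to ("x1", "x2")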

ANF: The Bind Effect

Now to the core of this case study: the Bind effect. Here it becomes visible why we chose to transform into the language of Stmt and Expr. The Bind effect operation takes a statement and somehow converts it into an expression.

effect Bind(e: Stmt): Expr

We define the ANF transformation from Tree to Stmt by calling Bind every time we encounter a statement in a position where an expression is required:

def traverse(e: Tree): Stmt / { Bind, Fresh } = e match {
  case Lit(n) => CRet(CLit(n))
  case Var(n) => CRet(CVar(n))
  case App(name, arg) =>
    // Here we use Bind since, unlike App, CApp requires an expression as its argument
    CApp(name, do Bind(traverse(arg)))
  case Let(x, b, body) =>
    // here we use the handler `bindHere` to mark positions where bindings
    // should be inserted.
    CLet(x, bindHere { traverse(b) }, bindHere { traverse(body) })
}

def bindHere { prog: => Stmt / Bind } : Stmt / Fresh =
  try { prog() }
  with Bind { (e) =>
    val id = do Fresh()
    CLet(id, e, resume(CVar(id)))
  }

The handler bindHere handles Bind by generating a fresh name and inserting a let binding. Note that the let binding encloses the overall result of prog; it is inserted on the outside.
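
For instance (a hypothetical example, run under freshVars), binding the argument of an application inserts the let around the resulting application:

// hypothetical illustration of bindHere
def bindExample() = freshVars {
  bindHere { CApp("f", do Bind(CRet(CLit(1)))) }
}
// evaluates to CLet("x1", CRet(CLit(1)), CApp("f", CVar("x1")))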

The overall ANF transformation is simply a matter of composing the two handlers and calling traverse:

def translate(e: Tree): Stmt =
    freshVars { bindHere { traverse(e) } }

For our example, calling translate results in:

val exampleResult = translate(exampleTree)
//> CLet(x, CLet(x1, CRet(CLit(42)),
//    CLet(x2, CApp(g, CVar(x1)), CApp(f, CVar(x2)))),
//    CRet(CVar(x)))

which corresponds to let x = (let x1 = 42 in let x2 = g(x1) in f(x2)) in x.

The Full Pipeline: Combining the Case Studies

So far, we have defined a lexer that handles the Lexer effect, a parser that uses the Lexer effect and non-determinism, a pretty printer that again uses some form of non-determinism, and finally an ANF transformation that non-locally inserts bindings.

Of course, we can combine the different components into one big pipeline. We start by defining a pretty printer for the language of expressions and statements:

def toDocExpr(t: Expr): Unit / Pretty = t match {
    case CLit(value) => text(value.show)
    case CVar(name)  => text(name)
}

def toDocStmt(s: Stmt): Unit / Pretty = s match {
    case CLet(name, binding, body) =>
        text("let"); space(); text(name); space(); text("=");
        group {
            nested { line(); toDocStmt(binding) };
            line();
            text("in")
        };
        group { nested { line(); toDocStmt(body) } }

    case CApp(name, arg) =>
        text(name); parens {
            group { nested {
                linebreak();
                toDocExpr(arg)
            }; linebreak() }
        }

    case CRet(expr) =>
        text("return"); space(); toDocExpr(expr)
}

def pretty(s: Stmt) = pretty(40) { toDocStmt(s) }

Using the pretty printer, we can print our example result from above:

val examplePretty = pretty(exampleResult)
// let x = let x1 = return 42 in let x2 =
//      g(x1)
//    in f(x2) in return x

Finally, we can define our pipeline as lexing, parsing, transforming, and pretty printing our input:

def pipeline(input: String): String =
  parse(input) { parseExpr() } match {
    case ParseSuccess(tree) => pretty(translate(tree))
    case ParseFailure(msg) => msg
  }

Here we use pipeline to translate some examples:

inspect(exampleResult)
println(examplePretty)

println("----")
println(pipeline("42"))

println("----")
println(pipeline("let x = 4 in 42"))

println("----")
println(pipeline("let x = let y = 2 in 1 in 42"))

println("----")
println(pipeline("let x = (let y = 2 in 1) in 42"))

println("----")
println(pipeline("let x = (let y = f(42) in 1) in 42"))