In this post I will implement the union-find algorithm in Scala, first in an impure way and then in a purely functional manner, so without any state or side effects. Then we can check both implementations and compare the code and also the performance.

The reason I chose union-find for this blog is that it is relatively simple. It is a classic algorithm that is used to solve the following problem: suppose we have a set of objects. Each of them can be connected to zero or more others. And connections are transitive: if A is connected to B and B is connected to C, then A is connected to C as well. Now we take two objects from the set, and we want to know: are they connected or not?
This problem comes up in a number of area's, such as in social networks (are two people connected via friends or not), or in image processing (are pixels connected or separated).
Because the total number of objects and connections in the set might be huge, the performance of the algorithm is important.

Quick Union

The implementation I chose is the so called Quick Union implementation. It scales well but there are still faster implementations around, one of which is given in the references below the article. For this post I chose to keep things simple so we can focus on comparing the two implementations.

The algorithm keeps track of connected elements with a data structure: it represents every element as a Node which points to another element to which it is connected. Every Node points to only one Node it is connected to, and this Node it is called its parent. This way, groups of connected Nodes form trees. The root of such a connected tree is a Node which has an empty parent property.
When the question is asked if two Nodes are connected, the algorithm looks up the roots of the connected trees of both Nodes and checks if they are the same.

The tricky part in union find algorithms is to be able to add new connections to a set of elements without losing too much performance. The data structure with the connected trees enables us to do this really well. We start by looking up the root of both elements, and then set the parent element of one tree to the root of the other tree.

Some care must still be taken when doing this, because over time connected trees might become unbalanced. Therefore the size of every tree is kept in its root Node; when connecting two subtrees, the smaller one is always added to the larger one. This guarantees that all subtrees remain balanced.

This was only a brief description of the algorithm but there are some excellent explanations on the Internet. Here is a nice one because it is visual and interactive: visual algo

The Impure Implementation

Now let's see some code! The impure implementation:

import scala.annotation.tailrec

class IUnionFind(val size: Int) {

  private case class Node(var parent: Option[Int], var treeSize: Int)

  private val nodes = Array.fill[Node](size)(new Node(None, 1))

  def union(t1: Int, t2: Int): IUnionFind = {
    if (t1 == t2) return this

    val root1 = root(t1)
    val root2 = root(t2)
    if (root1 == root2) return this

    val node1 = nodes(root1)
    val node2 = nodes(root2)

    if (node1.treeSize < node2.treeSize) {
      node1.parent = Some(t2)
      node2.treeSize += node1.treeSize
    } else {
      node2.parent = Some(t1)
      node1.treeSize += node2.treeSize
    }
    this
  }

  def connected(t1: Int, t2: Int): Boolean = t1 == t2 || root(t1) == root(t2)

  @tailrec
  private def root(t: Int): Int = nodes(t).parent match {
    case None => t
    case Some(p) => root(p)
  }
}

As you can see I used an array of Nodes to represent the connected components. Most textbook implementations use two integer arrays: one for the parents of every element, and the other one for the tree sizes of the components to which the elements belong. Memory wise that is a more efficient implementation than mine. But apart from that the concept of the algorithm stays the same and in terms of speed the difference doesn't matter much. I do think that using Node objects is more readable than having two integer arrays, so I chose for the Nodes.

The purely functional implementation

import scala.annotation.tailrec

case class Node(parent: Option[Int], treeSize: Int)

object FUnionFind {
  def create(size: Int): FUnionFind = {
    val nodes = Vector.fill(size)(Node(None, 1))
    new FUnionFind(nodes)
  }
}

class FUnionFind(nodes: Vector[Node]) {

  def union(t1: Int, t2: Int): FUnionFind = {
    if (t1 == t2) return this

    val root1 = root(t1)
    val root2 = root(t2)
    if (root1 == root2) return this

    val node1 = nodes(root1)
    val node2 = nodes(root2)
    val newTreeSize = node1.treeSize + node2.treeSize

    val (newNode1, newNode2) =
      if (node1.treeSize < node2.treeSize) {
        val newNode1 = Node(Some(t2), newTreeSize)
        val newNode2 = Node(node2.parent, newTreeSize)

        (newNode1, newNode2)
      } else {
        val newNode2 = FNode(Some(t1), newTreeSize)
        val newNode1 = FNode(node1.parent, newTreeSize)

        (newNode1, newNode2)
      }
    val newNodes = nodes.updated(root1, newNode1).updated(root2, newNode2)
    new FUnionFind(newNodes)
  }

  def connected(t1: Int, t2: Int): Boolean = t1 == t2 || root(t1) == root(t2)

  @tailrec
  private def root(t: Int): Int = nodes(t).parent match {
    case None => t
    case Some(p) => root(p)
  }
}

Comparing to the first implementation, some parts remained the same. Such as the Node, except for the fact that it is not an inner class anymore. Also the connected and the root methods did not change.
What did change is the method which deals with updating the connections: union. In the purely functional implementation we can't update any array, so instead it creates a new FUnionFind object and returns it at the end. Also two Node objects need to be created when subtrees are merged; the root of the smaller one because it gets a new parent, and the root of the larger one because its tree size needs to be increased.
Perhaps surprisingly, the pure implementation needs more lines of code than the impure one.

The Performance

The pure implementation has to do a bit of extra work when it creates the new objects in its union method. The question is how much this costs in terms of performance.
To find this out, I ran both implementations through a series of performance tests (using ScalaMeter) where I added a large number of connections to a set of objects. I added a (impure) Java 8 implementation to the test as well.
Here are the results:

Nr of elements and connections Impure Pure Java 8
10000 2.2 s 3.8 s 2.3 s
15000 4.4 s 7.9 s 4.2 s
20000 6.2 s 10.3 s 6.3 s

Not surprisingly, the time grows with the number of connections and elements. The growth is a bit faster than linear, that's because the asymptotic time complexity of the quick union algorithm is of the order n log(n).
The pure algorithm is about 65% slower than the impure implementation. The cause is clear: in every call to union the pure algorithm has to allocate and garbage collect three extra objects.

For completeness I added Java 8 to the test too. The code is not given here but if you're interested, there's a link to the complete source below the article. Its implementation is really similar to the Scala version.

Conclusion

Purely functional code has a couple of advantages over non pure implementations; because of the lack of side effects it can make it easier to reason about blocks of code, also concurrency becomes easier because there is no shared state.
In general it also leads to more concise code because collection methods like map and filter can easily be used. But in this example that was not the case, the pure implementation even needed a few lines extra.
The biggest disadvantage of the pure union-find algorithm was its performance. It depends on the requirements of the project where the code is used if this is a showstopper, or if the better concurrent behavior of the pure implementation outweighs this disadvantage.

Explanation of a faster union-find with path compression
All the source code in the article, including the tests