Measuring Scala performance with JMH

Have you ever found yourself in a position where you wanted to know which variant of the same code works faster? Have you ever had a need to prove that "optimization" X only makes the code slower? If you have, then you probably used JMH. Not so long ago JVM Engineers published compiled JMH into maven. That is the tool's home page: http://openjdk.java.net/projects/code-tools/jmh/

mvn archetype:generate \
  -DinteractiveMode=false \
  -DarchetypeGroupId=org.openjdk.jmh \
  -DarchetypeArtifactId=jmh-scala-benchmark-archetype \
  -DgroupId=org.sample \
  -DartifactId=test \
  -Dversion=1.0

Add the desired Scala runtime version in the pom.xml

Write the test:

class MyBenchmark {

  @GenerateMicroBenchmark
  def testList(): Any = { 
    List(1,2).foldLeft(0)(_ + _)
  }

  @GenerateMicroBenchmark
  def testAdd(): Any = { 
    1 + 2 
  }

  @GenerateMicroBenchmark
  def testSeq(): Any = { 
    Seq(1,2).foldLeft(0)(_ + _)
  }

  @GenerateMicroBenchmark
  def testSet(): Any = {
    Set(1,2).foldLeft(0)(_ + _)
  }
}

Run the test:

mvn clean install
java -jar target/microbenchmarks.jar

Analyze the results:

Benchmark                    Mode   Samples         Mean   Mean error    Units
o.s.MyBenchmark.testAdd     thrpt       200   427263.186      950.821   ops/ms
o.s.MyBenchmark.testList    thrpt       200    25160.431      115.458   ops/ms
o.s.MyBenchmark.testSeq     thrpt       200    33609.219      183.983   ops/ms
o.s.MyBenchmark.testSet     thrpt       200    18407.093       87.920   ops/ms

As you can see, Scala collections do add significant overhead to the "add" operation. It is a little surprising that foldLeft operation performance differs so much among different collections. I expected them to be roughly the same. While in fact basic Seq is almost two times faster than Set.

Happy benchmarking!