What is the most efficient way to iterate collection in Groovy? Let's play with JMH!
I guess you may heard about Groovy’s Collection.each(Closure cl)
method - it was introduced 15 years ago [1] and it was a great alternative for a good old for-loop, for-each or even using an iterator approach. You may also heard, that you should not overuse it, because creating a closure to do such simple operation like collection iteration is an overhead. But what if I tell you that nothing could be further from the truth - Groovy’s each
method may be faster than iterator or Java’s for-each. Sounds interesting? Enjoy the reading!
Preparation
Before we dig deeper, let’s define the initial conditions. We are going to measure an execution time of a few different iteration variants. Each variant will do the same simple operation - it will accumulate the sum of numbers from 0 to 10 millions. We will do it using AtomicLong
object and in the most imperative way possible - it is not about finding the most efficient sum operation - it is about giving all variants a simple job to do, so we can focus on iteration process and draw a conclusion after all.
We are going to use JMH benchmark tool and its Gradle plugin. |
What methods are we going to test? We will compare:
Groovy’s
each
methodJava’s for-loop approach
Java’s for-each approach
Java’s iterator approach
Java 8
forEach
method using closureJava 8
forEach
method using anonymous class
This test uses Groovy 2.4 which does not support Java 8 lambda expressions - that is why we will use Java 8 forEach method with an anonymous class instead. |
The code
Let’s start with build.gradle
file
buildscript {
repositories {
jcenter()
maven {
url "https://plugins.gradle.org/m2/"
}
}
dependencies {
classpath "me.champeau.gradle:jmh-gradle-plugin:0.4.7"
}
}
apply plugin: "idea"
apply plugin: "groovy"
apply plugin: "me.champeau.gradle.jmh"
repositories {
jcenter()
}
dependencies {
compile 'org.codehaus.groovy:groovy-all:2.4.12'
}
jmh {
include = ['bench\\.*'] (1)
benchmarkMode = ['avgt'] (2)
timeUnit = 'ms' (3)
iterations = 120 (4)
timeOnIteration = '1s' (5)
warmup = '60s' (6)
warmupIterations = 1 (7)
batchSize = 1
fork = 1
}
1 | Load test classes from bench. package |
2 | Use average execution time measurement |
3 | Measure time in milliseconds |
4 | Execute 120 iterations |
5 | Each iteration is limited to 1 second |
6 | Warm-up for 60 seconds |
7 | Use a single iteration to warm-up |
And now let’s take a look at our benchmark test class:
package bench
import org.openjdk.jmh.annotations.Benchmark
import org.openjdk.jmh.annotations.Scope
import org.openjdk.jmh.annotations.State
import java.util.concurrent.atomic.AtomicLong
import java.util.function.Consumer
@State(Scope.Benchmark)
class GroovyIterationBenchmarks {
final List<Integer> numbers = 0..10_000_000 (1)
@Benchmark
AtomicLong eachTest() { (2)
final AtomicLong result = new AtomicLong()
numbers.each { result.addAndGet(it) }
return result
}
@Benchmark
AtomicLong forEachTest() { (3)
final AtomicLong result = new AtomicLong()
for (int number : numbers) {
result.addAndGet(number)
}
return result
}
@Benchmark
AtomicLong forLoopTest() { (4)
final AtomicLong result = new AtomicLong()
for (int i = 0; i < numbers.size(); i++) {
result.addAndGet(numbers.get(i))
}
return result
}
@Benchmark
AtomicLong iteratorTest() { (5)
final AtomicLong result = new AtomicLong()
final Iterator<Integer> iterator = numbers.iterator()
while (iterator.hasNext()) {
result.addAndGet(iterator.next())
}
return result
}
@Benchmark
AtomicLong java8ForEachWithClosureTest() { (6)
final AtomicLong result = new AtomicLong()
numbers.forEach { result.addAndGet((int) it) }
return result
}
@Benchmark
AtomicLong java8ForEachWithAnonymousClassTest() { (7)
final AtomicLong result = new AtomicLong()
numbers.forEach(new Consumer<Integer>() {
@Override
void accept(Integer number) {
result.addAndGet(number)
}
})
return result
}
}
1 | List of numbers from 0 to 10 millions |
2 | Groovy each {} test case |
3 | Old Java for-each loop test case |
4 | Old Java for-loop test case |
5 | Old Java iterator test case |
6 | Java 8 forEach() test case with closure in place of a lambda expression |
7 | Java 8 forEach() test case with an anonymous class in place of lambda expression |
Above example can be cloned from icon: wololock/groovy-jmh |
The results
We are ready to execute the test using Gradle:
./gradlew jmh
And after about 7 minutes we can take a look at the results:
Benchmark Mode Cnt Score Error Units
GroovyBench.eachTest avgt 120 652,584 ± 2,064 ms/op
GroovyBench.forEachTest avgt 120 221,790 ± 1,675 ms/op
GroovyBench.forLoopTest avgt 120 533,534 ± 2,521 ms/op
GroovyBench.iteratorTest avgt 120 369,492 ± 0,930 ms/op
GroovyBench.java8ForEachWithAnonymousClassTest avgt 120 248,371 ± 2,803 ms/op
GroovyBench.java8ForEachWithClosureTest avgt 120 785,309 ± 3,096 ms/op
The benchmarks shows clearly that using Groovy
each
with a closure is almost three times slower than good old Java for-each loop (653
ms versus222
ms).Java for-each and Java 8
forEach
with anonymous class are pretty close -222
ms versus248
ms.The slowest variant was Java 8
forEach
with a closure in place of a lambda expression - it took785
ms to execute (133
ms more than Groovyeach
).
No matter which variant won in this run it still feels like there is something wrong - iterating collection of 10 millions integers took 304
milliseconds at best, which is still quite slow. The reason of that is because we were testing Groovy’s dynamic method invocation which comes with some overhead. Let’s turn on static compilation and see how it works.
Laptop specs: JDK 1.8.0_162 (Java HotSpot™ 64-Bit Server VM, 25.162-b12), Groovy 2.4.12, Intel® Core™ i7-4900MQ CPU @ 2.80GHz (4 cores, cache size 8192 KB), 16 GB RAM, OS: Fedora 26 (64 bit) |
Full log can be found here: c4039cc75a359660b11f89bc8abd6629
The improvement: static compilation
Without further ado, let’s add @CompileStatic
and @TypeChekced
annotations to our GroovyBench
class:
package bench
import groovy.transform.CompileStatic
import groovy.transform.TypeChecked
import org.openjdk.jmh.annotations.Benchmark
import org.openjdk.jmh.annotations.Scope
import org.openjdk.jmh.annotations.State
import java.util.concurrent.atomic.AtomicLong
import java.util.function.Consumer
@State(Scope.Benchmark)
@CompileStatic (1)
@TypeChecked (2)
class GroovyBench {
final List<Integer> numbers = 0..10_000_000
// the same benchmark methods
}
Let’s run ./gradlew clean jmh
and see the results:
Benchmark Mode Cnt Score Error Units
GroovyBench.eachTest avgt 120 91,897 ± 0,346 ms/op
GroovyBench.forEachTest avgt 120 96,422 ± 0,550 ms/op
GroovyBench.forLoopTest avgt 120 139,119 ± 0,723 ms/op
GroovyBench.iteratorTest avgt 120 103,568 ± 0,648 ms/op
GroovyBench.java8ForEachWithAnonymousClassTest avgt 120 102,460 ± 2,473 ms/op
GroovyBench.java8ForEachWithClosureTest avgt 120 400,481 ± 1,036 ms/op
As you can see, enabling static compilation was a game changer! To sum it up:
Groovy
each
recorded the best result -91.897
ms (previously:652.584
ms)The second best result belongs to Java for-each -
96.422
ms (previously:221.790
ms)Java 8
forEach
and iterator recorded almost the same result -102.460
ms and103.568
ms accordingly.And again the slowest result belongs to Java 8
forEach
with a closure -400.481
ms (previously: `785.309 ` ms)
As you can see Groovy each
method with a closure can be faster than other variants when static compilation is enabled. But is it always like that? It depends.
Full log can be found here: 161aae90bcdaabd0fe6144f5339d1727
Small collection size
Let’s run the same benchmark, but this time let’s limit the numbers of elements in the input list from 10 millions to 10 thousands and see what the results are.
Benchmark Mode Cnt Score Error Units
GroovyBench.eachTest avgt 120 0,087 ± 0,001 ms/op
GroovyBench.forEachTest avgt 120 0,079 ± 0,001 ms/op
GroovyBench.forLoopTest avgt 120 0,157 ± 0,002 ms/op
GroovyBench.iteratorTest avgt 120 0,079 ± 0,002 ms/op
GroovyBench.java8ForEachWithAnonymousClassTest avgt 120 0,085 ± 0,001 ms/op
GroovyBench.java8ForEachWithClosureTest avgt 120 0,402 ± 0,002 ms/op
Things got change as you can see.
For 10k size collection the best result gave Java for-each and iterator -
0.079
ms average.Java 8
forEach
with anonymous class was only0.006
ms slower -0.085
ms average.Groovy
each
was only0.008
ms slower than the best result -0.087
ms average.And again the slowest variant was Java 8
forEach
with a closure -0.402
ms average.
Even though Groovy each
didn’t record the best result this time, it is still very close to the fastest variant.
Full log can be found here: 3a9b1e169c58abbfb4e067aa69b9bfc0
Benchmarking Java
Before we close this article, let’s take a quick look at the Java benchmark results to get a better understanding how Groovy efficiency differs from Java. Below you can find results of Java benchmark test for two variants - old Java for-each and iterator:
Benchmark Mode Cnt Score Error Units
JavaBench.javaForEach avgt 120 8,839 ± 0,011 ms/op
JavaBench.javaIteratorTest avgt 120 8,865 ± 0,011 ms/op
Let’s compare the results:
Groovy for-each took in average
96.422
ms while Java did the same job in approximately8.839
ms.Groovy iterator test took in average
103.568
ms while Java did the same job in approximately8.865
ms.
Full log can be found here: c895114949be2820b0fada72df099fcf
Conclusion
Now it is the good time to draw a conclusion.
When you program in statically compiled Groovy there is no difference if you use
each
with a closure or you stick to old for-each constructions that don’t require creating a closure.When you program in a dynamic Groovy and you need more efficient iteration algorithm - extract code to a statically compiled class, otherwise you will lost a lot of milliseconds in case of a huge collections.
When you can choose between Groovy and Java and you need blazing fast solution - pick Java.
And remember that "premature optimization is root of all evil" [2] - before you start refactoring your iteration code make sure that switching from one variant to another will give you a real boost. Your application most probably spends most of the time on I/O (e.g. loading data from the database) and saving a millisecond here or there might not be worth the effort.
I hope you have enjoyed reading this blog post. Feel free to leave a comment in the section below, I would love to hear your opinion. Until the next time!
DefaultGroovyMethods.each(Object self, Closure cl)
method comes from September 11th 2003
0 Comments