What is the most efficient way to iterate collection in Groovy? Let's play with JMH!
I guess you may heard about Groovy’s Collection.each(Closure cl) method - it was introduced 15 years ago [1] and it was a great alternative for a good old for-loop, for-each or even using an iterator approach. You may also heard, that you should not overuse it, because creating a closure to do such simple operation like collection iteration is an overhead. But what if I tell you that nothing could be further from the truth - Groovy’s each method may be faster than iterator or Java’s for-each. Sounds interesting? Enjoy the reading!
Preparation
Before we dig deeper, let’s define the initial conditions. We are going to measure an execution time of a few different iteration variants. Each variant will do the same simple operation - it will accumulate the sum of numbers from 0 to 10 millions. We will do it using AtomicLong object and in the most imperative way possible - it is not about finding the most efficient sum operation - it is about giving all variants a simple job to do, so we can focus on iteration process and draw a conclusion after all.
| We are going to use JMH benchmark tool and its Gradle plugin. |
What methods are we going to test? We will compare:
Groovy’s
eachmethodJava’s for-loop approach
Java’s for-each approach
Java’s iterator approach
Java 8
forEachmethod using closureJava 8
forEachmethod using anonymous class
This test uses Groovy 2.4 which does not support Java 8 lambda expressions - that is why we will use Java 8 forEach method with an anonymous class instead. |
The code
Let’s start with build.gradle file
buildscript {
repositories {
jcenter()
maven {
url "https://plugins.gradle.org/m2/"
}
}
dependencies {
classpath "me.champeau.gradle:jmh-gradle-plugin:0.4.7"
}
}
apply plugin: "idea"
apply plugin: "groovy"
apply plugin: "me.champeau.gradle.jmh"
repositories {
jcenter()
}
dependencies {
compile 'org.codehaus.groovy:groovy-all:2.4.12'
}
jmh {
include = ['bench\\.*'] (1)
benchmarkMode = ['avgt'] (2)
timeUnit = 'ms' (3)
iterations = 120 (4)
timeOnIteration = '1s' (5)
warmup = '60s' (6)
warmupIterations = 1 (7)
batchSize = 1
fork = 1
}| 1 | Load test classes from bench. package |
| 2 | Use average execution time measurement |
| 3 | Measure time in milliseconds |
| 4 | Execute 120 iterations |
| 5 | Each iteration is limited to 1 second |
| 6 | Warm-up for 60 seconds |
| 7 | Use a single iteration to warm-up |
And now let’s take a look at our benchmark test class:
package bench
import org.openjdk.jmh.annotations.Benchmark
import org.openjdk.jmh.annotations.Scope
import org.openjdk.jmh.annotations.State
import java.util.concurrent.atomic.AtomicLong
import java.util.function.Consumer
@State(Scope.Benchmark)
class GroovyIterationBenchmarks {
final List<Integer> numbers = 0..10_000_000 (1)
@Benchmark
AtomicLong eachTest() { (2)
final AtomicLong result = new AtomicLong()
numbers.each { result.addAndGet(it) }
return result
}
@Benchmark
AtomicLong forEachTest() { (3)
final AtomicLong result = new AtomicLong()
for (int number : numbers) {
result.addAndGet(number)
}
return result
}
@Benchmark
AtomicLong forLoopTest() { (4)
final AtomicLong result = new AtomicLong()
for (int i = 0; i < numbers.size(); i++) {
result.addAndGet(numbers.get(i))
}
return result
}
@Benchmark
AtomicLong iteratorTest() { (5)
final AtomicLong result = new AtomicLong()
final Iterator<Integer> iterator = numbers.iterator()
while (iterator.hasNext()) {
result.addAndGet(iterator.next())
}
return result
}
@Benchmark
AtomicLong java8ForEachWithClosureTest() { (6)
final AtomicLong result = new AtomicLong()
numbers.forEach { result.addAndGet((int) it) }
return result
}
@Benchmark
AtomicLong java8ForEachWithAnonymousClassTest() { (7)
final AtomicLong result = new AtomicLong()
numbers.forEach(new Consumer<Integer>() {
@Override
void accept(Integer number) {
result.addAndGet(number)
}
})
return result
}
}| 1 | List of numbers from 0 to 10 millions |
| 2 | Groovy each {} test case |
| 3 | Old Java for-each loop test case |
| 4 | Old Java for-loop test case |
| 5 | Old Java iterator test case |
| 6 | Java 8 forEach() test case with closure in place of a lambda expression |
| 7 | Java 8 forEach() test case with an anonymous class in place of lambda expression |
| Above example can be cloned from icon: wololock/groovy-jmh |
The results
We are ready to execute the test using Gradle:
./gradlew jmhAnd after about 7 minutes we can take a look at the results:
Benchmark Mode Cnt Score Error Units
GroovyBench.eachTest avgt 120 652,584 ± 2,064 ms/op
GroovyBench.forEachTest avgt 120 221,790 ± 1,675 ms/op
GroovyBench.forLoopTest avgt 120 533,534 ± 2,521 ms/op
GroovyBench.iteratorTest avgt 120 369,492 ± 0,930 ms/op
GroovyBench.java8ForEachWithAnonymousClassTest avgt 120 248,371 ± 2,803 ms/op
GroovyBench.java8ForEachWithClosureTest avgt 120 785,309 ± 3,096 ms/opThe benchmarks shows clearly that using Groovy
eachwith a closure is almost three times slower than good old Java for-each loop (653ms versus222ms).Java for-each and Java 8
forEachwith anonymous class are pretty close -222ms versus248ms.The slowest variant was Java 8
forEachwith a closure in place of a lambda expression - it took785ms to execute (133ms more than Groovyeach).
No matter which variant won in this run it still feels like there is something wrong - iterating collection of 10 millions integers took 304 milliseconds at best, which is still quite slow. The reason of that is because we were testing Groovy’s dynamic method invocation which comes with some overhead. Let’s turn on static compilation and see how it works.
| Laptop specs: JDK 1.8.0_162 (Java HotSpot™ 64-Bit Server VM, 25.162-b12), Groovy 2.4.12, Intel® Core™ i7-4900MQ CPU @ 2.80GHz (4 cores, cache size 8192 KB), 16 GB RAM, OS: Fedora 26 (64 bit) |
Full log can be found here: c4039cc75a359660b11f89bc8abd6629
The improvement: static compilation
Without further ado, let’s add @CompileStatic and @TypeChekced annotations to our GroovyBench class:
package bench
import groovy.transform.CompileStatic
import groovy.transform.TypeChecked
import org.openjdk.jmh.annotations.Benchmark
import org.openjdk.jmh.annotations.Scope
import org.openjdk.jmh.annotations.State
import java.util.concurrent.atomic.AtomicLong
import java.util.function.Consumer
@State(Scope.Benchmark)
@CompileStatic (1)
@TypeChecked (2)
class GroovyBench {
final List<Integer> numbers = 0..10_000_000
// the same benchmark methods
}Let’s run ./gradlew clean jmh and see the results:
Benchmark Mode Cnt Score Error Units
GroovyBench.eachTest avgt 120 91,897 ± 0,346 ms/op
GroovyBench.forEachTest avgt 120 96,422 ± 0,550 ms/op
GroovyBench.forLoopTest avgt 120 139,119 ± 0,723 ms/op
GroovyBench.iteratorTest avgt 120 103,568 ± 0,648 ms/op
GroovyBench.java8ForEachWithAnonymousClassTest avgt 120 102,460 ± 2,473 ms/op
GroovyBench.java8ForEachWithClosureTest avgt 120 400,481 ± 1,036 ms/opAs you can see, enabling static compilation was a game changer! To sum it up:
Groovy
eachrecorded the best result -91.897ms (previously:652.584ms)The second best result belongs to Java for-each -
96.422ms (previously:221.790ms)Java 8
forEachand iterator recorded almost the same result -102.460ms and103.568ms accordingly.And again the slowest result belongs to Java 8
forEachwith a closure -400.481ms (previously: `785.309 ` ms)
As you can see Groovy each method with a closure can be faster than other variants when static compilation is enabled. But is it always like that? It depends.
Full log can be found here: 161aae90bcdaabd0fe6144f5339d1727
Small collection size
Let’s run the same benchmark, but this time let’s limit the numbers of elements in the input list from 10 millions to 10 thousands and see what the results are.
Benchmark Mode Cnt Score Error Units
GroovyBench.eachTest avgt 120 0,087 ± 0,001 ms/op
GroovyBench.forEachTest avgt 120 0,079 ± 0,001 ms/op
GroovyBench.forLoopTest avgt 120 0,157 ± 0,002 ms/op
GroovyBench.iteratorTest avgt 120 0,079 ± 0,002 ms/op
GroovyBench.java8ForEachWithAnonymousClassTest avgt 120 0,085 ± 0,001 ms/op
GroovyBench.java8ForEachWithClosureTest avgt 120 0,402 ± 0,002 ms/opThings got change as you can see.
For 10k size collection the best result gave Java for-each and iterator -
0.079ms average.Java 8
forEachwith anonymous class was only0.006ms slower -0.085ms average.Groovy
eachwas only0.008ms slower than the best result -0.087ms average.And again the slowest variant was Java 8
forEachwith a closure -0.402ms average.
Even though Groovy each didn’t record the best result this time, it is still very close to the fastest variant.
Full log can be found here: 3a9b1e169c58abbfb4e067aa69b9bfc0
Benchmarking Java
Before we close this article, let’s take a quick look at the Java benchmark results to get a better understanding how Groovy efficiency differs from Java. Below you can find results of Java benchmark test for two variants - old Java for-each and iterator:
Benchmark Mode Cnt Score Error Units
JavaBench.javaForEach avgt 120 8,839 ± 0,011 ms/op
JavaBench.javaIteratorTest avgt 120 8,865 ± 0,011 ms/opLet’s compare the results:
Groovy for-each took in average
96.422ms while Java did the same job in approximately8.839ms.Groovy iterator test took in average
103.568ms while Java did the same job in approximately8.865ms.
Full log can be found here: c895114949be2820b0fada72df099fcf
Conclusion
Now it is the good time to draw a conclusion.
When you program in statically compiled Groovy there is no difference if you use
eachwith a closure or you stick to old for-each constructions that don’t require creating a closure.When you program in a dynamic Groovy and you need more efficient iteration algorithm - extract code to a statically compiled class, otherwise you will lost a lot of milliseconds in case of a huge collections.
When you can choose between Groovy and Java and you need blazing fast solution - pick Java.
And remember that "premature optimization is root of all evil" [2] - before you start refactoring your iteration code make sure that switching from one variant to another will give you a real boost. Your application most probably spends most of the time on I/O (e.g. loading data from the database) and saving a millisecond here or there might not be worth the effort.
I hope you have enjoyed reading this blog post. Feel free to leave a comment in the section below, I would love to hear your opinion. Until the next time!
DefaultGroovyMethods.each(Object self, Closure cl) method comes from September 11th 2003


0 Comments