If you use Groovy for scripting or similar tasks, you have probably faced a situation where you receive input as text and need to process it, for example split it by some delimiter and continue working with the extracted values. In this post I will show you how to do it in three different ways.

Preparation

Let’s start by defining the input data and the expected result. We will use the following simple text input:

1;Joe Doe;[email protected]
2;Paul Doe;[email protected]
3;Mark Doe
4;Clark Doe;[email protected];2

This is a CSV-like input. We will iterate over each line, split it by ; and generate output similar to:

id: 1, name: Joe Doe, email: [email protected], sibling: null
id: 2, name: Paul Doe, email: [email protected], sibling: null
id: 3, name: Mark Doe, email: null, sibling: null
id: 4, name: Clark Doe, email: [email protected], sibling: 2
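
Before looking at the three variants, it may help to see the splitting step on its own. The following is a minimal sketch, assuming tokenize(';') (the method the examples below rely on). The line with an empty field is a made-up input that is not part of the data above; it is included only to illustrate that tokenize skips empty tokens, which matters if your real data can contain them.

```groovy
// tokenize(';') returns a List of String values, one per non-empty field
def shortLine = '3;Mark Doe'.tokenize(';')
assert shortLine == ['3', 'Mark Doe']
assert shortLine.size() == 2

// Assumption for illustration: a hypothetical line with an empty second field.
// tokenize() drops the empty token, so later fields shift one position left,
// while String.split() keeps the empty string in place.
def gap = '5;;someone'
def tokenized = gap.tokenize(';')
def splitted = gap.split(';').toList()

assert tokenized == ['5', 'someone']
assert splitted == ['5', '', 'someone']
```

For the sample input in this post the difference does not show up, because no line has an empty field in the middle.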

Ex. 1: Use List.get(int index) to extract values

This is the most Java-like way to do it. The List.get(int index) method has one significant drawback: it throws IndexOutOfBoundsException when we try to get a value at a non-existent index. In our case only line 4 contains all 4 expected values, so in all other cases we have to be careful and prevent this exception from being thrown.

def text = '''1;Joe Doe;[email protected]
2;Paul Doe;[email protected]
3;Mark Doe
4;Clark Doe;[email protected];2
'''

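text.eachLine { line ->
    def arr = line.tokenize(';')

    // get() throws for a missing index, so check the size before every access
    def id      = arr.size() > 0 ? arr.get(0) : null
    def name    = arr.size() > 1 ? arr.get(1) : null
    def email   = arr.size() > 2 ? arr.get(2) : null
    def sibling = arr.size() > 3 ? arr.get(3) : null

    println "id: ${id}, name: ${name}, email: ${email}, sibling: ${sibling}"
}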
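
To see why the size checks above are needed, here is a small sketch that calls get() on a two-field line without any guard. The variable names (arr, thrown) are only illustrative.

```groovy
// A line with only two fields, like "3;Mark Doe" from the input above
def arr = '3;Mark Doe'.tokenize(';')

assert arr.size() == 2
assert arr.get(1) == 'Mark Doe'

// Asking for the third field with get() fails instead of returning null
def thrown = null
try {
    arr.get(2)
} catch (IndexOutOfBoundsException e) {
    thrown = e
}

assert thrown != null
println "get(2) failed with: ${thrown.class.simpleName}"
```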

Ex. 2: Use Groovy subscript operator

The previous example is verbose: every access needs its own size check. Luckily, Groovy overloads the subscript operator for lists, and it makes expressions like arr[4] safe from IndexOutOfBoundsException: an index past the end of the list simply returns null. Thanks to this feature we can simplify the previous example to:

def text = '''1;Joe Doe;[email protected]
2;Paul Doe;[email protected]
3;Mark Doe
4;Clark Doe;[email protected];2
'''

text.eachLine { line ->
    def arr = line.tokenize(';')

    println "id: ${arr[0]}, name: ${arr[1]}, email: ${arr[2]}, sibling: ${arr[3]}"
}
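
The behaviour of the subscript operator can be checked in isolation. This is a minimal sketch; the 'n/a' default is a hypothetical value chosen for illustration, not something the original output uses.

```groovy
def arr = '3;Mark Doe'.tokenize(';')

// On a List the subscript operator is backed by getAt(),
// which returns null for an index past the end of the list
assert arr[1] == 'Mark Doe'
assert arr[2] == null
assert arr.getAt(3) == null

// Because a missing field is just null, the Elvis operator can supply a default
// ('n/a' is an assumed placeholder)
def email = arr[2] ?: 'n/a'
assert email == 'n/a'

println "id: ${arr[0]}, name: ${arr[1]}, email: ${email}"
```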

Ex. 3: Use Groovy multiple assignment feature

There is an even more Groovy way to get this job done: the multiple assignment feature. It lets us forget that tokenize produces a list. We can assign the result of this operation directly to named variables, and Groovy will assign null to any variable that has no corresponding value.

def text = '''1;Joe Doe;[email protected]
2;Paul Doe;[email protected]
3;Mark Doe
4;Clark Doe;[email protected];2
'''

text.eachLine { line ->
    def (id, name, email, sibling) = line.tokenize(';')

    println "id: ${id}, name: ${name}, email: ${email}, sibling: ${sibling}"
}
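
Multiple assignment is forgiving in both directions, which the sketch below tries to show. The second input line and the variable names first and second are made up for illustration.

```groovy
// Fewer values than variables: the remaining variables become null
def (id, name, email, sibling) = '3;Mark Doe'.tokenize(';')
assert id == '3'          // note: still a String, not a number
assert name == 'Mark Doe'
assert email == null
assert sibling == null

// More values than variables: the extra values are silently ignored
// (hypothetical four-field line, only two variables declared)
def (first, second) = '4;Clark Doe;extra;2'.tokenize(';')
assert first == '4'
assert second == 'Clark Doe'
```

Keep in mind that every extracted value is a String, so a field such as id needs an explicit conversion (for example id.toInteger()) before it can be used as a number.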

Szymon Stepniak

Groovista, Upwork's Top Rated freelancer, Toruń Java User Group founder, open source contributor, Stack Overflow addict, bedroom guitar player. I walk through e.printStackTrace() so you don't have to.