Groovy: Working with Collections - Part 2

Abstract

Groovy is a dynamic language for the Java platform. It provides extensive support for working with collections, including native syntax for list and map literals. In this article, I will explore the options for working with collections effectively. We will cover various collection types, internal iterators, map-reduce methods, method chaining, and the Java 8 streams API. We will also see how Java collections gain extra power when they enter the Groovy world.

Sorting

Let's take a look at a simple case of sorting numbers.

def numbers = [2, 3, 1, 4, 5]
def numbersAscending = numbers.sort()
println numbersAscending // [1, 2, 3, 4, 5]
println numbers // [1, 2, 3, 4, 5]

Well, you could achieve that with plain Java too. My intention here is to point out what is not so good about this code and show what Groovy offers to make it better.

If you observe the output of the above code, invoking the sort method returned a list with the numbers in ascending order. Unfortunately, it also modified the original list, so you end up losing the original ordering. When Java lists enter the world of Groovy, however, they get another flavour of the sort method that accepts a boolean parameter. If you pass false, a new list is created for you and the original is left untouched.

def numbers = [2, 3, 1, 4, 5]
def numbersAscending = numbers.sort(false)
println numbersAscending // [1, 2, 3, 4, 5]
println numbers // [2, 3, 1, 4, 5]

Let's move on to the slightly more complex task of sorting objects. Consider the following example:

import groovy.transform.ToString

@ToString
class Person implements Comparable<Person>{
	String name
	int age
	
	int compareTo(Person another){
		name <=> another.name
	}
}

def people = [
	new Person(name: 'Mark', age: 30),
	new Person(name: 'Raj', age: 25),
	new Person(name: 'Ajay', age: 35),
	new Person(name: 'Mark', age: 20)
]

println people.sort(false)
// [Person(Ajay, 35), Person(Mark, 30), Person(Mark, 20), Person(Raj, 25)]

The sort method relies on the Comparable implementation to determine the order. In this case, we chose name as the default ordering, as the compareTo method shows. To order Person objects by age instead, we need a separate comparator, which in Groovy can be a two-argument closure. We will use the spaceship operator <=> to keep the comparator short.

def ageComparator = { Person one, Person another ->
	one.age <=> another.age
}
println people.sort(false, ageComparator)
// [Person(Mark, 20), Person(Raj, 25), Person(Mark, 30), Person(Ajay, 35)]
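
When the ordering depends on a single property, the closure passed to sort can also take just one argument and return the value to sort by; the GDK then compares those values for you. The following is a minimal sketch of that variant. It uses plain maps in place of the Person class so that it runs on its own, and the name byAge is only illustrative.

```groovy
// Plain maps stand in for Person objects (an assumption made for brevity)
def people = [
    [name: 'Mark', age: 30],
    [name: 'Raj', age: 25],
    [name: 'Ajay', age: 35],
    [name: 'Mark', age: 20]
]

// A one-argument closure acts as a key extractor rather than a comparator
def byAge = people.sort(false) { it.age }

println byAge*.name   // [Mark, Raj, Mark, Ajay]
println people*.age   // [30, 25, 35, 20] (the original list is unchanged)
```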

Now let's order Person objects by name and then by age, which means that if two people have the same name, the younger person should appear first. We will achieve this by building another comparator.

def nameAndAgeComparator = { Person one, Person another ->
	// Return the first non-zero comparison; fall back to 0 when every field is equal
	[{it.name}, {it.age}].findResult(0) { fieldExtractor ->
		fieldExtractor(one) <=> fieldExtractor(another) ?: null
	}
}
println people.sort(false, nameAndAgeComparator)
// [Person(Ajay, 35), Person(Mark, 20), Person(Mark, 30), Person(Raj, 25)]
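
As an alternative sketch of the same idea, the groovy.util.OrderBy class builds a comparator from a list of closures and applies them in order, moving on to the next closure only when the previous one reports a tie. Plain maps again stand in for Person objects here, and the variable names are illustrative rather than part of the original example.

```groovy
// Plain maps stand in for Person objects (an assumption made for brevity)
def people = [
    [name: 'Mark', age: 30],
    [name: 'Raj', age: 25],
    [name: 'Ajay', age: 35],
    [name: 'Mark', age: 20]
]

// OrderBy compares by name first, then by age when the names are equal
def nameThenAge = new OrderBy([{ it.name }, { it.age }])
def sorted = people.sort(false, nameThenAge)

println sorted.collect { "${it.name} ${it.age}" }
// [Ajay 35, Mark 20, Mark 30, Raj 25]
```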

If you are using Groovy 2.3 or above, this can be simpler. Applying the groovy.transform.Sortable annotation to the class gives you a Comparable implementation automatically.

import groovy.transform.*

@ToString
@Sortable
class Person {
	String name
	int age
}

def people = [
	new Person(name: 'Mark', age: 30),
	new Person(name: 'Raj', age: 25),
	new Person(name: 'Ajay', age: 35),
	new Person(name: 'Mark', age: 20)
]

println people.sort(false)
// [Person(Ajay, 35), Person(Mark, 20), Person(Mark, 30), Person(Raj, 25)]

Since we declared name first and then age, sorting follows that order. Had we declared age before name, the generated compareTo would consider age first and then name. You can influence the Comparable implementation by specifying the includes or excludes attributes of the Sortable annotation.

Additionally, if you want to sort by age alone, the Sortable AST transformation generates a comparator for that as well. Since the Person class defines name and age, you get the static methods comparatorByName and comparatorByAge, which return comparators for the name and age properties respectively.

println people.sort(false, Person.comparatorByAge())
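
To illustrate the includes attribute mentioned earlier, here is a small sketch in which the order of the names given in includes, rather than the declaration order, decides the priority. Member is a hypothetical class introduced only for this illustration, and I have added a fifth entry so that two members share an age.

```groovy
import groovy.transform.Sortable
import groovy.transform.ToString

// Member is a hypothetical class used only for this illustration
@ToString
@Sortable(includes = ['age', 'name'])
class Member {
    String name
    int age
}

def members = [
    new Member(name: 'Mark', age: 30),
    new Member(name: 'Raj', age: 25),
    new Member(name: 'Ajay', age: 35),
    new Member(name: 'Mark', age: 20),
    new Member(name: 'Anil', age: 25)
]

// Age is compared first; name breaks ties between members of the same age
def sortedMembers = members.sort(false)
println sortedMembers
// [Member(Mark, 20), Member(Anil, 25), Member(Raj, 25), Member(Mark, 30), Member(Ajay, 35)]
```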

Laziness with streams API

Consider the following code example to find the first two even numbers from a list of numbers.

def numbers = [ 3, 5, 2, 1, 6, 8, 4]

def isEven = { number ->
	println "Checking if $number is even"
	number % 2 == 0
}
println numbers.findAll(isEven).take(2)

Output:

Checking if 3 is even
Checking if 5 is even
Checking if 2 is even
Checking if 1 is even
Checking if 6 is even
Checking if 8 is even
Checking if 4 is even
[2, 6]

Looking at the output, one can see that by favouring modularity we sacrificed performance. There was no need to check whether 8 and 4 are even, because we already had the first two even numbers.

Java 8 provides a streams API, which is handy in such situations. Groovy plays well with it: wherever you would pass a lambda expression (or any functional interface object) in Java, you can supply a closure.

println numbers.stream().filter(isEven).limit(2)
	.collect(java.util.stream.Collectors.toList())

Output:

Checking if 3 is even
Checking if 5 is even
Checking if 2 is even
Checking if 1 is even
Checking if 6 is even
[2, 6]

Now we have achieved efficiency, without sacrificing modularity.
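
The difference can also be measured rather than read off the console. The following sketch counts how many times the predicate runs in each approach; the counter variables are my own additions for illustration, and the snippet assumes Java 8 or later.

```groovy
import java.util.stream.Collectors

def numbers = [3, 5, 2, 1, 6, 8, 4]

// Eager: findAll tests every element before take(2) is applied
int eagerChecks = 0
def eagerResult = numbers.findAll { eagerChecks++; it % 2 == 0 }.take(2)

// Lazy: the stream stops pulling elements once limit(2) is satisfied
int lazyChecks = 0
def lazyResult = numbers.stream()
    .filter { lazyChecks++; it % 2 == 0 }
    .limit(2)
    .collect(Collectors.toList())

println "eager: $eagerResult after $eagerChecks checks"   // eager: [2, 6] after 7 checks
println "lazy: $lazyResult after $lazyChecks checks"      // lazy: [2, 6] after 5 checks
```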

Grouping

Grouping is a very common requirement in business applications. Suppose you have a list of numbers that you wish to classify as even or odd. The GDK provides the groupBy method, which returns a map.

def numbers = [1, 2, 3, 4]
def numberGroups = numbers.groupBy { it % 2 }
println numberGroups // [1:[1, 3], 0:[2, 4]]
println "Even numbers " + numberGroups[0]
// Even numbers [2, 4]
println "Odd numbers " + numberGroups[1]
// Odd numbers [1, 3]

Since the mod 2 operation results in either 0 or 1, the returned map has two keys: 0 and 1.
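
The keys are simply whatever the closure returns, so they need not be numbers. As a small sketch, the closure below returns a descriptive label instead (the labels 'even' and 'odd' are my own choice):

```groovy
def numbers = [1, 2, 3, 4]

// The closure's return value becomes the map key for each element
def labelledGroups = numbers.groupBy { it % 2 == 0 ? 'even' : 'odd' }

println labelledGroups        // [odd:[1, 3], even:[2, 4]]
println labelledGroups.even   // [2, 4]
```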

It is interesting to note that the grouping operation can scale to multiple levels. Take a look at the following example, where we group retail outlets by state and then by city.

class Outlet{
	String name
	String city
	String state
	
	String toString(){ name }
}

def outlets = [
	new Outlet(name: 'Outlet1', city: 'Bengaluru', state: 'KA'),
	new Outlet(name: 'Outlet2', city: 'Mumbai', state: 'MH'),
	new Outlet(name: 'Outlet3', city: 'Mangalore', state: 'KA'),
	new Outlet(name: 'Outlet4', city: 'Bengaluru', state: 'KA')
]

def outletGroup = outlets.groupBy({it.state}, {it.city})
println outletGroup
// [KA:[Bengaluru:[Outlet1, Outlet4], Mangalore:[Outlet3]], MH:[Mumbai:[Outlet2]]]
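
The result is a map of maps, so each level can be navigated by key. The sketch below shows one way to drill into it; it uses plain maps in place of the Outlet class so that it runs on its own, and the variable names are illustrative.

```groovy
// Plain maps stand in for Outlet objects (an assumption made for brevity)
def outlets = [
    [name: 'Outlet1', city: 'Bengaluru', state: 'KA'],
    [name: 'Outlet2', city: 'Mumbai', state: 'MH'],
    [name: 'Outlet3', city: 'Mangalore', state: 'KA'],
    [name: 'Outlet4', city: 'Bengaluru', state: 'KA']
]

def byStateAndCity = outlets.groupBy({ it.state }, { it.city })

// First level: state, second level: city, leaf: list of outlets
def bengaluruOutlets = byStateAndCity['KA']['Bengaluru']*.name
def citiesInKA = byStateAndCity['KA'].keySet() as List

println bengaluruOutlets   // [Outlet1, Outlet4]
println citiesInKA         // [Bengaluru, Mangalore]
```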

List to Map

A map is quite efficient when you want to look up an object by key. Using a list in such a case degrades the lookup from constant time to linear time. A common use case is an ORM layer that returns a list of objects, which you want to convert to a map. The GDK method collectEntries can be used to achieve this.

class Employee{
	String employeeNumber
	String name
}

def employees = [
	new Employee(employeeNumber: '101', name: 'Raj'),
	new Employee(employeeNumber: '102', name: 'Reema'),
	new Employee(employeeNumber: '103', name: 'Anil')
]

def employeeByNumber = employees.collectEntries {
	[it.employeeNumber, it]
}
println employeeByNumber
// [101:Employee@6e20b53a, 102:Employee@71809907, 103:Employee@3ce1e309]
println employeeByNumber."102".name
// Reema
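
The closure given to collectEntries may also return a one-entry map instead of a two-element list, which some readers find clearer. The sketch below uses that form and contrasts the map lookup with a linear scan over the list; plain maps stand in for Employee objects, and the variable names are my own.

```groovy
// Plain maps stand in for Employee objects (an assumption made for brevity)
def employees = [
    [employeeNumber: '101', name: 'Raj'],
    [employeeNumber: '102', name: 'Reema'],
    [employeeNumber: '103', name: 'Anil']
]

// Parentheses around the key make Groovy evaluate it instead of using the literal text
def nameByNumber = employees.collectEntries { [(it.employeeNumber): it.name] }

// Linear scan over the list versus a direct lookup in the map
def viaList = employees.find { it.employeeNumber == '102' }.name
def viaMap = nameByNumber['102']

println nameByNumber   // [101:Raj, 102:Reema, 103:Anil]
println viaList        // Reema
println viaMap         // Reema
```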

Splitting into sub lists and Combining

Suppose you have a list of numbers and want to perform some operation on each of them through a web service. The web service can handle only a few numbers at a time, so you need to split the list into smaller sublists, invoke the web service for each sublist, and finally merge the results into a single list. The following example shows how simple that is in Groovy. Instead of calling a web service, I apply the transformation locally. I use the collate method to split a list into sublists and flatten to combine multiple lists into one.

def numbers = 1..10

def batches = numbers.collate(3)
println batches
// [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]

def doubledBatch = batches.collect { it.collect { it * 2}}
println doubledBatch
// [[2, 4, 6], [8, 10, 12], [14, 16, 18], [20]]

def doubledNumbers = doubledBatch.flatten()
println doubledNumbers
// [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

Note that flatten can take a list nested to any depth and convert it into a flat list.
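
A short sketch of both points: flatten applied to a deeply nested list, and collectMany as one possible way to merge the transform and combine steps into a single call. The variable names are illustrative.

```groovy
// flatten removes every level of nesting, not just the outermost one
def nested = [1, [2, [3, [4]]], 5]
println nested.flatten()
// [1, 2, 3, 4, 5]

// collectMany transforms each batch and concatenates the resulting lists
def batches = (1..10).collate(3)
def doubled = batches.collectMany { batch -> batch.collect { it * 2 } }
println doubled
// [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
```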

Conclusion

We have seen how Groovy provides out-of-the-box solutions for most of the common coding tasks that deal with collections.