Groovy: Working with Collections - Part 2
Abstract
Groovy is a dynamic language for the Java platform. It provides extensive support for working with collections, including native syntax for list and map literals. In this article, I will explore the options for working with collections effectively. We will look at various collection types, internal iterators, map-reduce methods, and method chaining, through to using the Java 8 Streams API. We will also see how Java collections gain extra power when they enter the Groovy world.
Sorting
Let's take a look at a simple case of sorting numbers.
def numbers = [2, 3, 1, 4, 5]
def numbersAscending = numbers.sort()
println numbersAscending // [1, 2, 3, 4, 5]
println numbers // [1, 2, 3, 4, 5]
Well, you could achieve that with plain Java too. My intention here is to point out what is not so good about this code and show what Groovy offers to make it better.
As the output shows, invoking the sort method returned a list with the numbers in ascending order. Unfortunately, it also modified the original list, so you end up losing the original ordering. When Java lists enter the world of Groovy, however, they get another flavour of the sort method that accepts a boolean parameter. If you pass false, a new sorted list is created and the original is left untouched.
def numbers = [2, 3, 1, 4, 5]
def numbersAscending = numbers.sort(false)
println numbersAscending // [1, 2, 3, 4, 5]
println numbers // [2, 3, 1, 4, 5]
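As a complement to sort(false), here is a minimal sketch of the GDK method toSorted, which I am assuming is available because it was added in Groovy 2.4. It never modifies the receiver, and it also accepts a two-parameter closure as a comparator. The variable names are only illustrative.

```groovy
def numbers = [2, 3, 1, 4, 5]

// toSorted() returns a new list and leaves the receiver untouched
def ascending = numbers.toSorted()

// A two-parameter closure acts as a comparator; swapping the operands reverses the order
def descending = numbers.toSorted { a, b -> b <=> a }

println ascending   // [1, 2, 3, 4, 5]
println descending  // [5, 4, 3, 2, 1]
println numbers     // [2, 3, 1, 4, 5]
```

Compared with sort(false), the intent is visible in the method name, which may make accidental mutation less likely.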
Let's move to the slightly more complex task of sorting objects. Consider the following example:
import groovy.transform.ToString
@ToString
class Person implements Comparable<Person> {
    String name
    int age
    int compareTo(Person another) {
        name <=> another.name
    }
}
def people = [
    new Person(name: 'Mark', age: 30),
    new Person(name: 'Raj', age: 25),
    new Person(name: 'Ajay', age: 35),
    new Person(name: 'Mark', age: 20)
]
println people.sort(false)
// [Person(Ajay, 35), Person(Mark, 30), Person(Mark, 20), Person(Raj, 25)]
The sort method relies on the Comparable implementation to determine the order. In this case, we made name the default ordering, as the compareTo method shows. To order Person objects by age instead, we need a corresponding comparator. In Groovy, a closure with two parameters can play that role, and the spaceship operator <=> keeps the implementation short.
def ageComparator = { Person one, Person another ->
one.age <=> another.age
}
println people.sort(false, ageComparator)
// [Person(Mark, 20), Person(Raj, 25), Person(Mark, 30), Person(Ajay, 35)]
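When the ordering depends on a single property, a one-parameter closure can be passed to sort instead of a full comparator. The GDK then sorts by the value the closure returns for each element. The sketch below repeats a trimmed-down Person class (without Comparable) so that it runs on its own; byAge is just an illustrative name.

```groovy
import groovy.transform.ToString

@ToString
class Person {
    String name
    int age
}

def people = [
    new Person(name: 'Mark', age: 30),
    new Person(name: 'Raj', age: 25),
    new Person(name: 'Ajay', age: 35),
    new Person(name: 'Mark', age: 20)
]

// A closure with one parameter is treated as a key extractor, not as a comparator
def byAge = people.sort(false) { it.age }

println byAge
// [Person(Mark, 20), Person(Raj, 25), Person(Mark, 30), Person(Ajay, 35)]
```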
Now let's order Person objects by name and then by age: if two people have the same name, the younger one should appear first. We will achieve this by building another comparator.
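def nameAndAgeComparator = { Person one, Person another ->
    // Compare field by field; the first non-zero result decides the order.
    // The default of 0 covers the case where every field is equal.
    [{ it.name }, { it.age }].findResult(0) { fieldExtractor ->
        fieldExtractor(one) <=> fieldExtractor(another) ?: null
    }
}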
println people.sort(false, nameAndAgeComparator)
// [Person(Ajay, 35), Person(Mark, 20), Person(Mark, 30), Person(Raj, 25)]
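The same multi-field ordering can probably be expressed more compactly with groovy.util.OrderBy, a Comparator that ships with Groovy and is built from a list of closures. This is a sketch of an alternative, not the approach used above; the Person class is repeated without Comparable so that the snippet is self-contained, and byNameThenAge is an illustrative name.

```groovy
import groovy.transform.ToString

@ToString
class Person {
    String name
    int age
}

def people = [
    new Person(name: 'Mark', age: 30),
    new Person(name: 'Raj', age: 25),
    new Person(name: 'Ajay', age: 35),
    new Person(name: 'Mark', age: 20)
]

// OrderBy compares by the first closure's value and falls back to the next one on a tie
def byNameThenAge = new OrderBy([{ it.name }, { it.age }])

println people.sort(false, byNameThenAge)
// [Person(Ajay, 35), Person(Mark, 20), Person(Mark, 30), Person(Raj, 25)]
```

OrderBy returns 0 when every closure yields equal values, so no special handling is needed for identical objects.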
If you are using Groovy 2.3 or above, this can be simpler. Applying the groovy.transform.Sortable annotation to the class gives you a Comparable implementation automatically.
import groovy.transform.*
@ToString
@Sortable
class Person {
    String name
    int age
}
def people = [
    new Person(name: 'Mark', age: 30),
    new Person(name: 'Raj', age: 25),
    new Person(name: 'Ajay', age: 35),
    new Person(name: 'Mark', age: 20)
]
println people.sort(false)
// [Person(Ajay, 35), Person(Mark, 20), Person(Mark, 30), Person(Raj, 25)]
Since we declared name first and then age, sorting follows that order. Had we declared age before name, the auto-generated compareTo would consider age first and then name. You can influence the generated Comparable implementation by specifying the includes or excludes attributes of the Sortable annotation.
Additionally, if you want to sort by age alone, the Sortable AST transformation generates a comparator for that as well. Since the Person class defines name and age, you get the static methods comparatorByName and comparatorByAge, which return comparators for the name and age properties respectively.
println people.sort(false, Person.comparatorByAge())
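To illustrate the includes attribute mentioned above, here is a small sketch in which only age takes part in the natural ordering. The choice of includes = ['age'] is my own, made purely for illustration; the generated comparatorByAge is also reused with the GDK max method.

```groovy
import groovy.transform.Sortable
import groovy.transform.ToString

@ToString
@Sortable(includes = ['age'])   // only age contributes to compareTo
class Person {
    String name
    int age
}

def people = [
    new Person(name: 'Mark', age: 30),
    new Person(name: 'Raj', age: 25),
    new Person(name: 'Ajay', age: 35),
    new Person(name: 'Mark', age: 20)
]

def byNaturalOrder = people.sort(false)
println byNaturalOrder
// [Person(Mark, 20), Person(Raj, 25), Person(Mark, 30), Person(Ajay, 35)]

// The generated comparator works with any method that accepts a Comparator
def oldest = people.max(Person.comparatorByAge())
println oldest
// Person(Ajay, 35)
```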
Laziness with streams API
Consider the following code example to find the first two even numbers from a list of numbers.
def numbers = [ 3, 5, 2, 1, 6, 8, 4]
def isEven = { number ->
    println "Checking if $number is even"
    number % 2 == 0
}
println numbers.findAll(isEven).take(2)
Output:
Checking if 3 is even
Checking if 5 is even
Checking if 2 is even
Checking if 1 is even
Checking if 6 is even
Checking if 8 is even
Checking if 4 is even
[2, 6]
Looking at the output, one can see that by favouring modularity we sacrificed performance. There was no need to check whether 8 and 4 are even, because we already had the first two even numbers.
Java 8 provides the Streams API, which is handy in such situations because it evaluates elements lazily. Groovy plays well with it: wherever Java code would pass a lambda expression (or an object implementing a functional interface), you can supply a closure.
println numbers.stream().filter(isEven).limit(2)
.collect(java.util.stream.Collectors.toList())
Output:
Checking if 3 is even
Checking if 5 is even
Checking if 2 is even
Checking if 1 is even
Checking if 6 is even
[2, 6]
Now we have achieved efficiency, without sacrificing modularity.
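The laziness carries through longer pipelines as well. The following sketch adds a map step and records which numbers were actually checked instead of printing them; checked and firstTwoEvenSquares are illustrative names, and a sequential stream on Java 8 or later is assumed.

```groovy
import java.util.stream.Collectors

def numbers = [3, 5, 2, 1, 6, 8, 4]
def checked = []   // remembers every number the predicate was asked about

def isEven = { number ->
    checked << number
    number % 2 == 0
}

// Closures are coerced to Predicate and Function at the call site
def firstTwoEvenSquares = numbers.stream()
    .filter(isEven)
    .map { it * it }
    .limit(2)
    .collect(Collectors.toList())

println firstTwoEvenSquares  // [4, 36]
println checked              // [3, 5, 2, 1, 6]
```

Each element flows through the whole pipeline before the next one is pulled, which is why 8 and 4 are never examined.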
Grouping
Grouping is a very common requirement in business applications. Suppose you have a list of numbers that you wish to classify as even or odd. The GDK provides the groupBy method, which returns a map from each key to the list of elements that produced it.
def numbers = [1, 2, 3, 4]
def numberGroups = numbers.groupBy { it % 2 }
println numberGroups // [1:[1, 3], 0:[2, 4]]
println "Even numbers " + numberGroups[0]
// Even numbers [2, 4]
println "Odd numbers " + numberGroups[1]
// Odd numbers [1, 3]
Since mod 2 operation will result in either 0 or 1, the returned map has 2 keys - 0 and 1.
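The keys do not have to be numbers: whatever the closure returns becomes the key. A small sketch with more readable keys follows, along with the related GDK method countBy for cases where only the group sizes matter. The names byParity and counts are illustrative.

```groovy
def numbers = [1, 2, 3, 4]

// The closure's return value is used as the map key
def byParity = numbers.groupBy { it % 2 == 0 ? 'even' : 'odd' }
println byParity       // [odd:[1, 3], even:[2, 4]]
println byParity.even  // [2, 4]

// countBy keeps only the size of each group
def counts = numbers.countBy { it % 2 == 0 ? 'even' : 'odd' }
println counts         // [odd:2, even:2]
```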
It is interesting to note that the grouping operation can scale to multiple levels. Take a look at the following example, where we group retail outlets by state and then by city.
class Outlet {
    String name
    String city
    String state
    String toString() { name }
}
def outlets = [
    new Outlet(name: 'Outlet1', city: 'Bengaluru', state: 'KA'),
    new Outlet(name: 'Outlet2', city: 'Mumbai', state: 'MH'),
    new Outlet(name: 'Outlet3', city: 'Mangalore', state: 'KA'),
    new Outlet(name: 'Outlet4', city: 'Bengaluru', state: 'KA')
]
def outletGroup = outlets.groupBy({it.state}, {it.city})
println outletGroup
// [KA:[Bengaluru:[Outlet1, Outlet4], Mangalore:[Outlet3]], MH:[Mumbai:[Outlet2]]]
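The result is an ordinary map of maps, so it can be navigated with property notation and processed with the usual GDK methods. A minimal sketch follows; the Outlet class and data are repeated so that it runs on its own, and citiesPerState is an illustrative name.

```groovy
class Outlet {
    String name
    String city
    String state
    String toString() { name }
}

def outlets = [
    new Outlet(name: 'Outlet1', city: 'Bengaluru', state: 'KA'),
    new Outlet(name: 'Outlet2', city: 'Mumbai', state: 'MH'),
    new Outlet(name: 'Outlet3', city: 'Mangalore', state: 'KA'),
    new Outlet(name: 'Outlet4', city: 'Bengaluru', state: 'KA')
]

def outletGroup = outlets.groupBy({ it.state }, { it.city })

// Drill down one level at a time: state first, then city
def bengaluruOutlets = outletGroup.KA.Bengaluru
println bengaluruOutlets   // [Outlet1, Outlet4]

// Summarise the inner maps: which cities does each state have?
def citiesPerState = outletGroup.collectEntries { state, cities ->
    [state, cities.keySet().toList()]
}
println citiesPerState     // [KA:[Bengaluru, Mangalore], MH:[Mumbai]]
```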
List to Map
A map is efficient when you want to look up an object by key. Using a list for that purpose degrades the lookup from (typically) constant time to linear time. A common use case: your ORM layer returns a list of objects, which you want to convert to a map. The GDK method collectEntries can be used to achieve this. Its closure returns a two-element list holding the key and the value of each entry.
class Employee {
    String employeeNumber
    String name
}
def employees = [
    new Employee(employeeNumber: '101', name: 'Raj'),
    new Employee(employeeNumber: '102', name: 'Reema'),
    new Employee(employeeNumber: '103', name: 'Anil')
]
def employeeByNumber = employees.collectEntries {
    [it.employeeNumber, it]
}
println employeeByNumber
// [101:Employee@6e20b53a, 102:Employee@71809907, 103:Employee@3ce1e309]
println employeeByNumber."102".name
// Reema
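The closure passed to collectEntries may also return a single-entry map instead of a two-element list, which some readers find clearer. The sketch below uses that form to build a number-to-name lookup; nameByNumber is an illustrative name, and the parentheses around the key are needed so that Groovy evaluates the expression rather than treating it as a literal string key.

```groovy
class Employee {
    String employeeNumber
    String name
}

def employees = [
    new Employee(employeeNumber: '101', name: 'Raj'),
    new Employee(employeeNumber: '102', name: 'Reema'),
    new Employee(employeeNumber: '103', name: 'Anil')
]

// Returning a one-entry map; (expr) makes the key dynamic
def nameByNumber = employees.collectEntries { [(it.employeeNumber): it.name] }

println nameByNumber          // [101:Raj, 102:Reema, 103:Anil]
println nameByNumber['102']   // Reema
println nameByNumber['999']   // null
```

If two elements produce the same key, the later one overwrites the earlier one, so this conversion suits keys that are unique.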
Splitting into sub lists and Combining
Suppose you have a list of numbers and you want to perform some operation on each of them through a web service, but the web service can handle only a few numbers at a time. You would need to split the list into smaller sublists, invoke the web service for each sublist, and finally merge the results into a single list. The following code example shows how simple that is in Groovy. Instead of calling a web service, I apply the transformation locally. The collate method splits a list into sublists of a given size, and flatten combines multiple lists into one.
def numbers = 1..10
def batches = numbers.collate(3)
println batches
// [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]
def doubledBatch = batches.collect { batch -> batch.collect { it * 2 } }
println doubledBatch
// [[2, 4, 6], [8, 10, 12], [14, 16, 18], [20]]
def doubledNumbers = doubledBatch.flatten()
println doubledNumbers
// [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
Note that flatten works on lists nested to any depth and converts them into a single flat list.
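A few variations on the same idea, as a sketch: collate accepts a second argument that controls whether the trailing partial batch is kept, collectMany merges the transform and flatten steps into one call, and flatten handles deeper nesting. All variable names are illustrative.

```groovy
def numbers = 1..10

// Passing false as the second argument drops the incomplete last batch
def fullBatchesOnly = numbers.collate(3, false)
println fullBatchesOnly  // [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

// collectMany transforms each batch and concatenates the results in one step
def doubled = numbers.collate(3).collectMany { batch -> batch.collect { it * 2 } }
println doubled          // [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

// flatten descends through every level of nesting
def flat = [1, [2, [3, [4]]]].flatten()
println flat             // [1, 2, 3, 4]
```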
Conclusion
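We have seen how Groovy provides out-of-the-box solutions for many of the common coding tasks that involve collections: sorting, lazy evaluation with streams, grouping, converting lists to maps, and splitting and combining lists.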