
Calculating Average Salary by Department using Collectors.averagingInt and groupingBy

This snippet builds upon the previous example by demonstrating how to calculate the average salary for each department using Collectors.averagingInt in conjunction with Collectors.groupingBy.

Code Snippet

This snippet reuses the Employee class from the previous example and creates a list of Employee objects. The key part is this: employees.stream().collect(Collectors.groupingBy(Employee::getDepartment, Collectors.averagingInt(Employee::getSalary))). Here, Collectors.groupingBy takes two arguments: a classifier function (Employee::getDepartment) and a downstream collector (Collectors.averagingInt(Employee::getSalary)) that calculates the average salary within each department.

The result is a Map whose keys are the department names and whose values are the average salaries. The values are Double because averagingInt always produces a Double, even though the salaries are ints. Finally, the code prints each department name with its average salary. The order of the printed lines is not guaranteed, since groupingBy makes no promise about the type of Map it returns.

import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class Employee {
    private String name;
    private String department;
    private int salary;

    public Employee(String name, String department, int salary) {
        this.name = name;
        this.department = department;
        this.salary = salary;
    }

    public String getName() {
        return name;
    }

    public String getDepartment() {
        return department;
    }

    public int getSalary() {
        return salary;
    }

    @Override
    public String toString() {
        return "Employee{" +
                "name='" + name + '\'' +
                ", department='" + department + '\'' +
                ", salary=" + salary +
                '}';
    }
}

public class GroupingByAverageSalary {
    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee("Alice", "Sales", 50000),
                new Employee("Bob", "Sales", 60000),
                new Employee("Charlie", "Engineering", 80000),
                new Employee("David", "Engineering", 90000),
                new Employee("Eve", "Marketing", 70000)
        );

        // Group employees by department and calculate the average salary
        Map<String, Double> averageSalaryByDepartment = employees.stream()
                .collect(Collectors.groupingBy(
                        Employee::getDepartment,
                        Collectors.averagingInt(Employee::getSalary)
                ));

        // Print the result
        averageSalaryByDepartment.forEach((department, averageSalary) -> {
            System.out.println("Department: " + department + ", Average Salary: " + averageSalary);
        });
    }
}

Concepts Behind the Snippet

This snippet combines two key concepts: grouping and aggregation. Collectors.groupingBy groups elements based on a classifier function, and the downstream collector (Collectors.averagingInt in this case) performs an aggregation operation on the elements within each group. This allows for powerful and concise data analysis.
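
To make the two steps visible, the sketch below first groups salaries without aggregating them, then swaps in the averaging downstream collector. The nested Employee class is a trimmed-down stand-in for the one above, and the class and method names are illustrative only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class GroupingThenAggregating {
    // Trimmed-down stand-in for the Employee class above (illustrative only)
    static class Employee {
        private final String department;
        private final int salary;

        Employee(String department, int salary) {
            this.department = department;
            this.salary = salary;
        }

        String getDepartment() { return department; }
        int getSalary() { return salary; }
    }

    // Grouping only: each department maps to the raw salaries of its members
    static Map<String, List<Integer>> salariesByDepartment(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.groupingBy(
                        Employee::getDepartment,
                        Collectors.mapping(Employee::getSalary, Collectors.toList())));
    }

    // Grouping plus aggregation: each department maps to one averaged value
    static Map<String, Double> averageByDepartment(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.groupingBy(
                        Employee::getDepartment,
                        Collectors.averagingInt(Employee::getSalary)));
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee("Sales", 50000),
                new Employee("Sales", 60000),
                new Employee("Engineering", 80000),
                new Employee("Engineering", 90000));

        // TreeMap copies are used only to print the keys in sorted order
        System.out.println(new TreeMap<>(salariesByDepartment(employees)));
        // prints {Engineering=[80000, 90000], Sales=[50000, 60000]}
        System.out.println(new TreeMap<>(averageByDepartment(employees)));
        // prints {Engineering=85000.0, Sales=55000.0}
    }
}
```

Only the downstream collector differs between the two methods; the classifier stays the same.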

Real-Life Use Case

In a sales context, you could use this approach to group sales by region and calculate the average sales amount per region. In a finance context, you could group transactions by category and calculate the average transaction amount per category.
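
As a rough illustration of the sales case, the sketch below averages sale amounts per region. The Sale class, its fields, and the sample figures are all invented for this example; averagingDouble is used because the amounts are doubles.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class AverageSaleByRegion {
    // Hypothetical Sale type, invented for this example
    static class Sale {
        private final String region;
        private final double amount;

        Sale(String region, double amount) {
            this.region = region;
            this.amount = amount;
        }

        String getRegion() { return region; }
        double getAmount() { return amount; }
    }

    // Same pattern as the salary example, with averagingDouble for double amounts
    static Map<String, Double> averageAmountByRegion(List<Sale> sales) {
        return sales.stream()
                .collect(Collectors.groupingBy(
                        Sale::getRegion,
                        Collectors.averagingDouble(Sale::getAmount)));
    }

    public static void main(String[] args) {
        List<Sale> sales = Arrays.asList(
                new Sale("North", 100.0),
                new Sale("North", 200.0),
                new Sale("South", 50.0));

        // TreeMap copy is used only to print the regions in sorted order
        System.out.println(new TreeMap<>(averageAmountByRegion(sales)));
        // prints {North=150.0, South=50.0}
    }
}
```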

Best Practices

  • Choose the appropriate averaging collector based on the data type of the attribute you are averaging (averagingInt for integers, averagingLong for longs, averagingDouble for doubles).
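  • Remember that a department with no employees never appears in the result map, so map.get(...) returns null for it. Use getOrDefault or containsKey, or wrap the lookup in an OptionalDouble, before unboxing the value to avoid a NullPointerException.
  • Be mindful of overflow when averaging very large values. In OpenJDK, averagingInt and averagingLong keep the running sum in a long, so int salaries are unlikely to overflow, but a sum of large long values can.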
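
The lookup advice above can be sketched as follows. The "Legal" department, the averageFor helper, and the trimmed-down Employee class are assumptions made for this example, not part of the original snippet.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.OptionalDouble;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MissingDepartmentLookup {
    // Trimmed-down stand-in for the Employee class above (illustrative only)
    static class Employee {
        private final String department;
        private final int salary;

        Employee(String department, int salary) {
            this.department = department;
            this.salary = salary;
        }

        String getDepartment() { return department; }
        int getSalary() { return salary; }
    }

    static Map<String, Double> averageByDepartment(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.groupingBy(
                        Employee::getDepartment,
                        Collectors.averagingInt(Employee::getSalary)));
    }

    // Hypothetical helper: wraps the lookup so callers never unbox a null Double
    static OptionalDouble averageFor(Map<String, Double> averages, String department) {
        Double average = averages.get(department);
        return average == null ? OptionalDouble.empty() : OptionalDouble.of(average);
    }

    public static void main(String[] args) {
        Map<String, Double> averages = averageByDepartment(Arrays.asList(
                new Employee("Sales", 50000),
                new Employee("Sales", 60000)));

        // "Legal" has no employees, so it is not a key in the map at all
        System.out.println(averages.containsKey("Legal"));               // prints false
        System.out.println(averages.getOrDefault("Legal", 0.0));         // prints 0.0
        System.out.println(averageFor(averages, "Legal").isPresent());   // prints false
        System.out.println(averageFor(averages, "Sales").getAsDouble()); // prints 55000.0

        // Applied directly to an empty stream, averagingInt yields 0.0
        Double emptyAverage = Stream.<Employee>empty()
                .collect(Collectors.averagingInt(Employee::getSalary));
        System.out.println(emptyAverage);                                // prints 0.0
    }
}
```

Whether a missing department should be reported as 0.0 or as "no data" is a design choice; the OptionalDouble variant keeps the two cases distinguishable.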

Interview Tip

Be prepared to explain the concept of downstream collectors and how they can be used to perform aggregation operations within each group. Also, be ready to discuss different types of averaging collectors and their use cases.

When to Use Them

Use this combination of groupingBy and averagingInt (or similar averaging collectors) when you need to calculate the average value of a specific attribute for each group of objects. It is particularly useful for generating summary reports and performing data analysis.

Memory Footprint

The memory footprint is similar to the previous example, with the addition of one average value per group. The averaging collectors keep only a small running state per group (such as a sum and a count), so their overhead is low.

Alternatives

You could achieve the same result using a traditional loop and manually calculating the average salary for each department. However, the stream-based approach is often more concise and expressive.
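
For comparison, a loop-based version might look like the sketch below. It is one possible way to write it, not the only one; the class name and the long[] accumulator layout are choices made for this example, and the Employee class is again a trimmed-down stand-in.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LoopAverageByDepartment {
    // Trimmed-down stand-in for the Employee class above (illustrative only)
    static class Employee {
        private final String department;
        private final int salary;

        Employee(String department, int salary) {
            this.department = department;
            this.salary = salary;
        }

        String getDepartment() { return department; }
        int getSalary() { return salary; }
    }

    static Map<String, Double> averageByDepartment(List<Employee> employees) {
        // Per department: index 0 holds the salary sum, index 1 the employee count
        Map<String, long[]> totals = new HashMap<>();
        for (Employee employee : employees) {
            long[] total = totals.computeIfAbsent(employee.getDepartment(), key -> new long[2]);
            total[0] += employee.getSalary();
            total[1]++;
        }

        // Second pass: turn each sum/count pair into an average
        Map<String, Double> averages = new HashMap<>();
        for (Map.Entry<String, long[]> entry : totals.entrySet()) {
            long[] total = entry.getValue();
            averages.put(entry.getKey(), (double) total[0] / total[1]);
        }
        return averages;
    }

    public static void main(String[] args) {
        Map<String, Double> averages = averageByDepartment(Arrays.asList(
                new Employee("Sales", 50000),
                new Employee("Sales", 60000),
                new Employee("Engineering", 80000),
                new Employee("Engineering", 90000)));

        System.out.println(averages.get("Sales"));       // prints 55000.0
        System.out.println(averages.get("Engineering")); // prints 85000.0
    }
}
```

The loop needs two passes and explicit bookkeeping, which is the work that groupingBy and averagingInt handle for you.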

Pros

  • Concise and readable code.
  • Automatic calculation of the average value for each group.
  • Can be easily combined with other stream operations.

Cons

  • Slight performance overhead compared to manual iteration for very simple averaging scenarios.

FAQ

  • What happens if a department has no employees?

    With groupingBy, a department with no employees never becomes a key, so it is simply absent from the result map and no average is computed for it. The 0.0 result applies only when averagingInt is used directly on an empty stream. If a report must list every known department, look up each name with getOrDefault (or check containsKey first) and decide how to present the missing ones.
  • Can I use other aggregation functions besides averaging?

    Yes, you can use a variety of other aggregation functions as downstream collectors, such as Collectors.summingInt, Collectors.maxBy, Collectors.minBy, and Collectors.counting.
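
A brief sketch of three of these downstream collectors follows. The class and method names are illustrative, and the Employee class is a shortened stand-in for the one in the snippet.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;
import java.util.stream.Collectors;

class OtherDownstreamCollectors {
    // Shortened stand-in for the Employee class above (illustrative only)
    static class Employee {
        private final String name;
        private final String department;
        private final int salary;

        Employee(String name, String department, int salary) {
            this.name = name;
            this.department = department;
            this.salary = salary;
        }

        String getName() { return name; }
        String getDepartment() { return department; }
        int getSalary() { return salary; }
    }

    // Total salary per department
    static Map<String, Integer> totalSalary(List<Employee> employees) {
        return employees.stream().collect(Collectors.groupingBy(
                Employee::getDepartment, Collectors.summingInt(Employee::getSalary)));
    }

    // Number of employees per department
    static Map<String, Long> headcount(List<Employee> employees) {
        return employees.stream().collect(Collectors.groupingBy(
                Employee::getDepartment, Collectors.counting()));
    }

    // Highest-paid employee per department; maxBy wraps the result in an Optional
    static Map<String, Optional<Employee>> topEarner(List<Employee> employees) {
        return employees.stream().collect(Collectors.groupingBy(
                Employee::getDepartment,
                Collectors.maxBy(Comparator.comparingInt(Employee::getSalary))));
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee("Alice", "Sales", 50000),
                new Employee("Bob", "Sales", 60000),
                new Employee("Charlie", "Engineering", 80000),
                new Employee("David", "Engineering", 90000));

        // TreeMap copies are used only to print the keys in sorted order
        System.out.println(new TreeMap<>(totalSalary(employees)));
        // prints {Engineering=170000, Sales=110000}
        System.out.println(new TreeMap<>(headcount(employees)));
        // prints {Engineering=2, Sales=2}
        new TreeMap<>(topEarner(employees)).forEach((department, top) ->
                System.out.println(department + ": " + top.map(Employee::getName).orElse("none")));
        // prints "Engineering: David" then "Sales: Bob"
    }
}
```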