
What is the Stream API and how to use it?

Introduction to the Stream API

The Stream API, introduced in Java 8, lets you process sequences of elements in a declarative style: you state what should happen to the data (filter it, transform it, aggregate it) instead of writing the loop yourself, which often makes the code easier to read and maintain. Streams are not data structures; they are pipelines that pull elements from a data source such as a collection, an array, or an I/O channel.

This tutorial will guide you through the basics of the Stream API, demonstrating its key features and providing practical examples.

Core Concepts of the Stream API

Key Concepts

The Stream API revolves around several core concepts:

  1. Streams: A sequence of elements supporting sequential and parallel aggregate operations.
  2. Intermediate Operations: Operations that transform or filter the stream (e.g., filter, map, sorted). These operations are lazy, meaning they are not executed until a terminal operation is invoked.
  3. Terminal Operations: Operations that produce a result or side-effect, triggering the processing of the stream (e.g., forEach, collect, count, reduce).
  4. Pipelines: A sequence of stream operations, starting with a source, followed by zero or more intermediate operations, and ending with a terminal operation.

Streams do not modify the original data source; instead, they create new streams or produce a result based on the original data.
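
The laziness described above can be observed directly. The following is a minimal sketch rather than part of the tutorial's own examples: the class name LazyPipelineDemo, the helper buildPipeline, and the log list are illustrative assumptions. The filter records each element it inspects, which shows that nothing is inspected until a terminal operation runs and that the source list stays unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyPipelineDemo {
    // Builds a pipeline (source plus two intermediate operations) without running it.
    // The log parameter is a demonstration aid only; real pipelines should avoid side effects.
    static Stream<String> buildPipeline(List<String> source, List<String> log) {
        return source.stream()
                .filter(name -> {
                    log.add(name);            // record that the filter inspected this element
                    return name.length() > 3;
                })
                .map(String::toUpperCase);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Alice", "Bob", "Charlie");
        List<String> log = new ArrayList<>();

        Stream<String> pipeline = buildPipeline(names, log);
        System.out.println("Inspected before terminal operation: " + log); // []

        List<String> result = pipeline.collect(Collectors.toList());       // terminal operation
        System.out.println("Inspected after terminal operation: " + log);  // [Alice, Bob, Charlie]
        System.out.println("Result: " + result);                           // [ALICE, CHARLIE]
        System.out.println("Source: " + names);                            // [Alice, Bob, Charlie]
    }
}
```

The log stays empty until collect() is called, because filter and map only describe work; the terminal operation is what pulls elements through the pipeline.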

Creating a Stream

Creating Streams

Streams can be created from various sources:

  1. From a List: Use the stream() method, which List inherits from the Collection interface (so any other collection works the same way).
  2. From an Array: Use the Arrays.stream() method.
  3. Using Stream.of(): Create a stream from individual elements.
  4. Using Stream.builder(): Add elements one at a time to a Stream.Builder, then call build() to obtain the stream.

The following snippet creates a stream from each of these sources.

import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

public class StreamCreationExample {
    public static void main(String[] args) {
        // From a List
        List<String> names = Arrays.asList("Alice", "Bob", "Charlie");
        Stream<String> nameStream = names.stream();

        // From an Array
        String[] colors = {"Red", "Green", "Blue"};
        Stream<String> colorStream = Arrays.stream(colors);

        // Using Stream.of()
        Stream<Integer> numberStream = Stream.of(1, 2, 3, 4, 5);

        // Using Stream.builder()
        Stream.Builder<String> builder = Stream.builder();
        builder.add("Dog");
        builder.add("Cat");
        builder.add("Bird");
        Stream<String> animalStream = builder.build();
    }
}

Intermediate Operations: Filtering

Filtering with Streams

The filter() operation allows you to select elements from a stream that match a given predicate (a boolean-valued function). In this example, we filter names that start with the letter 'A'.

.filter(name -> name.startsWith("A")) keeps only the names starting with 'A'. The collect(Collectors.toList()) gathers the resulting elements into a new list.

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class StreamFilterExample {
    public static void main(String[] args) {
        List<String> names = Arrays.asList("Alice", "Bob", "Charlie", "Anna");

        // Filter names that start with 'A'
        List<String> filteredNames = names.stream()
                .filter(name -> name.startsWith("A"))
                .collect(Collectors.toList());

        System.out.println(filteredNames); // Output: [Alice, Anna]
    }
}

Intermediate Operations: Mapping

Mapping with Streams

The map() operation transforms each element of a stream into another element. It applies a function to each element and produces a new stream with the transformed elements. In this example, we convert names to uppercase.

.map(String::toUpperCase) applies the toUpperCase() method to each name. Again, collect(Collectors.toList()) collects the results into a list.

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class StreamMapExample {
    public static void main(String[] args) {
        List<String> names = Arrays.asList("Alice", "Bob", "Charlie");

        // Convert names to uppercase
        List<String> uppercaseNames = names.stream()
                .map(String::toUpperCase)
                .collect(Collectors.toList());

        System.out.println(uppercaseNames); // Output: [ALICE, BOB, CHARLIE]
    }
}

Intermediate Operations: Sorting

Sorting with Streams

The sorted() operation sorts the elements of a stream. With no arguments it uses the elements' natural order, so the elements must implement Comparable. You can also pass a Comparator for other orderings.

.sorted() sorts the names in natural order, which for String is lexicographic and case-sensitive. The result is collected into a list.

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class StreamSortExample {
    public static void main(String[] args) {
        List<String> names = Arrays.asList("Charlie", "Alice", "Bob");

        // Sort names alphabetically
        List<String> sortedNames = names.stream()
                .sorted()
                .collect(Collectors.toList());

        System.out.println(sortedNames); // Output: [Alice, Bob, Charlie]
    }
}
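
The custom-comparator form mentioned above could look like the following sketch. The class and method names are illustrative assumptions; the calls themselves (Comparator.comparingInt, thenComparing, Comparator.reverseOrder) are standard java.util.Comparator methods.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class StreamComparatorSortExample {
    // Shortest names first; names of equal length are ordered alphabetically.
    static List<String> sortByLengthThenName(List<String> names) {
        return names.stream()
                .sorted(Comparator.comparingInt(String::length)
                        .thenComparing(Comparator.<String>naturalOrder()))
                .collect(Collectors.toList());
    }

    // Reverse of the natural (lexicographic) order.
    static List<String> sortDescending(List<String> names) {
        return names.stream()
                .sorted(Comparator.reverseOrder())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Charlie", "Alice", "Bob", "Eve");

        System.out.println(sortByLengthThenName(names)); // Output: [Bob, Eve, Alice, Charlie]
        System.out.println(sortDescending(names));       // Output: [Eve, Charlie, Bob, Alice]
    }
}
```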

Terminal Operations: forEach

Iterating with forEach()

The forEach() operation performs an action for each element of the stream. It's a terminal operation that consumes the stream.

.forEach(System.out::println) prints each name to the console.

import java.util.Arrays;
import java.util.List;

public class StreamForEachExample {
    public static void main(String[] args) {
        List<String> names = Arrays.asList("Alice", "Bob", "Charlie");

        // Print each name
        names.stream()
                .forEach(System.out::println);
        // Output:
        // Alice
        // Bob
        // Charlie
    }
}

Terminal Operations: collect

Collecting Results with collect()

The collect() operation accumulates the elements of a stream into a collection or other data structure. It takes a Collector that describes how to perform the accumulation. Common collectors are provided by the Collectors class.

The example shows how to collect the stream elements into a List and a Set. A Set makes no promise about iteration order, so the printed order of the Set may differ from that of the List.

import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class StreamCollectExample {
    public static void main(String[] args) {
        List<String> names = Arrays.asList("Alice", "Bob", "Charlie");

        // Collect names into a new List
        List<String> nameList = names.stream()
                .collect(Collectors.toList());

        // Collect names into a Set
        Set<String> nameSet = names.stream()
                .collect(Collectors.toSet());

        System.out.println("List: " + nameList);
        System.out.println("Set: " + nameSet);
    }
}
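
The Collectors class offers more than toList() and toSet(). As a hedged sketch of two other standard collectors, the following uses Collectors.joining to build a single String and Collectors.partitioningBy to split elements into two groups. The class name, method names, and the length threshold are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class StreamCollectorsSketch {
    // Concatenates the names into one String, separated by ", ".
    static String joinNames(List<String> names) {
        return names.stream()
                .collect(Collectors.joining(", "));
    }

    // Splits the names into two lists: key true holds names with at least
    // minLength characters, key false holds the rest.
    static Map<Boolean, List<String>> splitByLength(List<String> names, int minLength) {
        return names.stream()
                .collect(Collectors.partitioningBy(name -> name.length() >= minLength));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Alice", "Bob", "Charlie");

        System.out.println(joinNames(names)); // Output: Alice, Bob, Charlie

        Map<Boolean, List<String>> parts = splitByLength(names, 5);
        System.out.println("Long names: " + parts.get(true));   // Output: Long names: [Alice, Charlie]
        System.out.println("Short names: " + parts.get(false)); // Output: Short names: [Bob]
    }
}
```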

Terminal Operations: reduce

Reducing a Stream with reduce()

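The reduce() operation combines the elements of a stream into a single value. The overload used here takes an identity value, which is the starting value and also the result for an empty stream, and an accumulator function that combines the running result with the next element. The accumulator should be associative so that the result does not depend on how the elements are grouped.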

.reduce(0, Integer::sum) starts with an initial value of 0 and adds each number in the stream to the running sum.

import java.util.Arrays;
import java.util.List;

public class StreamReduceExample {
    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);

        // Calculate the sum of the numbers
        int sum = numbers.stream()
                .reduce(0, Integer::sum);

        System.out.println("Sum: " + sum); // Output: Sum: 15
    }
}
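
Two variations on the reduce() example above, offered as a sketch with illustrative class and method names. The first uses a different identity (1 for multiplication). The second uses the standard one-argument overload of reduce(), which takes no identity and therefore returns an Optional that is empty for an empty stream.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

class StreamReduceVariantsExample {
    // Identity 1 is neutral for multiplication, so an empty list yields 1.
    static int product(List<Integer> numbers) {
        return numbers.stream()
                .reduce(1, (a, b) -> a * b);
    }

    // No identity: the result is an empty Optional when there are no elements.
    // On ties the earlier word wins.
    static Optional<String> longest(List<String> words) {
        return words.stream()
                .reduce((a, b) -> a.length() >= b.length() ? a : b);
    }

    public static void main(String[] args) {
        System.out.println(product(Arrays.asList(1, 2, 3, 4, 5)));            // Output: 120
        System.out.println(longest(Arrays.asList("Alice", "Bob", "Charlie"))); // Output: Optional[Charlie]
        System.out.println(longest(Collections.emptyList()));                 // Output: Optional.empty
    }
}
```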

Real-Life Use Case: Processing Orders

Processing Orders with Streams

Consider a scenario where you need to process a list of orders. You can use streams to filter orders based on customer, calculate total amounts, and perform other complex operations.

The example filters the orders of one customer, maps them to their amounts, and sums the amounts to get the total spent by that customer. It also collects the distinct customers who placed at least one order above 100.

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class Order {
    String customer;
    double amount;

    public Order(String customer, double amount) {
        this.customer = customer;
        this.amount = amount;
    }

    public String getCustomer() {
        return customer;
    }

    public double getAmount() {
        return amount;
    }

    @Override
    public String toString() {
        return "Order{customer='" + customer + "', amount=" + amount + "}";
    }
}

public class StreamOrderProcessing {
    public static void main(String[] args) {
        List<Order> orders = Arrays.asList(
                new Order("Alice", 120.0),
                new Order("Bob", 80.0),
                new Order("Alice", 50.0),
                new Order("Charlie", 200.0)
        );

        // Calculate the total amount spent by Alice
        double totalAlice = orders.stream()
                .filter(order -> order.getCustomer().equals("Alice"))
                .mapToDouble(Order::getAmount)
                .sum();

        System.out.println("Total spent by Alice: " + totalAlice); // Output: Total spent by Alice: 170.0

        // Get the distinct customers with at least one order above 100
        List<String> highValueCustomers = orders.stream()
                .filter(order -> order.getAmount() > 100)
                .map(Order::getCustomer)
                .distinct()
                .collect(Collectors.toList());

        System.out.println("High value customers: " + highValueCustomers); // Output: High value customers: [Alice, Charlie]
    }
}
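
A natural next step for this use case is computing the total for every customer at once instead of for one named customer. The sketch below is an assumption about how that might be done, not part of the original example: it uses the standard Collectors.groupingBy and Collectors.summingDouble collectors, and a small stand-in class named Purchase (a hypothetical name) so that the snippet compiles on its own.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class StreamOrderGroupingSketch {
    // Minimal stand-in for the Order class above (hypothetical name, same two fields).
    static class Purchase {
        private final String customer;
        private final double amount;

        Purchase(String customer, double amount) {
            this.customer = customer;
            this.amount = amount;
        }

        String getCustomer() { return customer; }
        double getAmount() { return amount; }
    }

    // Groups purchases by customer and sums the amounts in each group.
    // A TreeMap keeps the keys sorted, which makes the printed output predictable.
    static Map<String, Double> totalsByCustomer(List<Purchase> purchases) {
        return purchases.stream()
                .collect(Collectors.groupingBy(
                        Purchase::getCustomer,
                        TreeMap::new,
                        Collectors.summingDouble(Purchase::getAmount)));
    }

    public static void main(String[] args) {
        List<Purchase> purchases = Arrays.asList(
                new Purchase("Alice", 120.0),
                new Purchase("Bob", 80.0),
                new Purchase("Alice", 50.0),
                new Purchase("Charlie", 200.0)
        );

        System.out.println(totalsByCustomer(purchases)); // Output: {Alice=170.0, Bob=80.0, Charlie=200.0}
    }
}
```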

Best Practices

Best Practices for Using Streams

  • Use Streams for Data Processing: Streams are ideal for processing collections of data in a declarative and efficient manner.
  • Avoid Side Effects in Intermediate Operations: The lambdas passed to operations such as filter and map should be stateless and should not modify the data source or shared variables. They should only test or transform the element they receive.
  • Keep Pipelines Short: Long pipelines can be difficult to read and debug. Break them down into smaller, more manageable parts if necessary.
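  • Consider Parallel Streams for Performance: For large datasets, consider using parallel streams to leverage multi-core processors. Measure before adopting them: splitting the work and merging the results adds overhead that can outweigh the gain on small inputs, and any shared mutable state becomes a thread-safety problem.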
  • Choose the Right Terminal Operation: Select the appropriate terminal operation based on the desired result (e.g., collect for creating a new collection, forEach for performing an action on each element, reduce for combining elements).
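
Two of these practices can be sketched briefly. In the example below (class and method names are illustrative assumptions), squaresOfEvens builds its result through collect() rather than by adding to an outside list from within a lambda, and the two sum methods show that a stateless pipeline gives the same answer whether it runs sequentially or in parallel. Whether the parallel version is actually faster depends on the workload and should be measured.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

class StreamBestPracticesSketch {
    // Side-effect-free: the pipeline itself produces the result list.
    static List<Integer> squaresOfEvens(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)
                .map(n -> n * n)
                .collect(Collectors.toList());
    }

    // Sum of 1..n computed sequentially.
    static long sumSequential(long n) {
        return LongStream.rangeClosed(1, n).sum();
    }

    // Same computation on a parallel stream; safe because no state is shared.
    static long sumParallel(long n) {
        return LongStream.rangeClosed(1, n).parallel().sum();
    }

    public static void main(String[] args) {
        System.out.println(squaresOfEvens(Arrays.asList(1, 2, 3, 4, 5, 6))); // Output: [4, 16, 36]
        System.out.println(sumSequential(1_000_000));                        // Output: 500000500000
        System.out.println(sumParallel(1_000_000));                          // Output: 500000500000
    }
}
```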

When to Use Streams

When to Use the Stream API

The Stream API is most beneficial when:

  • You need to perform complex operations on collections of data.
  • You want to write declarative code that is easy to read and understand.
  • You need to process large datasets efficiently (potentially using parallel streams).

However, streams may not be the best choice for simple iteration or when performance is critical and every millisecond counts. In those cases, traditional loops may be more appropriate.

Memory Footprint

Memory Considerations

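Streams generally have a low memory footprint because they are lazy: a stream does not store the data itself but pulls elements from the data source as needed. Intermediate operations return new streams, and these are also lazy; no data is materialized until a terminal operation is called. Stateful intermediate operations are the exception: sorted() has to buffer every element before it can emit the first one, and distinct() has to remember the elements it has already seen.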

However, terminal operations like collect() can have a significant memory footprint, especially when collecting large datasets into a new collection. Be mindful of the size of the data you are collecting and choose appropriate data structures to minimize memory usage.
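
The difference between aggregating and collecting can be sketched as follows. The class and method names are illustrative assumptions. The first method reduces a generated sequence to a single number without ever holding the elements together, while the second materializes every element in a list.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class StreamMemorySketch {
    // Aggregates without materializing: each value is generated, squared,
    // and added to the running sum one at a time.
    static long sumOfFirstSquares(int count) {
        return IntStream.iterate(1, n -> n + 1)   // conceptually infinite source
                .limit(count)                     // only the first count values are generated
                .mapToLong(n -> (long) n * n)
                .sum();
    }

    // Materializes: the returned list holds all count elements in memory at once.
    static List<Integer> firstSquaresAsList(int count) {
        return Stream.iterate(1, n -> n + 1)
                .limit(count)
                .map(n -> n * n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sumOfFirstSquares(10)); // Output: 385
        System.out.println(firstSquaresAsList(5)); // Output: [1, 4, 9, 16, 25]
    }
}
```

When only a summary value is needed, a reducing terminal operation such as sum(), count(), or reduce() avoids allocating the intermediate collection.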

Alternatives to Stream API

Before Java 8, traditional loops (for, while) were the primary way to process collections of data. Other alternatives include:

  • External Libraries: Libraries like Guava provide utility classes and methods for working with collections, but they don't offer the same level of declarative programming and parallel processing capabilities as the Stream API.
  • Functional Programming Libraries: Libraries like Vavr (formerly Javaslang) offer more comprehensive functional programming features, including immutable data structures and functional data processing pipelines.

The Stream API provides a good balance of ease of use, performance, and functional programming capabilities for most data processing tasks in Java.

Pros and Cons

Pros and Cons of Using Streams

Pros:

  • Declarative Style: Streams allow you to express complex operations in a clear and concise manner.
  • Improved Readability: Stream pipelines are generally easier to read and understand than traditional loops.
  • Parallel Processing: Streams can be easily parallelized to leverage multi-core processors.
  • Lazy Evaluation: Intermediate operations are lazy, which can improve performance by avoiding unnecessary computations.
  • Functional Programming: Encourages the use of functional programming principles, leading to more maintainable code.

Cons:

  • Learning Curve: The Stream API can take some time to learn and master, especially for developers unfamiliar with functional programming concepts.
  • Debugging Complexity: Debugging stream pipelines can be more challenging than debugging traditional loops.
  • Potential Performance Overhead: For simple operations, the overhead of creating and processing streams may outweigh the benefits.
  • Not Suitable for All Tasks: Streams are not always the best choice for every data processing task. Simple iterations or tasks that require mutable state may be better suited for traditional loops.

Interview Tip

When discussing the Stream API in an interview, be prepared to explain the core concepts, such as streams, intermediate operations, and terminal operations. Be able to provide examples of common stream operations like filter, map, sorted, forEach, collect, and reduce. Also, be ready to discuss the benefits and drawbacks of using streams compared to traditional loops. Understanding the concept of lazy evaluation and parallel processing with streams is also beneficial.

FAQ

  • What is the difference between a Stream and a Collection?

    Streams are not data structures like Collections. A Stream is a sequence of elements that supports aggregate operations, while a Collection is a data structure that stores elements. Streams are processed in a lazy manner, while Collections hold data in memory.
  • Can I reuse a Stream after a terminal operation?

    No, a Stream can only be used once. After a terminal operation is performed, the Stream is considered consumed, and any further operation on it throws an IllegalStateException. You'll need to create a new Stream from the data source if you want to perform another operation.
  • How do I handle exceptions in a Stream pipeline?

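    You can handle exceptions within a Stream pipeline using try-catch blocks inside the lambda expressions passed to intermediate or terminal operations. Checked exceptions must be handled this way (or rethrown as unchecked exceptions), because the standard functional interfaces such as Function and Predicate do not declare any. However, this can make the code less readable. An alternative is to wrap the potentially exception-throwing operation in a function that returns an Optional and then use flatMap to drop the empty results and unwrap the present ones.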
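
The last two answers can be sketched in code. This is a minimal illustration under stated assumptions: the class name StreamFaqSketch and the helpers tryParse, parseValid, and reuseFails are hypothetical names, and number parsing stands in for any operation that may throw. The flatMap step is written so that it also compiles on Java 8; on Java 9 and later, flatMap(Optional::stream) is a shorter equivalent.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamFaqSketch {
    // Hypothetical helper: turns a parse failure into an empty Optional
    // so the lambda in the pipeline stays free of try-catch.
    static Optional<Integer> tryParse(String text) {
        try {
            return Optional.of(Integer.parseInt(text));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    // Keeps only the inputs that parse successfully.
    static List<Integer> parseValid(List<String> inputs) {
        return inputs.stream()
                .map(StreamFaqSketch::tryParse)
                // An empty Optional becomes an empty stream, a present one a one-element stream.
                .flatMap(opt -> opt.isPresent() ? Stream.of(opt.get()) : Stream.empty())
                .collect(Collectors.toList());
    }

    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean reuseFails() {
        Stream<String> stream = Stream.of("a", "b");
        stream.count();                 // first terminal operation consumes the stream
        try {
            stream.count();             // second terminal operation on the same stream
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseValid(Arrays.asList("1", "two", "3"))); // Output: [1, 3]
        System.out.println("Reuse rejected: " + reuseFails());          // Output: Reuse rejected: true
    }
}
```

Note that this approach silently discards the failures. If the caller needs to know which inputs were rejected, the helper would have to return something richer than an Optional.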