Parallel Stream Example: Summing a List of Numbers
This code snippet demonstrates how to use parallel streams to sum a large list of numbers. For CPU-bound operations on large datasets, parallel streams can reduce execution time by distributing the workload across multiple CPU cores, provided the gain outweighs the overhead of splitting the work and merging the results.
Code Example
This code creates a list of 10 million numbers and calculates their sum using both a sequential stream and a parallel stream. The execution time of each approach is measured with `System.nanoTime()` and printed to the console in milliseconds. Notice the use of `numbers.parallelStream()` to create a parallel stream. The example also demonstrates `LongStream.rangeClosed(...).parallel()` as an alternative that generates the numbers as a primitive stream instead of reading them from a list. Each measurement runs only once, so the timings are indicative rather than a rigorous benchmark.
```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.LongStream;

public class ParallelStreamSum {
    public static void main(String[] args) {
        // Create a large list of numbers
        List<Long> numbers = new ArrayList<>();
        for (long i = 1; i <= 10_000_000; i++) {
            numbers.add(i);
        }

        // Sequential Stream
        long startTimeSequential = System.nanoTime();
        long sumSequential = numbers.stream().mapToLong(Long::longValue).sum();
        long endTimeSequential = System.nanoTime();
        long durationSequential = (endTimeSequential - startTimeSequential) / 1_000_000;
        System.out.println("Sequential Sum: " + sumSequential);
        System.out.println("Sequential Time: " + durationSequential + " ms");

        // Parallel Stream
        long startTimeParallel = System.nanoTime();
        long sumParallel = numbers.parallelStream().mapToLong(Long::longValue).sum();
        long endTimeParallel = System.nanoTime();
        long durationParallel = (endTimeParallel - startTimeParallel) / 1_000_000;
        System.out.println("Parallel Sum: " + sumParallel);
        System.out.println("Parallel Time: " + durationParallel + " ms");

        // Alternative: Using LongStream.rangeClosed for generating numbers
        long startTimeParallelLongStream = System.nanoTime();
        long sumParallelLongStream = LongStream.rangeClosed(1, 10_000_000).parallel().sum();
        long endTimeParallelLongStream = System.nanoTime();
        long durationParallelLongStream = (endTimeParallelLongStream - startTimeParallelLongStream) / 1_000_000;
        System.out.println("Parallel LongStream Sum: " + sumParallelLongStream);
        System.out.println("Parallel LongStream Time: " + durationParallelLongStream + " ms");
    }
}
```
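Single-shot timings like the ones above are sensitive to JIT compilation and class loading, so the first measured run is often the slowest. The sketch below is one possible way to reduce that noise: it repeats each measurement and keeps the fastest run. The class `RepeatedTiming`, the helper `bestOfNanos`, and the `sink` field are hypothetical names introduced for illustration, and this is not a substitute for a dedicated benchmarking harness such as JMH.

```java
import java.util.function.LongSupplier;
import java.util.stream.LongStream;

public class RepeatedTiming {

    // Holds the last computed result so the work is not trivially discardable.
    static volatile long sink;

    // Runs the task `runs` times (runs >= 1) and returns the fastest
    // observed duration in nanoseconds.
    static long bestOfNanos(int runs, LongSupplier task) {
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            sink = task.getAsLong();
            best = Math.min(best, System.nanoTime() - start);
        }
        return best;
    }

    public static void main(String[] args) {
        LongSupplier sequential = () -> LongStream.rangeClosed(1, 10_000_000).sum();
        LongSupplier parallel = () -> LongStream.rangeClosed(1, 10_000_000).parallel().sum();

        // Unmeasured warm-up runs give the JIT a chance to compile the hot paths.
        bestOfNanos(5, sequential);
        bestOfNanos(5, parallel);

        System.out.println("Sequential best: " + bestOfNanos(10, sequential) / 1_000_000 + " ms");
        System.out.println("Parallel best:   " + bestOfNanos(10, parallel) / 1_000_000 + " ms");
    }
}
```

The absolute numbers depend on the machine and JVM, so treat them as a rough comparison only.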
Concepts Behind the Snippet
Parallel Streams: Java's parallel streams leverage the Fork/Join framework (by default, the common `ForkJoinPool`) to divide a stream into smaller sub-streams that can be processed concurrently across multiple threads. This can lead to noticeable performance gains for CPU-bound operations on large datasets.
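To make the threading visible, the following sketch records the names of the threads that process elements of a parallel stream. The class `ParallelThreadsDemo` and the method `workerThreadNames` are hypothetical names chosen for this illustration. Which threads appear, and how many, depends on the machine and JVM, so no specific output is claimed.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.LongStream;

public class ParallelThreadsDemo {

    // Runs a parallel sum and collects the name of every thread
    // that processed at least one element.
    static Set<String> workerThreadNames(long upperBound) {
        // A concurrent set, because several threads add to it at once.
        Set<String> names = ConcurrentHashMap.newKeySet();
        LongStream.rangeClosed(1, upperBound)
                .parallel()
                .peek(i -> names.add(Thread.currentThread().getName()))
                .sum();
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Common pool parallelism: "
                + ForkJoinPool.getCommonPoolParallelism());
        // Typically lists the calling thread plus common-pool worker threads.
        workerThreadNames(1_000_000).forEach(System.out::println);
    }
}
```

The `peek` call is used here only for observation; it is not something production pipelines would normally need.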
Fork/Join Framework: This framework is designed for parallel, recursive task decomposition. It divides a large task into smaller, independent subtasks, executes them in parallel, and then combines the results.
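The divide, execute, and combine steps described above can be sketched by hand with a `RecursiveTask`. This is a minimal illustration, not how the Stream API is implemented internally; the class name `ForkJoinSum` and the `THRESHOLD` value of 10,000 are assumptions made for the example.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.stream.LongStream;

public class ForkJoinSum extends RecursiveTask<Long> {

    // Below this size a range is summed directly (assumed cut-off).
    private static final int THRESHOLD = 10_000;

    private final long[] data;
    private final int from; // inclusive
    private final int to;   // exclusive

    ForkJoinSum(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            // Small enough: sum sequentially.
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        // Divide the range into two independent halves.
        int mid = (from + to) >>> 1;
        ForkJoinSum left = new ForkJoinSum(data, from, mid);
        ForkJoinSum right = new ForkJoinSum(data, mid, to);
        left.fork();                          // run the left half asynchronously
        long rightResult = right.compute();   // compute the right half in this thread
        return left.join() + rightResult;     // combine the partial results
    }

    static long sum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new ForkJoinSum(data, 0, data.length));
    }

    public static void main(String[] args) {
        long[] data = LongStream.rangeClosed(1, 1_000_000).toArray();
        System.out.println(sum(data)); // prints 500000500000
    }
}
```

A parallel stream hides this decomposition behind `parallel()`, which is why the stream version of the same sum is a single line.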
Stream API: Provides a functional programming approach to processing sequences of elements. Parallel streams are an extension of the Stream API designed for parallel execution.
Real-Life Use Case
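Parallel streams are particularly useful in scenarios involving CPU-bound computations over large datasets, such as the summation shown above, where each element can be processed independently of the others.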
Best Practices
Interview Tip
When discussing parallel streams in an interview, be sure to mention the Fork/Join framework, potential performance benefits, and the importance of avoiding shared mutable state. Also, be prepared to discuss the trade-offs between sequential and parallel processing.
When to use them
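Use parallel streams when the dataset is large, the per-element work is CPU-bound rather than I/O-bound, and the stream operations do not rely on shared mutable state.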
Memory Footprint
Parallel streams can increase the memory footprint of your application because work is spread across several threads, each of which may hold its own partial results until they are merged, and some operations may buffer elements along the way. Be mindful of memory usage, especially when dealing with extremely large datasets.
Alternatives
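Alternatives to parallel streams include using the Fork/Join framework directly or managing your own threads, for example through an `ExecutorService`.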
Pros
Cons
FAQ
When will a sequential stream be faster than a parallel stream?
A sequential stream can be faster than a parallel stream when the dataset is small, the operation is I/O-bound, or the overhead of parallelization outweighs the benefits of parallel execution. Also, if there's a lot of thread contention or synchronization overhead, sequential execution might be faster.
How do I ensure thread safety when using parallel streams?
Avoid using shared mutable state within stream operations. If shared mutable state is unavoidable, use thread-safe data structures (e.g., `ConcurrentHashMap`, `AtomicInteger`) and synchronization mechanisms (e.g., locks, semaphores) to protect the data.
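As a rough illustration of both options, the sketch below first avoids shared state entirely by letting `collect` assemble the result, and then shows a shared counter protected by an `AtomicInteger`. The class `ThreadSafeAccumulation` and its method names are hypothetical and exist only for this example.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ThreadSafeAccumulation {

    // Preferred: no shared state. The stream builds partial lists per
    // thread and merges them, preserving encounter order.
    static List<Integer> evenSquares(int upperBound) {
        return IntStream.rangeClosed(1, upperBound)
                .parallel()
                .filter(n -> n % 2 == 0)
                .map(n -> n * n)
                .boxed()
                .collect(Collectors.toList());
    }

    // Fallback when a shared counter is unavoidable: AtomicInteger makes
    // each increment safe under concurrent access.
    static int countEvens(int upperBound) {
        AtomicInteger counter = new AtomicInteger();
        IntStream.rangeClosed(1, upperBound)
                .parallel()
                .forEach(n -> {
                    if (n % 2 == 0) {
                        counter.incrementAndGet();
                    }
                });
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(evenSquares(10)); // prints [4, 16, 36, 64, 100]
        System.out.println(countEvens(1_000)); // prints 500
    }
}
```

Replacing the `AtomicInteger` with a plain `int[]` or adding to an unsynchronized `ArrayList` inside `forEach` would be a data race, which is the situation the answer above warns against.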
Can I use parallel streams with any collection?
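Yes, you can create a parallel stream from any `Collection` by calling the `parallelStream()` method. However, the performance benefits of parallel streams will vary depending on the data structure and the operation being performed. Index-based structures such as `ArrayList` can be split into even halves cheaply, whereas a `LinkedList` has to be traversed to be split and therefore tends to parallelize poorly.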
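The following sketch shows that the same `parallelStream()` call works across different `Collection` implementations and produces the same result. The class `CollectionParallelSum` and the helper `parallelSum` are hypothetical names used for illustration. The example checks correctness only and makes no claim about relative speed.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;

public class CollectionParallelSum {

    // Sums any collection of Long values using a parallel stream.
    static long parallelSum(Collection<Long> values) {
        return values.parallelStream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        List<Long> arrayList = new ArrayList<>();
        for (long i = 1; i <= 100_000; i++) {
            arrayList.add(i);
        }
        // Same elements, different underlying data structures.
        Collection<Long> linkedList = new LinkedList<>(arrayList);
        Collection<Long> hashSet = new HashSet<>(arrayList);

        System.out.println(parallelSum(arrayList));  // prints 5000050000
        System.out.println(parallelSum(linkedList)); // prints 5000050000
        System.out.println(parallelSum(hashSet));    // prints 5000050000
    }
}
```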