Java tutorials > Core Java Fundamentals > Data Structures and Collections > What are thread-safe Collection classes?

What are thread-safe Collection classes?

In concurrent programming, thread safety is crucial when multiple threads access and modify shared data. Java provides several thread-safe Collection classes to help manage data safely in multithreaded environments. These collections provide internal mechanisms to handle concurrent access, preventing data corruption and race conditions. This tutorial will explore these classes and their usage.

Introduction to Thread-Safe Collections

Thread-safe collections are designed to be accessed and modified by multiple threads concurrently without compromising data integrity. Java's java.util.concurrent package offers several implementations that provide thread safety through techniques like synchronization, locking, and atomic operations.

ConcurrentHashMap

ConcurrentHashMap is a thread-safe implementation of the Map interface. It provides high concurrency by dividing the map into segments and allowing multiple threads to operate on different segments simultaneously. Instead of locking the entire map, it uses finer-grained locking at the segment level, improving performance.

  • The example demonstrates two threads adding elements to the ConcurrentHashMap concurrently.
  • t1 and t2 threads each add 1000 key-value pairs to the map.
  • The join() method ensures that the main thread waits for both t1 and t2 to complete before printing the size of the map.

import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentHashMapExample {
    public static void main(String[] args) throws InterruptedException {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();

        // Thread 1: Add elements
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                map.put("Thread1_Key" + i, i);
            }
        });

        // Thread 2: Add elements
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                map.put("Thread2_Key" + i, i);
            }
        });

        t1.start();
        t2.start();

        t1.join();
        t2.join();

        System.out.println("Size of ConcurrentHashMap: " + map.size()); // Expected output: 2000
    }
}

CopyOnWriteArrayList

CopyOnWriteArrayList is a thread-safe variant of ArrayList. All mutative operations (add, set, remove, etc.) are implemented by making a fresh copy of the underlying array. This ensures that iterators are always consistent and don't throw ConcurrentModificationException.

  • The example showcases two threads adding elements to the CopyOnWriteArrayList concurrently.
  • t1 and t2 threads each add 100 elements to the list.
  • Reading operations are very fast since they operate on the existing array.

import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class CopyOnWriteArrayListExample {
    public static void main(String[] args) throws InterruptedException {
        CopyOnWriteArrayList<String> list = new CopyOnWriteArrayList<>();

        // Thread 1: Add elements
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < 100; i++) {
                list.add("Thread1_Element" + i);
            }
        });

        // Thread 2: Add elements
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < 100; i++) {
                list.add("Thread2_Element" + i);
            }
        });

        t1.start();
        t2.start();

        t1.join();
        t2.join();

        System.out.println("Size of CopyOnWriteArrayList: " + list.size()); // Expected output: 200
    }
}

ConcurrentLinkedQueue

ConcurrentLinkedQueue is an unbounded thread-safe queue based on linked nodes. It provides non-blocking operations, meaning that operations like offer (add) and poll (remove) are designed to be fast and not block the calling thread.

  • The example illustrates two threads adding elements to the ConcurrentLinkedQueue concurrently.
  • t1 and t2 threads each add 100 elements to the queue.
  • The queue is unbounded, meaning it can grow dynamically.

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class ConcurrentLinkedQueueExample {
    public static void main(String[] args) throws InterruptedException {
        Queue<String> queue = new ConcurrentLinkedQueue<>();

        // Thread 1: Add elements
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < 100; i++) {
                queue.offer("Thread1_Element" + i);
            }
        });

        // Thread 2: Add elements
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < 100; i++) {
                queue.offer("Thread2_Element" + i);
            }
        });

        t1.start();
        t2.start();

        t1.join();
        t2.join();

        System.out.println("Size of ConcurrentLinkedQueue: " + queue.size()); // Expected output: 200
    }
}

BlockingQueue Interfaces and Implementations

BlockingQueue is an interface that represents a queue that is thread-safe and supports blocking operations. Implementations include ArrayBlockingQueue, LinkedBlockingQueue, PriorityBlockingQueue, and DelayQueue.

  • ArrayBlockingQueue: A bounded blocking queue backed by an array.
  • LinkedBlockingQueue: An optionally bounded blocking queue based on linked nodes.
  • PriorityBlockingQueue: An unbounded blocking queue that orders elements according to their priority.
  • DelayQueue: A blocking queue in which elements are delayed until a given delay has expired.

The example demonstrates ArrayBlockingQueue with a producer and consumer thread. The put() method blocks if the queue is full, and the take() method blocks if the queue is empty.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ArrayBlockingQueueExample {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(10);

        // Producer Thread
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < 20; i++) {
                    queue.put("Element" + i); // Blocks if queue is full
                    System.out.println("Produced: Element" + i);
                    Thread.sleep(100); // Simulate production time
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Consumer Thread
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 20; i++) {
                    String element = queue.take(); // Blocks if queue is empty
                    System.out.println("Consumed: " + element);
                    Thread.sleep(200); // Simulate consumption time
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();

        producer.join();
        consumer.join();

        System.out.println("Queue is empty: " + queue.isEmpty());
    }
}

Concepts behind the snippet

Concurrency: Allows multiple tasks to execute simultaneously, improving application responsiveness and throughput.

Synchronization: Mechanisms like locks ensure that only one thread can access a critical section of code at a time, preventing data corruption.

Atomic Operations: Operations that execute as a single, indivisible unit, preventing race conditions.

Immutability: Creating immutable objects (objects whose state cannot be changed after creation) inherently avoids concurrency issues.

Lock Striping: The technique of using multiple locks to guard different parts of a data structure, allowing concurrent access to different regions.

Real-Life Use Case Section

High-Performance Servers: Thread-safe collections are essential in servers handling multiple client requests concurrently. For example, a web server might use a ConcurrentHashMap to store session data or a BlockingQueue to manage incoming requests.

Data Processing Pipelines: In data processing applications, multiple threads might process data concurrently. A ConcurrentLinkedQueue or BlockingQueue can be used to pass data between processing stages.

Caching: A caching system can use ConcurrentHashMap to store frequently accessed data, allowing multiple threads to retrieve data concurrently.

Best Practices

Choose the Right Collection: Select the thread-safe collection that best fits your application's needs. Consider factors like the frequency of reads and writes, the number of threads accessing the collection, and whether blocking operations are required.

Minimize Locking: Avoid holding locks for extended periods, as this can reduce concurrency. Use lock striping or other techniques to reduce lock contention.

Use Atomic Operations: When possible, use atomic operations to update shared variables, as they are more efficient and less error-prone than explicit locking.

Consider Immutability: If possible, use immutable objects to avoid concurrency issues altogether.

Interview Tip

Be prepared to discuss the differences between various thread-safe collections and their performance characteristics. Understand the trade-offs between different synchronization mechanisms, such as locks, atomic operations, and copy-on-write strategies.

Common questions include: When would you use a ConcurrentHashMap vs. a Hashtable? What are the benefits and drawbacks of using CopyOnWriteArrayList? How does a BlockingQueue work, and what are its use cases?

When to use them

Use thread-safe collections when:

  • Multiple threads access and modify a collection concurrently.
  • Data integrity is critical.
  • You need to avoid race conditions and data corruption.

Avoid thread-safe collections when:

  • The collection is only accessed by a single thread.
  • The performance overhead of thread safety is unacceptable.

Memory footprint

The memory footprint of thread-safe collections depends on the specific implementation.

  • ConcurrentHashMap has a moderate memory overhead due to its segment-based structure.
  • CopyOnWriteArrayList has a high memory overhead because it creates a new copy of the underlying array for each mutative operation.
  • ConcurrentLinkedQueue has a moderate memory overhead due to its linked node structure.
  • ArrayBlockingQueue has a fixed memory footprint determined by its capacity.
  • LinkedBlockingQueue's memory footprint depends on the number of elements it contains, but it can grow without bound (if unbounded).

Alternatives

Alternatives to thread-safe collections include:

  • External Synchronization: Using the synchronized keyword or explicit locks to guard access to non-thread-safe collections.
  • Immutable Collections: Using immutable collections, which are inherently thread-safe.
  • Thread-Local Variables: Storing a separate copy of the collection in each thread using ThreadLocal.

Pros

Thread Safety: Guarantee data integrity in multithreaded environments.

High Concurrency: Allow multiple threads to access and modify the collection concurrently.

Simplified Development: Reduce the complexity of concurrent programming by providing built-in synchronization mechanisms.

Cons

Performance Overhead: Introduce performance overhead due to synchronization mechanisms.

Increased Memory Usage: May require more memory due to additional data structures or copy-on-write strategies.

Complexity: Can be more complex to understand and use than non-thread-safe collections.

FAQ

  • What is the difference between ConcurrentHashMap and Hashtable?

    ConcurrentHashMap provides better concurrency than Hashtable. Hashtable synchronizes the entire map, while ConcurrentHashMap uses finer-grained locking (segment locking), allowing multiple threads to access different parts of the map concurrently.
  • When should I use CopyOnWriteArrayList?

    Use CopyOnWriteArrayList when you have a list that is frequently read but infrequently modified. It is useful when iterating over the list while other threads are modifying it, as it avoids ConcurrentModificationException.
  • What is a BlockingQueue and how does it work?

    BlockingQueue is a thread-safe queue that supports blocking operations. It allows threads to wait (block) until an element is available to be retrieved (take()) or until space is available to add an element (put()). This is useful for implementing producer-consumer patterns.