What are threads (`threading`)?
Threads are lightweight, independent units of execution within a single process. The `threading` module in Python provides a way to create and manage these threads. Threads share the same memory space, which allows them to communicate and share data more easily than processes. However, this shared memory space also necessitates careful synchronization to avoid race conditions and other concurrency issues.
Basic Thread Creation
This code demonstrates the basic creation and execution of threads.
First, we import the `threading` and `time` modules.
The `worker` function is defined as the task that each thread will execute. It prints a message, simulates some work using `time.sleep(1)`, and then prints a completion message.
A list called `threads` is created to store the thread objects.
A loop iterates five times, creating a new thread in each iteration. `threading.Thread(target=worker, args=(i,))` creates a new thread object where `target` specifies the function to execute (`worker`) and `args` provides the arguments to that function (the worker number `i`).
`t.start()` starts the thread, causing it to execute the `worker` function concurrently with the main thread.
Finally, `t.join()` is called for each thread. This blocks the main thread until the specified thread has completed its execution, which ensures that the main program waits for all threads to finish before exiting.
import threading
import time

def worker(num):
    """Thread worker function"""
    print(f'Worker: {num}')
    time.sleep(1)  # Simulate some work
    print(f'Worker {num} finished')

threads = []
for i in range(5):
    # Create one thread per worker number and start it immediately
    t = threading.Thread(target=worker, args=(i,))
    threads.append(t)
    t.start()

# Wait for every thread to finish before continuing
for t in threads:
    t.join()

print("All threads finished.")
Concepts Behind the Snippet
The core concept here is concurrency. Threads allow multiple tasks to progress seemingly simultaneously within a single process. The `threading.Thread` class is used to create new threads. The `target` argument specifies the function that the thread will execute, and the `args` argument provides the arguments to that function. The `start()` method initiates the thread's execution, and the `join()` method waits for the thread to complete. It's crucial to use `join()` when you need the main thread to wait for worker threads to finish before proceeding.
Real-Life Use Case
Consider a web server handling multiple client requests. Instead of processing each request sequentially, the server can create a new thread for each request. This allows the server to handle multiple requests concurrently, improving responsiveness and overall performance. Another example is downloading multiple files simultaneously. Each download can be handled by a separate thread, speeding up the overall download process. GUI applications also often use threads to perform long-running tasks in the background, preventing the user interface from freezing.
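The download scenario can be sketched roughly as follows. This is only an illustration: the `fetch` and `download_all` helpers and the file names are hypothetical, and `time.sleep` stands in for real network I/O so the example runs without a network connection.

```python
import threading
import time

def fetch(name, delay, results):
    """Pretend to download `name`; time.sleep stands in for network I/O."""
    time.sleep(delay)
    # Each thread writes to its own key, so no two threads touch the same entry.
    results[name] = f"{name}: done"

def download_all(files, delay=0.2):
    """Start one thread per file and wait for all of them to finish."""
    results = {}
    threads = [
        threading.Thread(target=fetch, args=(name, delay, results))
        for name in files
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return results, elapsed

if __name__ == "__main__":
    results, elapsed = download_all(["a.txt", "b.txt", "c.txt"])
    print(results)
    print(f"Elapsed: {elapsed:.2f}s")
```

Because the simulated downloads wait at the same time, the total elapsed time should be close to one delay rather than the sum of all three.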
Best Practices
Use locks (`threading.Lock`) or other synchronization primitives to prevent race conditions.
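As a minimal sketch of protecting shared state with `threading.Lock`, the example below has several threads increment one counter. The `SafeCounter` class and `count_with_threads` function are hypothetical names made up for this illustration.

```python
import threading

class SafeCounter:
    """Hypothetical shared counter; every update happens while holding a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The with-statement acquires the lock and always releases it,
        # even if the body raises an exception.
        with self._lock:
            self.value += 1

def count_with_threads(num_threads=4, increments=10_000):
    """Run `num_threads` threads that each increment the counter `increments` times."""
    counter = SafeCounter()

    def work():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value

if __name__ == "__main__":
    print(count_with_threads())
```

Without the lock, `self.value += 1` is a read-modify-write sequence that two threads can interleave, so some increments could be lost.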
Interview Tip
When discussing threads in an interview, emphasize your understanding of concurrency, synchronization, and the limitations of the GIL. Be prepared to explain how you would handle race conditions and potential deadlocks. Provide concrete examples of when you would choose threads over processes, and vice-versa.
When to use them
Threads are well-suited for I/O-bound tasks where the threads spend most of their time waiting for external operations to complete (e.g., network requests, file I/O). They are less effective for CPU-bound tasks due to the GIL. In scenarios where shared memory access is frequent and efficient communication is needed, threads can be a good choice.
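One way to see the difference between I/O-bound and CPU-bound work is to time the same task run sequentially and in threads. The sketch below assumes `time.sleep` is a fair stand-in for I/O waits; `io_task`, `cpu_task`, and `timed` are hypothetical helpers, and the actual numbers depend on the machine and the Python build.

```python
import threading
import time

def io_task():
    """Simulated I/O wait; the thread is idle while sleeping."""
    time.sleep(0.1)

def cpu_task():
    """Pure-Python arithmetic that keeps the interpreter busy."""
    total = 0
    for i in range(200_000):
        total += i * i
    return total

def timed(task, count, threaded):
    """Return the seconds needed to run `task` `count` times."""
    start = time.perf_counter()
    if threaded:
        threads = [threading.Thread(target=task) for _ in range(count)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    else:
        for _ in range(count):
            task()
    return time.perf_counter() - start

if __name__ == "__main__":
    for task in (io_task, cpu_task):
        sequential = timed(task, 4, threaded=False)
        threaded = timed(task, 4, threaded=True)
        print(f"{task.__name__}: sequential {sequential:.2f}s, threaded {threaded:.2f}s")
```

On a standard CPython build with the GIL, the threaded `io_task` run is expected to finish in roughly the time of a single sleep, while the threaded `cpu_task` run typically shows little or no speedup.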
Memory Footprint
Threads have a smaller memory footprint compared to processes because they share the same memory space. This makes them more efficient in terms of memory usage when dealing with a large number of concurrent tasks. However, the shared memory space also requires careful management to avoid memory-related errors.
Alternatives
`multiprocessing`: Uses multiple processes instead of threads, bypassing the GIL. Suitable for CPU-bound tasks.
`asyncio`: Uses a single-threaded event loop to handle multiple concurrent tasks. Suitable for I/O-bound tasks and provides a more lightweight alternative to threads.
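For comparison, here is a rough `asyncio` counterpart to the five-worker thread example, shown as a sketch only. The coroutine names `worker` and `main` are chosen for this illustration, and `asyncio.sleep` stands in for real I/O. A `multiprocessing` version is not shown here.

```python
import asyncio

async def worker(num):
    """Coroutine version of the worker; awaiting yields control to the event loop."""
    await asyncio.sleep(0.1)  # Simulate some I/O-bound work
    return f"Worker {num} finished"

async def main():
    # gather() runs the coroutines concurrently on one thread and
    # returns their results in the order the coroutines were passed in.
    return await asyncio.gather(*(worker(i) for i in range(5)))

if __name__ == "__main__":
    for line in asyncio.run(main()):
        print(line)
```

All five workers wait at the same time on a single thread, so no locks are needed for state that is only touched between `await` points.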
Pros
Cons
FAQ
- What is the Global Interpreter Lock (GIL)?
  In CPython, the standard Python implementation, the GIL is a mutex that allows only one thread to hold control of the Python interpreter at any one time. This means that only one thread can execute Python bytecode at a time, even on multi-core processors. This limitation primarily affects CPU-bound tasks but has less impact on I/O-bound tasks.
- How do I prevent race conditions in multithreaded programs?
  Use synchronization primitives like locks (`threading.Lock`), semaphores (`threading.Semaphore`), and condition variables (`threading.Condition`) to protect shared resources from concurrent access. Ensure that only one thread can access a critical section of code at a time.
- What are daemon threads?
  Daemon threads are background threads that automatically terminate when the main program exits. They are useful for tasks that are not essential for the program to complete gracefully, such as logging or monitoring. To create a daemon thread, set the `daemon` attribute of the `threading.Thread` object to `True` before starting it.
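A daemon thread can be sketched as follows. The `monitor` and `start_monitor` functions are hypothetical, and the endless loop is only there to show that the program can still exit while the thread is running.

```python
import threading
import time

def monitor(interval=0.05):
    """Hypothetical background task: loops forever, so it never ends on its own."""
    while True:
        time.sleep(interval)

def start_monitor():
    """Create and start the monitor as a daemon thread."""
    # daemon=True must be set before start(); it cannot be changed afterwards.
    t = threading.Thread(target=monitor, daemon=True)
    t.start()
    return t

if __name__ == "__main__":
    t = start_monitor()
    print(f"daemon={t.daemon}, alive={t.is_alive()}")
    # No join() here: when the main thread ends, the interpreter exits
    # and the daemon thread is stopped along with it.
    print("Main thread exiting.")
```

If the same thread were created without `daemon=True`, the program would never exit, because the interpreter waits for all non-daemon threads to finish.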