Multithreading in Python
In this tutorial, we’ll show you how to achieve parallelism in your code by using multithreading techniques in Python.
“Parallelism” and “multithreading” — what do these terms mean, and how do they relate? We’ll answer all your questions in this tutorial, including the following:
- What’s concurrency?
- What’s the difference between concurrency and parallelism?
- What’s the difference between processes and threads?
- What’s multithreading in Python?
I assume you know Python's fundamentals, including basic structures and functions.
What Is Concurrency?
Before we jump into the multithreading details, let’s compare concurrent programming to sequential programming.
In sequential computing, the components of a program are executed step-by-step to produce correct results; however, in concurrent computing, different program components are independent or semi-independent. Therefore, a processor can run components in different states independently and simultaneously. The main advantage of concurrency is in improving a program’s runtime, which means that since the processor runs the independent tasks simultaneously, less time is necessary for the processor to run the entire program and accomplish the main task.
What’s the Difference Between Concurrency and Parallelism?
Concurrency is a condition wherein two or more tasks can be initiated and completed in overlapping periods on a single processor and core. Parallelism is a condition wherein multiple tasks or distributed parts of a task run independently and simultaneously on multiple processors. So, parallelism isn’t possible on machines with a single processor and a single core.
Imagine two queues of customers; concurrency means a single cashier serves customers by switching between two queues. Parallelism means two cashiers simultaneously serve the two queues of customers.
What’s the Difference Between Processes and Threads?
A process is a program in execution with its own address space, memory, data stack, etc. The operating system allocates resources to the processes and manages the processes’ execution by assigning the CPU time to the different executing processes, according to any scheduling strategy.
Threads are similar to processes. However, they execute within the same process and share the same context. Therefore, sharing information or communicating with the other threads is more accessible than if they were separate processes.
Multithreading in Python
Python virtual machine is not a thread-safe interpreter, meaning that the interpreter can execute only one thread at any given moment. This limitation is enforced by the Python Global Interpreter Lock (GIL), which essentially limits one Python thread to run at a time. In other words, GIL ensures that only one thread runs within the same process at the same time on a single processor.
Basically, threading may not speed up all tasks. The I/O-bound tasks that spend much of their time waiting for external events have a better chance of taking advantage of threading than CPU-bound tasks.
NOTE
Python comes with two built-in modules for implementing multithreading programs, including the thread
, and threading
modules. The thread
and threading
modules provide useful features for creating and managing threads. However, in this tutorial, we'll focus on the threading
module, which is a much-improved, high-level module for implementing serious multithreading programs. Also, Python provides the Queue
module, allowing us to create a queue data structure to exchange information across multiple threads safely.