What is Multithreading?
From proper rendering of GUI applications to online multiplayer video games, multithreading is everywhere, but what is it?
Multithreading is a programming technique that allows multiple threads to perform tasks over overlapping time periods. On systems with multiple CPU cores, those threads may also execute in parallel, allowing for different tasks to be completed simultaneously.
Going deeper, what is a thread? Just as a second is the base unit of time, a thread is the smallest unit of execution that the operating system can schedule. Each thread follows its own control flow through a program’s instructions.
Threads belong to a process (a running instance of an application or program), share that process's memory, and are scheduled onto CPU cores.
Most applications we build in introductory programming classes are single-threaded. When we type python script.py on the command line, the operating system starts a process with a single main thread to run that script. If we start another script or run another task, the OS scheduler can pause one thread and switch to another.
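To make the single-threaded case concrete, here is a minimal sketch using Python's standard threading module. The function name describe_current_thread is a hypothetical helper chosen for illustration.

```python
import threading


def describe_current_thread() -> str:
    """Return the name of the thread that is executing this function."""
    return threading.current_thread().name


# In a plain script with no extra threads, this code runs in the main thread.
print(describe_current_thread())
# Number of Thread objects currently alive in this process.
print(threading.active_count())
```

When the file is run directly as a script, the first line printed is normally MainThread, since no other threads have been started.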
However, when an application has several components that must make progress at the same time, multithreading becomes very useful. For example, in a browser, when we search for something on Google, one thread can handle the user interface so the user can still move the mouse and interact with the page, while another sends the query and waits for the search results. These threads work concurrently (and in parallel on multi-core systems), so the user's mouse movement and the backend query don't block each other.
In practice, we don’t assign individual lines of code to a thread. It doesn’t make sense to tell a thread to “run lines 100 through 132”, so we put those lines in a standalone function and have the thread run that function. Multithreading lets us run multiple functions concurrently.
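The idea of handing a function to a thread can be sketched in Python roughly as follows. This is only an illustration loosely modeled on the browser example: fetch_results, track_mouse, and run_both are hypothetical names, and time.sleep stands in for real network and UI work.

```python
import threading
import time


def fetch_results(query: str, results: list) -> None:
    """Simulate a slow backend query (hypothetical stand-in for a network call)."""
    time.sleep(0.2)
    results.append(f"results for {query!r}")


def track_mouse(events: list) -> None:
    """Simulate handling a few quick user-interface events."""
    for i in range(3):
        time.sleep(0.05)
        events.append(f"mouse event {i}")


def run_both(query: str):
    """Run the two functions in separate threads and wait for both to finish."""
    results, events = [], []
    query_thread = threading.Thread(target=fetch_results, args=(query, results))
    ui_thread = threading.Thread(target=track_mouse, args=(events,))

    # start() begins running each target function in its own thread.
    query_thread.start()
    ui_thread.start()

    # join() blocks until the corresponding thread has finished.
    query_thread.join()
    ui_thread.join()
    return results, events
```

Calling run_both("multithreading") returns once both threads have finished. Because the two functions wait at the same time rather than one after the other, the call should take roughly as long as the slower function, not the sum of both.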
Coordinating threads isn’t always simple, though. If two threads access the same data at the same time and at least one of them modifies it, you can get a race condition: the result depends on the exact timing of the threads, which can lead to bugs and corrupted data. One standard way to prevent race conditions is with locks, but these are topics for another time.
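As a brief preview of the lock idea mentioned above, here is a minimal sketch in Python. It assumes a shared counter stored in a dictionary. The names increment_many and run_counter are hypothetical, and this is one simple way to guard shared data, not the only one.

```python
import threading


def increment_many(counter: dict, lock, times: int) -> None:
    """Add 1 to the shared counter `times` times, holding the lock for each update."""
    for _ in range(times):
        # Only one thread at a time can be inside this block, so the
        # read-modify-write on counter["value"] cannot interleave.
        with lock:
            counter["value"] += 1


def run_counter(num_threads: int = 4, times: int = 10_000) -> int:
    """Start several threads that all update one counter, then return its final value."""
    counter = {"value": 0}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=increment_many, args=(counter, lock, times))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]
```

With the lock in place, run_counter() returns num_threads * times. Without it, updates from different threads could overlap and some increments might be lost.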
Practice problems:
We’ve talked about how multithreading applications can work in the context of browser applications. Now imagine a multiplayer game server hosting arenas with multiple players. Which tasks could run in separate threads? For example: player input processing, physics updates, matchmaking, and networking.
Going a step further, how might multithreading apply to training a large language model (LLM) on a GPU with many cores? What parts of the LLM could we allocate to individual threads?
Write a Python program that attempts to perform a CPU-bound task using multiple threads. Observe how CPython's GIL (Global Interpreter Lock) prevents the threads from executing Python bytecode in parallel.
OpenMP (Open Multi-Processing) is a popular API for shared-memory parallel programming in C, C++, and Fortran. Write a C++ function to compute the sum of the first N natural numbers. Measure runtime with and without OpenMP parallelization.
Solutions to previous problems: https://github.com/jonesy346/substack-infra-problems
I'd love any feedback you have — was this clear and useful?




