Exploring Parallel Tracks (Threading and Asyncio Engines)
Context Introduction
As you move beyond the basics of Python scripting, you'll encounter situations where your scripts need to do multiple things at once. Imagine downloading several files from a server, processing multiple log files simultaneously, or handling many network connections at the same time. This is where concurrent execution comes into play.
Python offers two main engines for running tasks concurrently: Threading and Asyncio. Both let a script make progress on several tasks during the same period, but they work in fundamentally different ways. Understanding when to use each will make your scripts faster, more efficient, and more responsive.
What is Threading?
Threading allows your script to work on multiple operations concurrently by splitting work into separate threads. Each thread runs independently but shares the same memory space.
Key characteristics:
- Threads are scheduled by the operating system, but in standard CPython the Global Interpreter Lock (GIL) lets only one thread execute Python bytecode at a time, so CPU-bound tasks gain little
- Best suited for I/O-bound tasks like reading files, making network requests, or waiting for database queries, because a thread that is blocked on I/O releases the GIL
- Threads can share data easily, but you need to be careful with race conditions
- Threading is built into Python's standard library with the threading module

When to use threading:
- Downloading multiple files from the internet
- Reading and writing many small files at once
- Handling multiple user connections in a server
- Making several API calls simultaneously
What is Asyncio?
Asyncio is a newer approach that uses a single thread to manage multiple tasks by switching between them when they are waiting. Think of it as a chef who starts cooking one dish, then while it simmers, starts chopping vegetables for another dish, then returns to the first dish.
Key characteristics:
- Runs everything on a single thread using cooperative multitasking
- Tasks voluntarily yield control when they are waiting (I/O operations)
- Extremely efficient for handling thousands of concurrent connections
- Uses the asyncio module and the async/await syntax

When to use asyncio:
- Building web servers or API clients
- Handling many network connections simultaneously
- Running tasks that spend most of their time waiting
- Creating responsive command-line tools
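The chef analogy can be made concrete with a small sketch. The names `cook`, `events`, and the dish names are made up for illustration, and `asyncio.sleep` stands in for real waiting such as a network call. The point is only to show the order in which a single thread moves between two coroutines.

```python
import asyncio

events = []  # records the order in which steps happen


async def cook(dish, simmer_seconds):
    events.append(f"start {dish}")
    # await hands control back to the event loop while this dish "simmers"
    await asyncio.sleep(simmer_seconds)
    events.append(f"finish {dish}")


async def main():
    # Both coroutines run on one thread; the loop switches at each await
    await asyncio.gather(cook("soup", 0.2), cook("salad", 0.1))


asyncio.run(main())
print(events)
```

The salad is started second but finishes first, because the soup coroutine gave up control while it was waiting.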
Core Differences Between Threading and Asyncio
| Feature | Threading | Asyncio |
|---|---|---|
| Execution model | Multiple OS threads, switched preemptively (one runs Python bytecode at a time under the GIL) | Single thread with cooperative switching |
| Best for | I/O-bound tasks with blocking operations | I/O-bound tasks with many concurrent operations |
| CPU-bound tasks | Limited by GIL (not ideal) | Not suitable (single thread) |
| Complexity | Moderate (need locks and synchronization) | Fewer synchronization concerns (tasks switch only at await points) |
| Memory usage | Higher (each thread has its own stack) | Lower (tasks are lightweight objects on one thread) |
| Learning curve | Easier to understand initially | Requires understanding async/await pattern |
| Library support | Works with most Python libraries | Requires async-compatible libraries |
Simple Example: Threading in Action
Imagine you need to check if five different servers are online. With threading, you can check all five at once instead of waiting for each one sequentially.
How it works:
- You create a Thread object for each server check
- Each thread runs the check function independently
- The main script waits for all threads to finish using join()
- Results are collected in a shared list

Basic pattern:
- Import the threading module
- Define a function that does the work
- Create thread objects with Thread(target=your_function, args=(...))
- Start each thread with start()
- Wait for completion with join()
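The pattern above might look like the following sketch. The server names and the `check_server` function are hypothetical, and the check is simulated with `time.sleep` so the script runs without a network. In a real script the sleep would be replaced by an actual connection attempt.

```python
import threading
import time

# Hypothetical server names; the check is simulated so the sketch runs offline
SERVERS = ["alpha", "beta", "gamma", "delta", "epsilon"]

results = []                     # shared list for the results
results_lock = threading.Lock()  # guards writes to the shared list


def check_server(name):
    time.sleep(0.1)              # stands in for a blocking network call
    online = name != "delta"     # pretend one server is down
    with results_lock:
        results.append((name, online))


threads = [threading.Thread(target=check_server, args=(name,)) for name in SERVERS]
for t in threads:
    t.start()
for t in threads:
    t.join()                     # wait until every check has finished

for name, online in sorted(results):
    print(f"{name}: {'online' if online else 'offline'}")
```

The five checks overlap, so the whole run takes roughly as long as one check rather than five. Threads may finish in any order, which is why the results are sorted before printing.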
Simple Example: Asyncio in Action
Now consider the same server-checking task using asyncio. Instead of creating threads, you create coroutines that can pause and resume.
How it works:
- You define an async function (coroutine) for each server check
- You gather all coroutines using asyncio.gather()
- The event loop switches between coroutines when they are waiting
- Everything runs on a single thread

Basic pattern:
- Import the asyncio module
- Define an async function with async def
- Use await inside the function for I/O operations
- Run the main coroutine with asyncio.run()
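Here is a sketch of the same server check written with asyncio. As before, the server names and `check_server` are hypothetical, and `asyncio.sleep` simulates the wait for a response. A real version would need an async-compatible network library.

```python
import asyncio

# Hypothetical server names; the check is simulated so the sketch runs offline
SERVERS = ["alpha", "beta", "gamma", "delta", "epsilon"]


async def check_server(name):
    await asyncio.sleep(0.1)      # stands in for awaiting a network response
    return name, name != "delta"  # pretend one server is down


async def check_all():
    # gather runs the coroutines concurrently and returns results in input order
    return await asyncio.gather(*(check_server(name) for name in SERVERS))


results = asyncio.run(check_all())
for name, online in results:
    print(f"{name}: {'online' if online else 'offline'}")
```

No shared list or lock is needed here, because each coroutine returns its value and `asyncio.gather()` collects them.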
Choosing the Right Engine
Choose Threading when:
- You are working with existing libraries that are not async-compatible
- You need to run blocking operations like file I/O or database queries
- Your tasks are relatively few (tens to hundreds)
- You want simpler error handling and debugging

Choose Asyncio when:
- You are building applications with many concurrent connections (hundreds to thousands)
- You are using async-compatible libraries like aiohttp or asyncpg
- You want better performance with lower memory overhead
- You are writing new code from scratch and can design for async
Common Pitfalls to Avoid
With Threading:
- Forgetting to use locks when multiple threads modify shared data
- Creating too many threads (each thread consumes memory)
- Assuming threads speed up CPU-heavy calculations
- Not handling exceptions within threads properly
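The first pitfall is easiest to see with a shared counter. The following is a minimal sketch with illustrative names (`counter`, `add_many`). The lock makes each increment an indivisible step, so no update is lost.

```python
import threading

counter = 0
counter_lock = threading.Lock()


def add_many(times):
    global counter
    for _ in range(times):
        # counter += 1 is a read-modify-write; the lock keeps two threads
        # from interleaving in the middle of it
        with counter_lock:
            counter += 1


threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```

Without the lock, the final value is not guaranteed to be 40000. Whether lost updates actually show up depends on the Python version and timing, which is what makes this kind of bug hard to track down.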
With Asyncio:
- Using blocking functions inside async code (this blocks the entire event loop)
- Forgetting to use await when calling async functions
- Mixing threading and asyncio without understanding the implications
- Creating too many coroutines without proper resource management
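One way to avoid the first asyncio pitfall is to hand the blocking call to a worker thread. The sketch below assumes Python 3.9 or later, where `asyncio.to_thread` is available. The functions `blocking_report` and `heartbeat` are hypothetical stand-ins for a blocking library call and for other work the event loop should keep doing.

```python
import asyncio
import time


def blocking_report():
    # Hypothetical blocking function, e.g. a library call with no async version
    time.sleep(0.2)
    return "report ready"


async def heartbeat(ticks):
    # Keeps running only if the event loop is not blocked
    for _ in range(3):
        await asyncio.sleep(0.05)
        ticks.append("tick")


async def main():
    ticks = []
    # asyncio.to_thread runs the blocking call in a worker thread,
    # so the event loop stays free to run heartbeat() meanwhile
    report, _ = await asyncio.gather(
        asyncio.to_thread(blocking_report),
        heartbeat(ticks),
    )
    return report, ticks


report, ticks = asyncio.run(main())
print(report, ticks)
```

Calling `time.sleep(0.2)` directly inside a coroutine would instead freeze the loop, and the heartbeat could not tick until the sleep was over.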
Practical Advice for Engineers
Start with threading if you are new to concurrent programming. It is more intuitive and works with almost any Python library you already know. Once you are comfortable with threading, explore asyncio for scenarios where you need to handle many concurrent operations efficiently.
A good rule of thumb: if your script spends most of its time waiting (network calls, disk I/O, API requests), both threading and asyncio can help. If you need to handle thousands of connections, asyncio is the better choice. If you are working with existing blocking libraries, threading is more practical.
Remember that neither threading nor asyncio will speed up pure CPU calculations in Python. For CPU-bound tasks, you would need to explore multiprocessing, which is a separate topic beyond this curriculum.
Next Steps
- Practice converting a sequential script that makes multiple API calls into a threaded version
- Try rewriting the same script using asyncio with an async HTTP library
- Experiment with mixing both approaches in a single script (advanced)
- Explore the concurrent.futures module for a higher-level threading interface
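As a starting point for the last item, here is a minimal sketch using `ThreadPoolExecutor` from `concurrent.futures`. The `fetch` function and its item IDs are hypothetical, and `time.sleep` simulates a blocking request.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(item_id):
    time.sleep(0.1)  # stands in for a blocking request
    return f"item-{item_id}"


# The pool creates, reuses, and joins the threads for you
with ThreadPoolExecutor(max_workers=3) as pool:
    # map returns results in the same order as the inputs
    results = list(pool.map(fetch, range(5)))

print(results)  # ['item-0', 'item-1', 'item-2', 'item-3', 'item-4']
```

Compared with creating Thread objects by hand, the pool caps the number of threads and passes return values back directly. An exception raised inside `fetch` is re-raised in the main thread when that result is consumed.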
Both threading and asyncio are powerful tools in your Python toolkit. Mastering them will allow you to write scripts that are faster, more responsive, and better suited for real-world engineering challenges.
This section shows two ways to run multiple tasks concurrently in Python: threading for I/O-bound work and asyncio for cooperative multitasking.
Example 1: Running two functions with threading
This example shows how to run two separate functions at the same time using threads.

```python
import threading
import time


def task_one():
    print("Task 1 started")
    time.sleep(2)  # simulates waiting on I/O
    print("Task 1 finished")


def task_two():
    print("Task 2 started")
    time.sleep(2)
    print("Task 2 finished")


thread1 = threading.Thread(target=task_one)
thread2 = threading.Thread(target=task_two)

# Start both threads so their waits overlap
thread1.start()
thread2.start()

# Wait for both threads to finish before continuing
thread1.join()
thread2.join()
print("Both tasks done")
```

Output (the two "finished" lines may appear in either order):

```text
Task 1 started
Task 2 started
(2 second pause)
Task 1 finished
Task 2 finished
Both tasks done
```
Example 2: Waiting for threads to finish with join
This example shows how join() makes the main program wait until the thread completes.

```python
import threading
import time


def slow_calculation():
    print("Calculating...")
    time.sleep(3)  # simulates a slow operation
    print("Calculation complete")


worker = threading.Thread(target=slow_calculation)
worker.start()
print("Main program continues while thread runs")

worker.join()  # blocks here until slow_calculation() returns
print("Main program waited for thread to finish")
```

Output (typical; the first two lines come from different threads, so their order is not guaranteed):

```text
Calculating...
Main program continues while thread runs
(3 second pause)
Calculation complete
Main program waited for thread to finish
```
Example 3: Basic asyncio with async and await
This example shows how to define a coroutine, await it, and use its return value.

```python
import asyncio


async def fetch_data():
    print("Fetching data...")
    await asyncio.sleep(2)  # yields to the event loop while waiting
    print("Data received")
    return "result"


async def main():
    print("Starting async work")
    data = await fetch_data()  # main() pauses here until fetch_data() returns
    print(f"Got: {data}")


asyncio.run(main())
```

Output:

```text
Starting async work
Fetching data...
(2 second pause)
Data received
Got: result
```
Example 4: Running multiple asyncio tasks concurrently
This example shows how to run several async tasks at the same time using gather.

```python
import asyncio


async def download_file(file_id):
    print(f"Downloading file {file_id}")
    await asyncio.sleep(1)  # simulates the download time
    print(f"File {file_id} downloaded")
    return f"file_{file_id}.txt"


async def main():
    # gather runs the three coroutines concurrently and
    # returns their results in the order they were passed
    results = await asyncio.gather(
        download_file(1),
        download_file(2),
        download_file(3),
    )
    print(f"All files: {results}")


asyncio.run(main())
```

Output:

```text
Downloading file 1
Downloading file 2
Downloading file 3
(1 second pause)
File 1 downloaded
File 2 downloaded
File 3 downloaded
All files: ['file_1.txt', 'file_2.txt', 'file_3.txt']
```
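When the number of downloads grows, starting all of them at once can overwhelm a server or exhaust local resources. One common way to cap concurrency is `asyncio.Semaphore`. The sketch below is a variation on Example 4. The `limit` and `state` parameters are added for illustration, and `state` exists only to record how many downloads were active at the same time.

```python
import asyncio


async def download_file(file_id, limit, state):
    async with limit:  # at most two coroutines are inside this block at once
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.05)  # simulates the download time
        state["active"] -= 1
    return f"file_{file_id}.txt"


async def main():
    limit = asyncio.Semaphore(2)
    state = {"active": 0, "peak": 0}
    results = await asyncio.gather(
        *(download_file(i, limit, state) for i in range(1, 7))
    )
    return results, state["peak"]


results, peak = asyncio.run(main())
print(results)
print(f"Most downloads active at once: {peak}")
```

All six coroutines are created up front, but only two at a time get past the semaphore. The rest wait their turn without blocking the event loop.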
Example 5: Comparing threading and asyncio for network requests
This example shows both approaches for making multiple web requests. The requests are simulated with sleep calls so the script runs without a network connection.

```python
import threading
import asyncio
import time


# Threading version
def fetch_url_thread(url_id):
    print(f"Thread fetching URL {url_id}")
    time.sleep(2)  # blocking wait; other threads keep running
    print(f"Thread got URL {url_id}")


def run_threads():
    threads = []
    for i in range(3):
        t = threading.Thread(target=fetch_url_thread, args=(i,))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()


# Asyncio version
async def fetch_url_async(url_id):
    print(f"Async fetching URL {url_id}")
    await asyncio.sleep(2)  # non-blocking wait; the event loop runs other tasks
    print(f"Async got URL {url_id}")


async def run_async():
    tasks = [fetch_url_async(i) for i in range(3)]
    await asyncio.gather(*tasks)


print("Threading approach:")
run_threads()
print("\nAsyncio approach:")
asyncio.run(run_async())
```

Output (in the threading half, the order of the "got" lines may vary between runs):

```text
Threading approach:
Thread fetching URL 0
Thread fetching URL 1
Thread fetching URL 2
(2 second pause)
Thread got URL 0
Thread got URL 1
Thread got URL 2

Asyncio approach:
Async fetching URL 0
Async fetching URL 1
Async fetching URL 2
(2 second pause)
Async got URL 0
Async got URL 1
Async got URL 2
```
Comparison Table: Threading vs Asyncio
| Feature | Threading | Asyncio |
|---|---|---|
| Best for | I/O-bound tasks (file reads, network) | I/O-bound tasks with many connections |
| How it runs | OS-managed threads, switched preemptively (one runs Python bytecode at a time under the GIL) | Single-threaded cooperative multitasking |
| Overhead | Higher (each thread uses system resources) | Lower (runs in one thread) |
| Complexity | Simpler to understand | Requires async/await syntax |
| CPU-bound work | Limited by the GIL (little or no speedup) | Not suitable (single thread) |
| Common use case | Web scraping, file processing | Web servers, API clients |