What are Threads in Python?
Threads in Python are separate flows of execution. Typically, a simple Python program is single-threaded. This means everything happens in a sequence one after the other. Your program will execute line-by-line and only after one line has finished will the next one begin.
Adding threads lets several flows of execution make progress at the same time. With two threads, it’s like having two programs running together, each working through its own lines of code one after the other.
How to Spawn a New Thread in Python
Creating (or spawning) a new thread is quite simple thanks to the built-in threading library.
First, at the top of your Python file, import the Thread class, along with time, which the example function uses for its delay:
from threading import Thread
import time
Next, create a function which you’d like to run on a secondary thread:
# Prints the given variable 3 times, with a 1 second delay between each print.
def repeatedlyPrint(variable):
    for i in range(3):
        print(variable)
        time.sleep(1)
Finally, call the function on a new thread by first creating an instance of the Thread class, then calling its start() method:
# Start a new thread.
newThread = Thread(target=lambda: repeatedlyPrint("New thread"))
newThread.start()
Here’s a full example of implementing threading, putting the above code into clear context:
from threading import Thread
import time

# Prints the given variable 3 times, with a 1 second delay between each print.
def repeatedlyPrint(variable):
    for i in range(3):
        print(variable)
        time.sleep(1)

# Start a new thread.
newThread = Thread(target=lambda: repeatedlyPrint("New thread"))
newThread.start()

# Offset the main thread by half a second so the prints alternate.
time.sleep(0.5)

# Run the same function on the main thread.
repeatedlyPrint("Main thread")
This will output:
New thread
Main thread
New thread
Main thread
New thread
Main thread
Note: You have to start the new thread before calling repeatedlyPrint on the main thread. If the main thread ran the function first, it would be busy until that call finished, and the new thread would only start afterwards.
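As a small variation on the example above, here’s a sketch that passes arguments with Thread’s args and kwargs parameters instead of a lambda, and calls join() so the main thread waits for the new thread before carrying on. The repeatedly_print name, its count and delay parameters, and the shorter delays are my own assumptions for illustration.

```python
from threading import Thread
import time

# Hypothetical variant of repeatedlyPrint with a configurable count and delay.
def repeatedly_print(variable, count=3, delay=0.1):
    for _ in range(count):
        print(variable)
        time.sleep(delay)

# args and kwargs are forwarded to the target when the thread runs.
worker = Thread(target=repeatedly_print, args=("New thread",), kwargs={"delay": 0.05})
worker.start()

# The main thread does its own work while the worker runs.
repeatedly_print("Main thread", delay=0.05)

# join() blocks until the worker's function has returned.
worker.join()
print("Both threads have finished")
```

Without the join() call the script would still wait for the worker before exiting, because the thread is not a daemon, but any code placed after it could run while the worker is still printing.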
How to Return Values From a Python Thread
Now that we’ve seen how threads work, we can talk about returning data from them. To do this, we’ll utilise some simple inheritance to extend the Thread class. We’re doing this so that when the target function completes, its return value is saved as an attribute of the Thread instance itself. We can check this attribute from the main thread at any time to access the returned value.
class ReturnableThread(Thread):
    # This class is a subclass of Thread that allows the thread to return a value.
    def __init__(self, target):
        Thread.__init__(self)
        self.target = target
        self.result = None

    def run(self) -> None:
        self.result = self.target()
First, the class is declared as ReturnableThread and set to be an extension of the existing Thread class.
Next, in the __init__ method, we still accept a target as the parent class expects, but then we also create a new attribute called result.
Finally, we override the parent’s run method to still run the given function, but we assign the function’s return value to the result attribute.
Here’s a complete example:
from threading import Thread
import time
import random

class ReturnableThread(Thread):
    # This class is a subclass of Thread that allows the thread to return a value.
    def __init__(self, target):
        Thread.__init__(self)
        self.target = target
        self.result = None

    def run(self) -> None:
        self.result = self.target()

# Returns a random number after a 1 second delay
def randomNumber() -> int:
    time.sleep(1)
    return random.randint(0, 100)

# Start a new thread
newThread = ReturnableThread(target=lambda: randomNumber())
newThread.start()

# Wait for the thread to finish (this polling assumes the target never returns None)
while newThread.result is None:
    time.sleep(0.1)

print(newThread.result)
- The randomNumber function waits for 1 second, then returns a random integer between 0 and 100 (inclusive).
- We spawn a new thread using our new ReturnableThread class and then call its start method.
- In the while loop, we check whether the result attribute is still None. While it is, we simply wait another 0.1 seconds and check again.
- Once the result attribute is no longer None, we know the thread has finished executing and we can then print the value it’s provided, thus returning a value from a different thread.
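One thing to watch with the polling loop: if the target function legitimately returns None, the loop never ends. A possible alternative, sketched below, is to wait with Thread’s own join() method, which blocks until run() has returned regardless of the value. The returns_nothing function is a made-up example, and the subclass is repeated so the block runs on its own.

```python
from threading import Thread
import time

class ReturnableThread(Thread):
    # Same subclass as above: stores the target's return value on the instance.
    def __init__(self, target):
        super().__init__()
        self.target = target
        self.result = None

    def run(self) -> None:
        self.result = self.target()

# Hypothetical target whose return value really is None.
def returns_nothing() -> None:
    time.sleep(0.2)
    return None

worker = ReturnableThread(target=returns_nothing)
worker.start()

# join() waits for the thread to finish, whatever the target returned.
worker.join()

print(worker.is_alive())  # False
print(worker.result)      # None
```

Because join() does not depend on the result attribute, it also avoids waking up every 0.1 seconds just to check it.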
Using this approach you could spawn as many threads as you like by using loops and lists. Here’s an example that does the same as the above, but for 10 threads at a time:
# Create a list to store all the threads
allThreads = []

# Create 10 threads running the randomNumber function
for i in range(10):
    newThread = ReturnableThread(target=lambda: randomNumber())
    newThread.start()
    allThreads.append(newThread)

# Wait for all the threads to finish
for thread in allThreads:
    while thread.result is None:
        time.sleep(0.1)

# Print the results
for thread in allThreads:
    print(thread.result)
Note: This uses the ReturnableThread subclass and the imports from the previous example. I left them out of this code block to avoid repetition, but they are required for it to work.
When run, this will output 10 random numbers, each on a new line.
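If you’d rather not maintain the subclass yourself, the standard library’s concurrent.futures module offers a different route to the same outcome. This is a swapped-in technique rather than the approach above: a rough sketch in which ThreadPoolExecutor runs the function on worker threads and each Future hands back the return value. The random_number name and the shortened delay are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import random
import time

# Hypothetical stand-in for randomNumber, with a shorter delay.
def random_number() -> int:
    time.sleep(0.1)
    return random.randint(0, 100)

# The pool starts up to 10 worker threads and shuts them down on exit.
with ThreadPoolExecutor(max_workers=10) as pool:
    futures = [pool.submit(random_number) for _ in range(10)]
    # result() blocks until that particular call has finished.
    results = [future.result() for future in futures]

for value in results:
    print(value)
```

A Future also re-raises any exception from the worker thread when result() is called, which the plain result attribute does not capture.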
Applications of Returning Values from Python Threads
Threads are very useful for I/O-bound operations. These are pieces of code that are sluggish because they spend most of their time waiting on inputs or outputs. Take, for example, web scraping. Web scraping bots can be sluggish if they’re fetching hundreds of web pages, as each one might take a couple of seconds to load. While this process doesn’t take a huge amount of computing power, you’re left waiting for the data to load in.
Using threads, you could fetch each page on its own thread, and then return the data to the main thread afterwards. This allows you to fetch multiple pages at a time, reducing the total time you spend waiting for incoming/outgoing data.
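To make the time saving concrete, here’s a sketch that compares fetching five pages one after another with fetching each on its own thread. No real network requests are made: fake_fetch is a made-up stand-in that sleeps for 0.2 seconds, and the example.com URLs are placeholders. The subclass is repeated so the block is self-contained.

```python
from threading import Thread
import time

class ReturnableThread(Thread):
    # Same idea as the subclass above: keep the target's return value.
    def __init__(self, target):
        super().__init__()
        self.target = target
        self.result = None

    def run(self) -> None:
        self.result = self.target()

# Hypothetical stand-in for a slow page download.
def fake_fetch(url: str) -> str:
    time.sleep(0.2)
    return f"contents of {url}"

urls = [f"https://example.com/page/{n}" for n in range(5)]

# Sequential: each fetch waits for the previous one.
start = time.perf_counter()
sequential = [fake_fetch(url) for url in urls]
sequential_seconds = time.perf_counter() - start

# Threaded: all fetches wait at the same time.
start = time.perf_counter()
threads = []
for url in urls:
    # url=url binds the current value; a bare lambda would look up url
    # only when it runs, by which time the loop may have moved on.
    thread = ReturnableThread(target=lambda url=url: fake_fetch(url))
    thread.start()
    threads.append(thread)
for thread in threads:
    thread.join()
threaded = [thread.result for thread in threads]
threaded_seconds = time.perf_counter() - start

print(f"Sequential: {sequential_seconds:.2f}s, threaded: {threaded_seconds:.2f}s")
```

The sequential run takes at least a second (five waits of 0.2 seconds each), while the threaded run should finish in roughly the time of a single wait.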
Threads are not useful if you’re limited by computation rather than by waiting. In standard CPython, the global interpreter lock (GIL) lets only one thread execute Python bytecode at a time, so spawning another thread won’t speed up calculations. In that case, you need to look at multiprocessing. While it seems similar on the surface, multiprocessing runs separate processes that can use several CPU cores at once, whereas threading does not.
Note: When spawning large numbers of threads, it’s important to manage them appropriately. Creating batches of threads and waiting for each batch to finish before spawning more is my personal approach. This avoids overwhelming systems with hundreds or thousands of threads at once.
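One way the batching idea might look in code is sketched below. The run_in_batches helper, its parameters, and the make_task example are all assumptions of mine: it starts at most batch_size threads, joins them, and only then moves on to the next batch.

```python
from threading import Thread
import time

# Hypothetical helper: runs each function on its own thread, batch_size at a time.
def run_in_batches(functions, batch_size):
    results = [None] * len(functions)

    def call_and_store(index, function):
        # Each thread writes to its own slot, so results keep their order.
        results[index] = function()

    for start in range(0, len(functions), batch_size):
        batch = [
            Thread(target=call_and_store, args=(index, function))
            for index, function in enumerate(functions[start:start + batch_size], start=start)
        ]
        for thread in batch:
            thread.start()
        # Wait for the whole batch before spawning the next one.
        for thread in batch:
            thread.join()
    return results

# Hypothetical task factory: each task sleeps briefly, then returns n squared.
def make_task(n):
    def task():
        time.sleep(0.05)
        return n * n
    return task

tasks = [make_task(n) for n in range(7)]
results = run_in_batches(tasks, batch_size=3)
print(results)  # [0, 1, 4, 9, 16, 25, 36]
```

With 7 tasks and a batch size of 3, no more than 3 worker threads exist at any one time.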
Conclusion
You should now be able to spawn a new thread for a function to run on and, when the function has completed, access its return value from your main thread. I hope this proved useful for you in speeding up your scripts and saving time, as well as helping you to level up your own Python skills.