In Python, the standard library gives us three options for adding parallel processing to our applications:

  • threading
  • multiprocessing
  • concurrent.futures

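The first option, threading, is useful for I/O and networking tasks, where most of the time is spent waiting. In the standard CPython build, the global interpreter lock (GIL) lets only one thread run Python code at a time, so threads do not spread CPU work across cores.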

The second option, multiprocessing, is used for CPU-intensive tasks, and the work is distributed across the cores of our machine. The disadvantage of multiprocessing is that the functions, arguments, and return values passed between processes must be serializable. To know if an object is serializable, you can try to pickle and unpickle it. As a rule of thumb, if that round trip goes without error, you can use the object with multiprocessing.
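The pickle round trip described above can be sketched as a small helper. The name is_picklable is hypothetical (it is not part of the standard library), and passing this check is only a rule of thumb, not a guarantee that an object will behave well in another process.

```python
import pickle


def is_picklable(obj):
    """Return True if obj survives a pickle/unpickle round trip.

    Hypothetical helper for illustration; not a standard-library function.
    """
    try:
        pickle.loads(pickle.dumps(obj))
        return True
    except Exception:
        # pickle raises different exception types depending on the object,
        # so this sketch catches broadly.
        return False


# Plain data structures usually pickle without trouble.
print(is_picklable({"a": [1, 2, 3]}))

# Lambdas cannot be pickled by the standard pickle module.
print(is_picklable(lambda x: x))
```

A function defined with def at the top level of a module normally passes this check, which is why worker functions for multiprocessing are usually written that way.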

The last option, concurrent.futures, offers an API to use threading and multiprocessing with the same interface. The interface, in my opinion, is cleaner, and you can start programming and decide later whether you want to use threading or multiprocessing as the backend.
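To illustrate the shared interface, here is a minimal sketch in which the executor class is passed in as a parameter. The names shout and run_all are hypothetical and exist only for this example.

```python
import concurrent.futures


def shout(text):
    """Toy task used for illustration: return the text in upper case."""
    return text.upper()


def run_all(executor_cls, words):
    """Run shout() on every word using the given executor class."""
    # The with block shuts the executor down when the work is finished.
    with executor_cls() as executor:
        futures = [executor.submit(shout, word) for word in words]
        # Collect results in the same order the tasks were submitted.
        return [future.result() for future in futures]


if __name__ == "__main__":
    print(run_all(concurrent.futures.ThreadPoolExecutor, ["hello", "world"]))
```

Passing concurrent.futures.ProcessPoolExecutor instead of ThreadPoolExecutor should be the only change needed to switch backends, provided the task function and its arguments are picklable and the call is made under the if __name__ == "__main__" guard.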

In this example, I will use the threading backend.

Let’s start. First, we import the modules.

import concurrent.futures
import time

Now let’s write a function that waits 5 seconds and returns the text passed as the argument.

def echo(text):
    time.sleep(5)
    return text

Now, let’s try:

%%time
print(echo('Hello') + ' ' + echo('world'))
Hello world
Wall time: 10 s

As expected, the command took 10 seconds to complete, because the two calls run one after the other.

Now we are going to apply concurrent.futures to our application. The difference is that we no longer call our function directly; instead, we pass the function itself and its arguments to the submit method of a concurrent.futures.Executor. Another critical difference is that submit returns immediately with a Future object, and if we want the actual return value, we must call its result() method.

If you did not understand the last paragraph, do not despair, it is easier to see with an example:

# First we must initialize the executor
executor = concurrent.futures.ThreadPoolExecutor()
# Now we can submit the tasks
example_task = executor.submit(echo, 'Hello')

Now we can see that what submit returns is not the function’s return value, but a Future object.

print(type(example_task))
<class 'concurrent.futures._base.Future'>

And if we want the return value, we use the result() method, which blocks until the task has finished.

print(example_task.result())
Hello
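To make the non-blocking behaviour of submit more visible, here is a small sketch that checks the Future before and after asking for the result. The function slow_echo and its shorter delay are assumptions made for this example; done() and result() are the standard Future methods.

```python
import concurrent.futures
import time


def slow_echo(text, delay=0.5):
    """Variant of echo with a shorter, configurable delay (illustration only)."""
    time.sleep(delay)
    return text


with concurrent.futures.ThreadPoolExecutor() as executor:
    future = executor.submit(slow_echo, "Hello")

    # submit() returns at once, so the task is still sleeping here.
    done_right_away = future.done()

    # result() blocks until the task has finished and returns its value.
    value = future.result()

    # Once result() has returned, the Future reports that it is done.
    done_after_result = future.done()

print(done_right_away, value, done_after_result)
```

The first done() call returns while the task is still running, which is what lets us submit several tasks before waiting for any of them.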

Now to print Hello world, we do:

%%time
task1 = executor.submit(echo, 'Hello')
task2 = executor.submit(echo, 'world')
print(task1.result() + ' ' + task2.result())
Hello world
Wall time: 5.01 s

We see that it took about 5 seconds instead of 10, because both tasks were waiting at the same time.
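The %%time magic only works in IPython or Jupyter. As a rough sketch of the same measurement in a plain script, the version below uses time.perf_counter and a with block so that the executor is shut down at the end. The function timed_hello and the delay parameter are hypothetical additions for this example.

```python
import concurrent.futures
import time


def echo(text, delay=5):
    """Wait for delay seconds, then return the text unchanged."""
    time.sleep(delay)
    return text


def timed_hello(delay):
    """Run two echo tasks concurrently and return (message, elapsed seconds)."""
    start = time.perf_counter()
    with concurrent.futures.ThreadPoolExecutor() as executor:
        # Both tasks are submitted before either result is requested,
        # so their waiting periods overlap.
        task1 = executor.submit(echo, "Hello", delay)
        task2 = executor.submit(echo, "world", delay)
        message = task1.result() + " " + task2.result()
    elapsed = time.perf_counter() - start
    return message, elapsed


if __name__ == "__main__":
    message, elapsed = timed_hello(delay=5)
    print(message)
    print(f"Elapsed: {elapsed:.2f} s")
```

The elapsed time should be close to a single delay rather than two, matching the wall time reported above.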