In Python, because of the global interpreter lock (GIL), it’s not trivial to make your code run on all available CPUs. You might have noticed that even when your program takes a long time to run, your computer seems to be under low load. Python’s answer to this is the multiprocessing (MP) package. For beginners it’s difficult to use and, in my opinion, not very well documented. But it can save your application a lot of runtime, especially when you’re working with a lot of data.
Half a year ago (at the beginning of 2018) I was facing the same issues. I thought, “Why is it so hard and unintuitive?” After a lot of research and trial and error I figured things out and decided to write a script that makes things easier in the future.
Today I want to share this little Python script. I tried to make it readable and easy to follow, but to use it, it’s enough to read this article and run the example I provide. Further functionality, like sharing variables between processes, will be covered in a separate blog post soon.
You can download the script from my GitHub repo >here< (e.g. click “Clone or download” and then “Download ZIP”). You can then extract it and copy it directly into your project directory. Alternatively, you can copy it into your Python “site-packages” folder to make it accessible in all of your Python projects.
The file you just downloaded lets you run a function of your choice in a loop, spawning a new process for each iteration. This makes it possible to use all cores of your CPU. To do so, we first import the multiprocessing_for_kids script. To measure performance in the example below, we also need the time module:
import multiprocessing_for_kids as mulki
import time
Example 1: Counting
In this example we want to count from 0 to 2 million. Sounds simple, right? Well, we want to do it as fast as possible. To do so we need to split the counting task into smaller subtasks. In this case we use 10 partitions, each counting 200,000 (= 200k) numbers:
iter_val=1: 0 to 200k
iter_val=2: 200k to 400k
iter_val=3: 400k to 600k
...
iter_val=10: 1800k to 2 million
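The boundaries above follow directly from iter_val. Here is a minimal sketch of that arithmetic; partition_bounds is a helper name made up for illustration and is not part of the downloaded script:

```python
def partition_bounds(iter_val, goal, steps):
    """Return (start, end) of the partition handled by the 1-based iter_val."""
    step_range = goal // steps            # e.g. 2 million / 10 = 200k
    start = (iter_val - 1) * step_range   # first number of this partition
    end = iter_val * step_range           # last number of this partition
    return start, end


GOAL = 2_000_000
STEPS = 10

# Print the boundaries of all 10 partitions
for iter_val in range(1, STEPS + 1):
    start, end = partition_bounds(iter_val, GOAL, STEPS)
    print(f"iter_val={iter_val}: {start} to {end}")
```

Each partition only needs its own iter_val plus the two static values (goal and steps), which is what makes the task easy to hand to separate processes.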
To do so I wrote a function that takes the current iterator iter_val as input, calculates the start and the end value (e.g. iter_val=2: FROM=200k and TO=400k) and executes the counting:
def countTo(iter_val, GOAL, STEPS, PRINT):
    t0 = time.time()
    # goal = 2 million, steps = 10 -> step_range = 200k
    STEP_RANGE = int(GOAL / STEPS)
    FROM = (iter_val - 1) * STEP_RANGE  # e.g. step 1: (1 - 1) * 200k = 0
    TO = iter_val * STEP_RANGE  # e.g. step 1: 1 * 200k = 200k
    # the actual counting:
    for i in range(FROM, TO + 1):  # Loop to TO + 1 to get to 200k instead of 199,999
        if PRINT:
            print(i)
    # Print the current job and the time it took to execute:
    print("Finished:", iter_val, " from ", FROM, " to ", TO, "in", round(time.time() - t0, 1), "s")
For Beginners: It’s important to understand how you can use the iteration value (iter_val) in your function. Maybe you can come up with a for loop that uses the function above to do the counting (GOAL = 2000000) without multiprocessing. This will help you understand further steps.
for i in range(1, 11):
    countTo(i, 2000000, 10, True)
Now let’s see how we count to 2 million in 10 steps using the Multiprocessing for Kids package:
mulki.doMultiprocessingLoop(countTo, range(1,11), False, 2000000, 10, True)
Yes, it only takes one line of code to use a given function in a loop and apply multiprocessing. Some people who have tried to work with MP might be excited!
doMultiprocessingLoop() is the most important function in the ‘mulki’ package. It takes your function (e.g. countTo), an iterator (e.g. range(1,11)), a boolean that says whether multiprocessing should stop as soon as your function returns a value (False here), and any number of static parameters that are handed over to your function (in this case 2000000, 10 and True). In our counting example you might want to increase the counting goal if your computer is faster than mine. This is especially important if you disable the output of the counted numbers (PRINT = False).
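For readers who want to see the general pattern using only the standard library, here is a rough sketch. It is not the implementation of doMultiprocessingLoop, whose internals are not shown in this post. The names parallel_loop and count_range are hypothetical, and the sketch uses a multiprocessing.Pool of workers instead of one process per iteration. It also leaves out the break-on-return flag.

```python
import multiprocessing as mp
import time


def count_range(iter_val, goal, steps):
    """Count through one partition and return how many numbers were visited."""
    step_range = goal // steps
    start = (iter_val - 1) * step_range
    end = iter_val * step_range
    visited = 0
    for _ in range(start, end + 1):  # + 1 so the partition ends on `end` itself
        visited += 1
    return visited


def parallel_loop(func, iterable, *static_args):
    """Call func(item, *static_args) for every item, spread over worker processes.

    Hypothetical stand-in for illustration only: it uses a pool of workers
    rather than one new process per iteration.
    """
    jobs = [(item, *static_args) for item in iterable]
    with mp.Pool() as pool:  # by default one worker per available CPU
        return pool.starmap(func, jobs)


if __name__ == "__main__":  # guard needed where child processes re-import this file
    t0 = time.time()
    counts = parallel_loop(count_range, range(1, 11), 2_000_000, 10)
    print("Partitions finished:", len(counts), "in", round(time.time() - t0, 2), "s")
```

The function handed to the pool has to live at module level so the worker processes can find it. The `if __name__ == "__main__":` guard keeps the workers from re-running the script when they import it.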
Okay, now we want to see the impact of multiprocessing over “normal” counting in a loop. I wrote another function called “example1” to do the comparison:
def example1(PRINT=False, WITHOUT_MP=False):
    print("Start...")
    t_start_mp = time.time()
    iterator = range(1, 11)  # 1,2,3,4,5,6,7,8,9,10
    GOAL = 2000000  # counting goal
    if not PRINT:  # if we don't print every number
        GOAL = GOAL * 1000  # we count to 2 billion instead of 2 million
    # With Multiprocessing:
    mulki.doMultiprocessingLoop(countTo, iterator, False, GOAL, len(iterator), PRINT)
    print("Finished counting with Multiprocessing in ", round(time.time()-t_start_mp, 1), "s")
    # Without Multiprocessing:
    if WITHOUT_MP:
        t_start_normal = time.time()
        countTo(1, GOAL, 1, PRINT)
        print("Finished counting without Multiprocessing in ", round(time.time()-t_start_normal, 1), "s")
Run the example with PRINT = False and WITHOUT_MP = True. On my machine the output looks like this:
Finished: 4 from 600000000 to 800000000 in 15.8 s
Finished: 2 from 200000000 to 400000000 in 15.8 s
Finished: 1 from 0 to 200000000 in 15.8 s
Finished: 3 from 400000000 to 600000000 in 15.8 s
Finished: 5 from 800000000 to 1000000000 in 15.4 s
Finished: 6 from 1000000000 to 1200000000 in 16.4 s
Finished: 8 from 1400000000 to 1600000000 in 17.0 s
Finished: 7 from 1200000000 to 1400000000 in 17.1 s
Finished: 9 from 1600000000 to 1800000000 in 8.0 s
Finished: 10 from 1800000000 to 2000000000 in 7.4 s
Finished counting with Multiprocessing in 39.7 s
Finished: 1 from 0 to 2000000000 in 64.5 s
Finished counting without Multiprocessing in 64.5 s
As we can see, on my MacBook (cpu_count = 4) multiprocessing saves about 25 seconds: the normal loop takes 62 % longer than the MP run (a speedup of roughly 1.6x). It is important to note that multiprocessing has some overhead, so the function you are using should take some time to run before you see any improvement from MP. In the example I provided, the fastest way would actually be to split the counting problem into only 4 parts (one for each CPU -> iterator = range(1,5)). This way I reached a runtime of 37.2 seconds with MP; the ‘normal’ loop took 78 % longer (66.3 s in that run).
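The percentages above compare the two runtimes as a ratio. Here is a small sketch of that arithmetic using the timings reported above. The helper names are made up for illustration, and the second helper adds the share of runtime saved, which is another common way to state the same result:

```python
def percent_longer(t_normal, t_mp):
    """How much longer the normal loop took, relative to the MP run, in percent."""
    return (t_normal / t_mp - 1) * 100


def percent_saved(t_normal, t_mp):
    """Share of the normal runtime that the MP run saved, in percent."""
    return (1 - t_mp / t_normal) * 100


# (runtime without MP, runtime with MP) in seconds, taken from the runs above
runs = {
    "10 partitions": (64.5, 39.7),
    "4 partitions": (66.3, 37.2),
}

for name, (t_normal, t_mp) in runs.items():
    print(name,
          "| normal loop takes", round(percent_longer(t_normal, t_mp)), "% longer",
          "| MP saves", round(percent_saved(t_normal, t_mp)), "% of the runtime")
```

Both views describe the same measurement: a normal loop that takes 62 % longer corresponds to roughly 38 % less runtime with MP.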
Another important thing to notice in the output is the order in which the counting steps are completed. For example, process 4 finishes before 2, and 2 before 1. This shows us that each counting segment (or rather each iteration of the function) is calculated in a separate process, and that these processes finish independently.
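If you want to check this yourself without the mulki script, one way is to let every iteration report the ID of the process it ran in. This is a minimal sketch using only the standard library; report_pid is a hypothetical helper name and the pool size of 4 is an assumption matching my cpu_count:

```python
import multiprocessing as mp
import os


def report_pid(iter_val):
    """Return the iteration value together with the ID of the process that ran it."""
    return iter_val, os.getpid()


if __name__ == "__main__":  # guard needed where child processes re-import this file
    print("Parent process:", os.getpid())
    with mp.Pool(processes=4) as pool:
        # imap_unordered yields results in completion order, not submission order
        for iter_val, pid in pool.imap_unordered(report_pid, range(1, 11)):
            print("iter_val", iter_val, "ran in process", pid)
```

The process IDs printed for the iterations differ from the parent's ID. With tasks this small the order and the distribution over workers can vary from run to run.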
As a final image I want to show you two screenshots of iStats, with and without MP. They show that we are actually doing multiprocessing here.
As mentioned at the beginning, another post will follow soon with examples of how to work with shared variables and return values from doMultiprocessingLoop. It’s really easy, so I think you will love it.
This was my first blog post ever! I really hope it was helpful. Feel free to comment and share.