
Code parallelization with joblib

by Giuseppe Vettigli · May. 26, 14 · Performance Zone · Interview

Recently I've been working on the parallelization of some Python code and I discovered Joblib. It is a library that supports pipelining and offers good support for parallelization. In this post we will implement a (very naive) parallel matrix-by-matrix multiplication algorithm to show the parallelization capabilities of this library.

import numpy as np
from joblib import Parallel, delayed

def parallel_dot(A, B, n_jobs=2):
    """
    Computes A x B using more CPUs.
    This works only when the number of rows
    of A is evenly divisible by n_jobs
    (np.split requires an equal division).
    """
    parallelizer = Parallel(n_jobs=n_jobs)
    # this generator yields one delayed np.dot call per block of rows of A
    tasks_iterator = (delayed(np.dot)(A_block, B)
                      for A_block in np.split(A, n_jobs))
    result = parallelizer(tasks_iterator)
    # merging the output of the jobs
    return np.vstack(result)

This function spreads the computation across several processes. The strategy used to distribute the data is very simple: each process receives the full matrix B and a contiguous block of rows of A, so it can compute the corresponding block of rows of A*B. At the end, the results of the processes are stacked to build the final matrix.
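The block strategy can be checked without any parallelism. The following is a minimal sketch, not part of the original post: blockwise_dot is a hypothetical helper that applies the same split, multiply and stack steps sequentially. It assumes np.array_split instead of np.split, so the number of rows does not have to be divisible by the number of blocks.

```python
import numpy as np

def blockwise_dot(A, B, n_blocks=2):
    """Hypothetical sequential stand-in for the parallel version:
    multiply each block of rows of A by B, then stack the partial results."""
    # array_split tolerates a row count that is not divisible by n_blocks
    blocks = np.array_split(A, n_blocks)
    partial_products = [np.dot(A_block, B) for A_block in blocks]
    return np.vstack(partial_products)

# 5 rows split in 2 blocks gives blocks of 3 and 2 rows
A = np.arange(15).reshape(5, 3)
B = np.arange(12).reshape(3, 4)
C = blockwise_dot(A, B, n_blocks=2)

# the stacked blocks match the direct product
print(np.array_equal(C, np.dot(A, B)))
```

Because each block of rows of the product depends only on the matching block of rows of A and on the whole of B, the blocks can be computed independently, which is what makes the approach easy to parallelize.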

Let's compare the parallel version of the algorithm with the sequential one:

A = np.random.randint(0,high=10,size=(1000,1000))
B = np.random.randint(0,high=10,size=(1000,1000))

%time _ = np.dot(A,B)

CPU times: user 13.2 s, sys: 36 ms, total: 13.2 s Wall time: 13.4 s

%time _ = parallel_dot(A,B,n_jobs=2)

CPU times: user 92 ms, sys: 76 ms, total: 168 ms Wall time: 8.49 s
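The %time command used above is an IPython magic, so it is not available in a plain Python script. As a rough sketch, the same kind of measurement could be taken with time.perf_counter. The helpers wall_time and speedup are hypothetical names introduced here for illustration, and the matrices are smaller than the ones in the post so the example runs quickly.

```python
import time

import numpy as np

def wall_time(func, *args, **kwargs):
    """Hypothetical helper: return (elapsed seconds, result) for one call."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return time.perf_counter() - start, result

def speedup(sequential_seconds, parallel_seconds):
    """Ratio of the sequential wall time to the parallel wall time."""
    return sequential_seconds / parallel_seconds

# smaller matrices than in the post, only to keep the example fast
A = np.random.randint(0, high=10, size=(200, 200))
B = np.random.randint(0, high=10, size=(200, 200))

elapsed, C = wall_time(np.dot, A, B)
print(f"np.dot wall time: {elapsed:.4f} s")

# ratio of the two wall times reported above (13.4 s and 8.49 s)
print(f"speedup: {speedup(13.4, 8.49):.2f}x")
```

Wall time is the figure to compare here: the CPU times reported for the parallel run cover only the parent process, not the work done in the workers.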

We got a speedup of about 1.6x, not bad for such a naive algorithm. It's important to note that the arguments passed as input to the Parallel call are serialized and reallocated in the memory of each worker process. This means that the last time parallel_dot was called, the matrix B was entirely replicated twice in memory. To avoid this problem, we can dump the matrices to the filesystem and pass the workers a reference so they can open them as memory maps.

import tempfile
import os
from joblib import load, dump

# saving A and B to a local file for memmapping
temp_folder = tempfile.mkdtemp()
filenameA = os.path.join(temp_folder, 'A.mmap')
dump(A, filenameA)
filenameB = os.path.join(temp_folder, 'B.mmap')
dump(B, filenameB)

Now, when parallel_dot(A_memmap,B_memmap,n_jobs=2) is called, both worker processes will use only a reference to the matrix B...
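The text does not show how A_memmap and B_memmap are created. One plausible way, sketched here as an assumption rather than as the author's code, is to reload the dumped files with joblib's load and mmap_mode='r', which opens an uncompressed dumped array as a read-only numpy.memmap instead of copying it into memory. Smaller matrices are used so the example runs quickly.

```python
import os
import tempfile

import numpy as np
from joblib import dump, load

# smaller stand-ins for the 1000x1000 matrices of the post
A = np.random.randint(0, high=10, size=(100, 100))
B = np.random.randint(0, high=10, size=(100, 100))

# dump both matrices to a temporary folder, as in the post
temp_folder = tempfile.mkdtemp()
filenameA = os.path.join(temp_folder, 'A.mmap')
filenameB = os.path.join(temp_folder, 'B.mmap')
dump(A, filenameA)
dump(B, filenameB)

# assumption: reload in read-only memory-mapped mode;
# the data stays on disk and is paged in on access
A_memmap = load(filenameA, mmap_mode='r')
B_memmap = load(filenameB, mmap_mode='r')

print(type(B_memmap).__name__)
# the memory-mapped arrays behave like the in-memory ones
print(np.array_equal(np.dot(A_memmap, B_memmap), np.dot(A, B)))
```

Arrays loaded this way can be passed to parallel_dot in place of A and B. The temporary folder should be removed once the memory maps are no longer in use.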


Published at DZone with permission of Giuseppe Vettigli, DZone MVB. See the original article here.
