Parallel Python

Overview

Parallel Python is a Python module that provides a mechanism for parallel execution of Python code on SMP systems (machines with multiple processors or cores) and on clusters (computers connected via a network).

It is lightweight, easy to install, and easy to integrate with other Python software.

Parallel Python is an open-source, cross-platform module written in pure Python.



Features

Parallel execution of Python code on SMP systems and clusters

Job-based parallelization technique that is easy to understand and implement (a serial application is easy to convert into a parallel one)

Automatic detection of the optimal configuration (by default, the number of worker processes is set to the number of effective processors)

Dynamic processor allocation (the number of worker processes can be changed at runtime)

Low overhead for subsequent jobs with the same function (transparent caching is implemented to decrease the overhead)

Dynamic load balancing (jobs are distributed between processors at runtime)

Fault tolerance (if one of the nodes fails, its tasks are rescheduled on the others)

Auto-discovery of computational resources

Dynamic allocation of computational resources (a consequence of auto-discovery and fault tolerance)

SHA-based authentication for network connections

Cross-platform portability and interoperability (Windows, Linux, Unix, Mac OS X)

Cross-architecture portability and interoperability (x86, x86-64, etc.)



Open source



Motivation

Software written in Python is now used in a broad range of areas, including business logic, data analysis, and scientific computing. This, together with the wide availability of SMP computers (multi-processor or multi-core) and clusters (computers connected via a network), creates demand for parallel execution of Python code.



The simplest and most common way to write parallel applications for SMP computers is to use threads. However, if the application is computation-bound, the 'thread' and 'threading' Python modules will not run Python byte-code in parallel. The reason is that the Python interpreter uses the GIL (Global Interpreter Lock) for internal bookkeeping. This lock allows only one Python byte-code instruction to execute at a time, even on an SMP computer.
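The effect of the GIL on computation-bound threads can be observed with a small experiment. The following is a minimal sketch using only the standard library; the function names (busy_sum, run_serial, run_threaded, timed) and the workload sizes are illustrative choices, not part of Parallel Python.

```python
import threading
import time


def busy_sum(n):
    """CPU-bound work: runs only Python byte-code, so it needs the GIL."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_serial(n, repeats):
    """Call busy_sum `repeats` times, one call after another."""
    return [busy_sum(n) for _ in range(repeats)]


def run_threaded(n, repeats):
    """Call busy_sum in `repeats` threads started at the same time."""
    results = [None] * repeats

    def worker(slot):
        results[slot] = busy_sum(n)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(repeats)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def timed(func, *args):
    """Return (result, elapsed seconds) for a single call of func."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    n, repeats = 1000000, 4
    _, serial_seconds = timed(run_serial, n, repeats)
    _, threaded_seconds = timed(run_threaded, n, repeats)
    print("serial:   %.2f s" % serial_seconds)
    print("threaded: %.2f s" % threaded_seconds)
```

On a standard CPython build with the GIL enabled, the two timings are typically close to each other, because the threads take turns executing byte-code instead of running simultaneously on separate cores.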

The Parallel Python module overcomes this limitation and provides a simple way to write parallel Python applications. Internally, ppsmp uses processes and IPC (inter-process communication) to organize parallel computations. All the details and complexity of the latter are taken care of, so your application just submits jobs and retrieves their results (the easiest way to write parallel applications).

Better still, software written with Parallel Python also runs in parallel across many computers connected via a local network or the Internet. Cross-platform portability and dynamic load balancing allow Parallel Python to parallelize computations efficiently even on heterogeneous and multi-platform clusters.
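The submit-and-retrieve pattern described above might look like the following sketch. It assumes the pp module is installed for your interpreter and relies on pp.Server, its submit method, and calling the returned job object to obtain the result; the helper names isprime, sum_primes, and parallel_sum_primes are illustrative and not part of the module.

```python
import math


def isprime(n):
    """Return True if n is a prime number."""
    if n < 2:
        return False
    if n == 2:
        return True
    if n % 2 == 0:
        return False
    limit = int(math.sqrt(n)) + 1
    for i in range(3, limit, 2):
        if n % i == 0:
            return False
    return True


def sum_primes(n):
    """Return the sum of all primes below n."""
    return sum(x for x in range(2, n) if isprime(x))


def parallel_sum_primes(inputs):
    """Submit one job per input value and collect the results in order.

    Assumption: the pp module is installed. It is imported here, not at
    the top of the file, so the helpers above stay usable without it.
    """
    import pp

    # With no arguments, the number of worker processes is auto-detected.
    job_server = pp.Server()

    # Arguments to submit: the function, its argument tuple, the functions
    # it depends on, and the modules it needs imported in the worker.
    jobs = [
        job_server.submit(sum_primes, (n,), (isprime,), ("math",))
        for n in inputs
    ]

    # Calling a job object blocks until that job's result is available.
    return [job() for job in jobs]
```

Calling parallel_sum_primes([100, 1000, 10000]) from a script would submit three independent jobs and return their results in the same order. The serial function sum_primes is left unchanged; only the submission step is new, which is the sense in which a serial application is easy to convert.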

Installation

Any platform: download the module archive and extract it to a local directory, then run the setup script: python setup.py install

Windows: download and run the Windows installer binary.

Documentation

Module API

Quick start guide, SMP

Quick start guide, clusters

Advanced guide, clusters

Command line options, ppserver.py

Parallel Python FAQ

Examples

Parallel Python usage examples

Downloads

Parallel Python downloads