Jug: a parallel processing framework for Python

openkk, 12 years ago

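Jug is a task-based parallel processing framework written in Python. It can run the same job across different machines, which communicate through a shared file system such as NFS.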

  • Persistent data across runs
  • Re-use of partial results when you change the algorithm (for example, if you search over a few more parameters for the best one, the pre-computed values are reused). A typical setup has a main computation script and a second visualisation script that plots the results or computes summary statistics; the second script should be easy to write, easy to change, and should reuse all computed results seamlessly.
  • Supports concurrency with a very flexible system: CPUs can join the computation at any time. This allows it to be used in batch processing systems.
  • You can check up on the status of the computation at any time (jug status)
  • Two backends: file-based if all the processors share a filesystem (works over NFS too), or Redis-based if they can all connect to the same Redis server.
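
The persistence and reuse described in the list above can be illustrated with a small standard-library sketch. This is not Jug's implementation: the names `task_key` and `run_cached` and the one-file-per-result layout are hypothetical, chosen only for illustration. Each result is stored in a file named after a hash of the function name and its arguments, so a later run, or a second script, loads it instead of recomputing.

```python
import hashlib
import os
import pickle
import tempfile


def task_key(func, args):
    """Derive a file name from the function name and its arguments (hypothetical scheme)."""
    payload = pickle.dumps((func.__name__, args))
    return hashlib.sha1(payload).hexdigest()


def run_cached(store_dir, func, *args):
    """Return func(*args), loading a stored result from store_dir if one exists."""
    path = os.path.join(store_dir, task_key(func, args))
    if os.path.exists(path):
        with open(path, "rb") as fh:
            return pickle.load(fh)
    result = func(*args)
    # Write to a temporary file, then rename, so readers never see a partial file.
    fd, tmp_path = tempfile.mkstemp(dir=store_dir)
    with os.fdopen(fd, "wb") as fh:
        pickle.dump(result, fh)
    os.replace(tmp_path, path)
    return result


calls = []  # records which inputs were actually computed


def square(n):
    calls.append(n)
    return n * n


store = tempfile.mkdtemp()
first = [run_cached(store, square, n) for n in range(5)]
# Widening the parameter range recomputes only the new values.
second = [run_cached(store, square, n) for n in range(8)]
```

After both passes, `calls` holds each input exactly once, which is the behaviour the second bullet describes: extending the search reuses everything already on disk.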

Example code:

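    from jug import TaskGenerator
    from time import sleep

    @TaskGenerator
    def is_prime(n):
        sleep(1.)  # stand-in for an expensive computation
        # Trial division: n is prime if no j in 2..n-2 divides it
        for j in range(2, n - 1):
            if (n % j) == 0:
                return False
        return True

    # One task per number; a list rather than a lazy map, so every task is created
    primes100 = [is_prime(n) for n in range(2, 101)]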
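
How several processes could share these tasks without a central scheduler can be sketched with lock files in a shared directory. This is a simplified illustration under stated assumptions, not Jug's actual code: `try_claim`, `worker`, and the `.lock` / `.result` file names are hypothetical, and the primality function is redefined without the decorator and the `sleep` so the block runs without Jug installed.

```python
import os
import pickle
import tempfile


def is_prime(n):
    for j in range(2, n - 1):
        if n % j == 0:
            return False
    return True


def try_claim(store_dir, name):
    """Atomically create a lock file; return True only for the first caller."""
    lock_path = os.path.join(store_dir, name + ".lock")
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def worker(store_dir, numbers):
    """Run every task no other worker has claimed; return how many were run."""
    done = 0
    for n in numbers:
        name = "is_prime-%d" % n
        if not try_claim(store_dir, name):
            continue  # already claimed by another worker
        with open(os.path.join(store_dir, name + ".result"), "wb") as fh:
            pickle.dump(is_prime(n), fh)
        done += 1
    return done


store = tempfile.mkdtemp()
ran_first = worker(store, range(2, 51))    # an early worker covers part of the range
ran_second = worker(store, range(2, 101))  # a late joiner only runs what is left
```

Here the two calls run one after the other for simplicity; the same functions could be called from separate processes pointing at the same directory. With Jug itself, each worker is started by running `jug execute` on the script, and `jug status` reports progress.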

Project homepage: http://www.open-open.com/lib/view/home/1339246462271