Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


multiprocessing - Python ProcessPoolExecutor with conditions

I have to process a large amount of image data and would like to use the .map() method of the executors in the concurrent.futures package to speed it up. The goal is to loop over all the images in a directory, process them, and then save them in another directory. This in itself is not a problem, but I would like to save 90% of the processed images in one directory and the remaining 10% in another directory. How can I do this using .map()?

Without .map() I enumerate the images and then say:

for enumerator, image in enumerate(directory):
    if enumerator < len(directory) * 0.9:
        ...  # save image in one directory
    else:
        ...  # save image in another directory

How can I add this to the function I call with .map(), since I don't have access to the enumerator anymore?

Any help is very much appreciated!

All the best, snowe



1 Answer


You can pass additional iterables to the map function. On every call, one element from each iterable is passed to your function as a positional argument:

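from concurrent.futures import ProcessPoolExecutor

def my_function(file, sorting_bool):
  if sorting_bool:
    pass  # do this with `file`
  else:
    pass  # do that with `file`

total = len(directory)  # `directory` is the list of image files
# True for the first 90% of the indices, False for the rest
sorter = lambda x: x < 0.9 * total
dir_sorted = map(sorter, range(total))
with ProcessPoolExecutor() as pool:
  # list() consumes the results so that worker exceptions are raised here
  results = list(pool.map(my_function, directory, dir_sorted))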
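As a more complete illustration of the flag approach, here is a minimal sketch. The names `choose_dir`, `split_flags`, `process_all`, the directory names "train" and "val", and the `use_processes` switch are all hypothetical and not part of the question; the worker only reports where each file would go instead of actually processing and saving an image.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def choose_dir(path, in_first_group, main_dir="train", rest_dir="val"):
    """Return (path, target directory); real image processing would go here."""
    target = main_dir if in_first_group else rest_dir
    return path, target


def split_flags(total, fraction=0.9):
    """One boolean per job: True for the first `fraction` of the indices."""
    return [i < fraction * total for i in range(total)]


def process_all(paths, use_processes=True):
    """Map choose_dir over the paths, pairing each path with its flag."""
    flags = split_flags(len(paths))
    # Assumption: the thread pool is offered only as a fallback for
    # environments where process pools cannot pickle the worker function.
    executor_cls = ProcessPoolExecutor if use_processes else ThreadPoolExecutor
    with executor_cls() as pool:
        # list() consumes the results so worker exceptions surface here
        return list(pool.map(choose_dir, paths, flags))
```

With a process pool, the worker has to be defined at module level so it can be pickled, and on platforms that spawn worker processes (Windows, macOS) the call to `process_all` should sit under an `if __name__ == "__main__":` guard.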

More generally, for other tasks you could pass a job id and the total number of jobs to each job:

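from itertools import repeat

def my_function(file, job_id, total_jobs):
  if job_id < total_jobs * 0.9:
    pass  # Do this
  else:
    pass  # Do that

total = len(directory)
# every argument to map must be an iterable: repeat(total) yields
# the same total for every call (a bare lambda would raise a TypeError)
pool.map(my_function, directory, range(total), repeat(total))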

You can then use those numbers however you'd like inside my_function.
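The job id variant can be sketched in the same way. This is only an illustration under assumptions: `label_job`, `run_jobs`, the group labels, and the `use_processes` switch are made-up names, and the worker just returns which group a job falls into.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from itertools import repeat


def label_job(file, job_id, total_jobs):
    """Return which group a job falls in, based on its position."""
    group = "first_90" if job_id < total_jobs * 0.9 else "last_10"
    return file, job_id, group


def run_jobs(files, use_processes=True):
    """Pass each file together with its job id and the total job count."""
    total = len(files)
    executor_cls = ProcessPoolExecutor if use_processes else ThreadPoolExecutor
    with executor_cls() as pool:
        # range(total) supplies the job id; repeat(total) supplies the same
        # total on every call. map stops when the shortest iterable ends.
        return list(pool.map(label_job, files, range(total), repeat(total)))
```

Passing the raw numbers instead of a precomputed boolean keeps the decision inside the worker, which may be handy if several thresholds are needed.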

If the total number of jobs is unknown, you can still use a generator as a counter (itertools.count() from the standard library does the same thing):

def counter():
  i = 0
  while True:
    yield i
    i += 1

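# map stops at the shortest iterable, so the endless counter is safe
# as long as at least one of the other arguments is finite
pool.map(my_function, counter(), other, args)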
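When the total is unknown, a positional 90% cutoff cannot be computed up front. One possible workaround, shown here purely as an assumption and not something the answer proposes, is to send every tenth job to the second group, which gives roughly a 90/10 split. The sketch uses `itertools.count()` as the counter; `tag_job`, `run_stream`, and `use_processes` are hypothetical names.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from itertools import count


def tag_job(job_id, item):
    """Send every 10th job to the second group (roughly a 90/10 split)."""
    group = "second" if job_id % 10 == 9 else "first"
    return job_id, item, group


def run_stream(items, use_processes=True):
    """Number the items of an iterable of unknown length while mapping."""
    executor_cls = ProcessPoolExecutor if use_processes else ThreadPoolExecutor
    with executor_cls() as pool:
        # count() is infinite, but map stops when `items` is exhausted
        return list(pool.map(tag_job, count(), items))
```

Note that Executor.map collects its input iterables immediately rather than lazily, so `items` still has to be finite.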
