Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


terminal - How to start a python script using the multiprocessing library (with map_async) from the console

I am sorry for this rather long question but, since it is my first question on Stack Overflow, I wanted to be thorough in describing my problem and what I have already tried. I am doing simulations of stochastic processes and thought it would be a good idea to use multiprocessing to speed up my simulations. Since the individual processes have no need to share information with each other, this is really a trivial application of multiprocessing. Unfortunately, I struggle with calling my script from the console. My code for a test function looks like this:

#myscript.py
from multiprocessing import Pool

def testFunc (inputs):
    print(inputs)

def multi():
    print('Test2')
    pool = Pool()
    pool.map_async(testFunc, range(10))

if __name__ == '__main__':
    print('Test1')
    multi()

This works absolutely fine as long as I run the code from within my Spyder IDE. As the next step I want to execute my script on my university's cluster, which I access via a Slurm script; therefore, I need to be able to execute my Python script via a bash script. Here I got some unexpected results. What I tried, on my MacBook Pro with macOS 10.15.7 and on a workstation with Ubuntu 18.04.5, are the following console inputs: python myscript.py and python -c "from myscript import multi; multi()". In each case my only output is Test1 and Test2, and testFunc never seems to be called. Following the answer to "Using python multiprocessing Pool in the terminal and in code modules for Django or Flask", I also tried various versions of omitting the if __name__ == '__main__' and importing the relevant functions into another module. For example, I tried:

#myscript.py
from multiprocessing import Pool

def testFunc (inputs):
    print(inputs)

pool = Pool()
pool.map_async(testFunc, range(10))

But all to no avail. To confuse me even further, I then found out that first opening the Python interpreter in the console by simply typing python, pressing Enter, and then executing

from myscript import multi
multi()

inside the python interpreter does work. As I said, I am very confused by this, since I thought this to be equivalent to python -c "from myscript import multi; multi()" and I really don't understand why one works and the other doesn't. Trying to reproduce this success I also tried executing the following bash script

python - <<'END_SCRIPT'
from multiTest import multi
multi()
END_SCRIPT

but, alas, this doesn't work either. As a last "discovery", I found out that all these problems only arise when using map_async instead of just map. However, I think that asynchronous processes are preferable for my application.

I would be really grateful if someone could shed light on this mystery (at least for me it is a mystery). Also, as I said, this is my first question on Stack Overflow, so I apologize if I forgot relevant information or accidentally did not follow the formatting guidelines. All comments or edits that help me improve my questions (and answers) in the future are also much appreciated!

question from:https://stackoverflow.com/questions/66046016/how-to-start-a-python-script-using-the-multiprocessing-library-with-map-async


1 Answer


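You aren't waiting for the pool to finish what it's doing before your program exits. map_async returns immediately, and the pool's worker processes are daemonic, so they are terminated as soon as the main process ends. In the interactive interpreter the main process stays alive, which is why it appears to work there.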

def multi():
    print('Test2')
    with Pool() as pool:
        result = pool.map_async(testFunc, range(10))
        result.wait()
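
If you also need the return values, a complete script along the same lines might look like the sketch below. The names `square` and `run_squares` are made up for illustration (they stand in for your simulation function), and `processes=2` and the 60-second timeout are arbitrary assumptions. Calling `get()` blocks like `wait()` does, but it also returns the results and re-raises any exception from a worker:

```python
from multiprocessing import Pool


def square(x):
    # Hypothetical stand-in for one independent simulation run.
    return x * x


def run_squares(n):
    # Leaving the with-block terminates the pool, so the results
    # have to be collected before the block ends.
    with Pool(processes=2) as pool:  # worker count is an arbitrary choice
        async_result = pool.map_async(square, range(n))
        # get() blocks until every task is done and returns the results
        # in input order; the timeout (in seconds) is an assumption.
        return async_result.get(timeout=60)


if __name__ == "__main__":
    # The guard matters on platforms that start workers by re-importing
    # this module (for example macOS and Windows).
    print(run_squares(5))
```

Run with python from the console, this should print the squares of 0 to 4 once all workers have finished.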

If the order in which the subprocesses process things isn't relevant, I'd suggest

with Pool() as pool:
    for result in pool.imap_unordered(testFunc, range(10), 5):
        pass

(change 5, the chunk size parameter, to taste.)
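
Since imap_unordered yields results in whatever order they complete, one way to keep track of which result belongs to which input is to return the input alongside the result. This is only a sketch: `simulate` and `run_unordered` are hypothetical names, and the pool size is an arbitrary assumption.

```python
from multiprocessing import Pool


def simulate(seed):
    # Hypothetical stand-in for one simulation; it returns its input
    # together with the result so the pairing survives reordering.
    return seed, seed * seed


def run_unordered(n, chunksize=5):
    with Pool(processes=2) as pool:  # worker count is an arbitrary choice
        # dict() consumes the iterator inside the with-block, so every
        # task has finished before the pool is torn down.
        return dict(pool.imap_unordered(simulate, range(n), chunksize))


if __name__ == "__main__":
    results = run_unordered(10)
    print(sorted(results.items()))
```

Consuming the iterator is what keeps the main process waiting here, which is why the loop with pass above is enough when you don't need the values.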

