There should really be an option to limit an app to a specific number of instances, no matter what.
In the application settings menu, all you can do is limit the maximum number of idle instances, and I'm not sure that works as intended. I set Max Idle Instances to 1 and Min Pending Latency to 15 seconds, but I still occasionally see 2 instances running for long periods of time with no requests. Aren't they supposed to shut down after 15 min of being idle? And why does it even start a second instance with those settings, considering that no request ever waited 15 seconds?
I run a simple "what's my IP" Python app that really doesn't need high performance. It makes no difference whether the response arrives after 100 ms or after 5 seconds; all that matters is that only one instance is running, so that the daily 28 instance hours never run out.
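To put rough numbers on the quota concern, here is a small sketch of the arithmetic. It assumes an instance is charged for the whole time it stays up, whether or not it serves requests; the function names are made up for illustration and are not part of any App Engine API.

```python
# Daily instance-hour quota mentioned above.
FREE_INSTANCE_HOURS_PER_DAY = 28


def instance_hours_used(instances: int, hours: float = 24.0) -> float:
    """Instance hours consumed if `instances` stay up for `hours` (assumed billing model)."""
    return instances * hours


def hours_until_quota_exhausted(instances: int,
                                quota: float = FREE_INSTANCE_HOURS_PER_DAY) -> float:
    """Wall-clock hours before the daily quota runs out with `instances` running."""
    if instances <= 0:
        return float("inf")
    return quota / instances


if __name__ == "__main__":
    for n in (1, 2):
        used = instance_hours_used(n)
        fits = used <= FREE_INSTANCE_HOURS_PER_DAY
        print(f"{n} instance(s): {used} instance hours/day, within quota: {fits}")
```

Under that assumption, one always-on instance uses 24 of the 28 hours, while a second instance that lingers all day would push usage to 48 hours and exhaust the quota after about 14 hours.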