Currently there are multiple levels of (scheduled) thread pools involved when a program starts, depending on which profile is used (native vs. remote). There is no contract that enforces a maximum number of concurrent launches to keep app-fabric in a stable state.
For example, due to an implementation error, the AbstractProgramRuntimeService currently calls ProgramRunner.run one at a time (it uses an executor with 0 core threads and a LinkedBlockingQueue). However, when it launches a program through the RemoteTwillRunnerService (for a remote compute profile), that twill runner service launches programs using an executor that has multiple threads, and it does not honor the maximum concurrent launch setting app.program.launch.threads.
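
The one-at-a-time behavior follows from how java.util.concurrent.ThreadPoolExecutor works: it only adds threads beyond the core size when the work queue rejects a task, and an unbounded LinkedBlockingQueue never does. The sketch below reproduces that in isolation. The class name, the maximum pool size of 8, the task count, and the sleep standing in for a launch are all assumptions for illustration and are not taken from AbstractProgramRuntimeService.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of the code base.
class ZeroCoreExecutorDemo {

  /** Runs {@code tasks} short tasks and returns the highest number observed running at once. */
  static int maxObservedConcurrency(int corePoolSize, int maxPoolSize, int tasks) {
    // Unbounded queue: offer() always succeeds, so the pool never grows past
    // the core size (or past a single thread when the core size is 0).
    ThreadPoolExecutor executor = new ThreadPoolExecutor(
        corePoolSize, maxPoolSize, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
    AtomicInteger running = new AtomicInteger();
    AtomicInteger maxRunning = new AtomicInteger();
    CountDownLatch done = new CountDownLatch(tasks);

    for (int i = 0; i < tasks; i++) {
      executor.execute(() -> {
        maxRunning.accumulateAndGet(running.incrementAndGet(), Math::max);
        try {
          Thread.sleep(100); // stand-in for a slow program launch
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        } finally {
          running.decrementAndGet();
          done.countDown();
        }
      });
    }

    try {
      done.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } finally {
      executor.shutdown();
    }
    return maxRunning.get();
  }

  public static void main(String[] args) {
    // With 0 core threads the observed concurrency stays at 1, whatever the maximum is.
    System.out.println("core=0, max=8: " + maxObservedConcurrency(0, 8, 8));
    // With 8 core threads the tasks can overlap.
    System.out.println("core=8, max=8: " + maxObservedConcurrency(8, 8, 8));
  }
}
```

In other words, the maximum pool size of such an executor has no effect, which is why the launches end up serialized regardless of the configured thread count.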
We should revisit this logic and define a clear contract for the maximum number of concurrent program launches.
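
As one possible shape for such a contract, and only as a sketch, a single limiter could be shared by every launch path (native and remote) so that the limit is enforced in one place instead of by each thread pool. LaunchLimiter and its methods are hypothetical names, not existing classes; the limit passed to the constructor is assumed to come from app.program.launch.threads.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical helper: caps how many launches run at once, independent of
// how many threads the calling executors have.
class LaunchLimiter {
  private final Semaphore permits;

  LaunchLimiter(int maxConcurrentLaunches) {
    // Fair semaphore so that waiting launches proceed in arrival order.
    this.permits = new Semaphore(maxConcurrentLaunches, true);
  }

  /** Blocks until a permit is free, runs the launch, then releases the permit. */
  <T> T launch(Supplier<T> launchTask) {
    permits.acquireUninterruptibly();
    try {
      return launchTask.get();
    } finally {
      permits.release();
    }
  }

  /** Number of launches that could start right now without waiting. */
  int availablePermits() {
    return permits.availablePermits();
  }
}
```

A caller would wrap each launch, for example limiter.launch(() -> startProgram()), where startProgram is a placeholder for whatever the runtime service or twill runner service does today. Whether blocking the calling thread is acceptable, or whether launches should be queued instead, is a design question this sketch leaves open.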