Bug #20894

closed

more config defaults

Added by Peter Amstutz 9 months ago. Updated 8 months ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: Deployment
Target version:
Story points: -
Release relationship: Auto

Description

I just found an interesting cloud dispatch bug. We start with maxConcurrency at 16 and SupervisorFraction at 0.30. If each workflow starts only one subprocess, each supervisor accounts for two containers (itself plus its subprocess), so we use at most 0.60 of maxConcurrency. The problem is that maxSupervisors is based on maxConcurrency, so even though there is a big backlog of supervisor processes that want to run, they don't get scheduled. And because we're only using 60% of capacity, the dispatcher doesn't try to raise maxConcurrency either.

I think the answer is that the default value of SupervisorFraction has to be 50%, and/or the default value of InitialQuotaEstimate should be 0 (which sets it to match MaxInstances).
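The arithmetic behind the 60% figure and the proposed 50% default can be sketched as follows. This is a simplified model, not the dispatcher's actual code: the function names are hypothetical, and it assumes the supervisor cap is exactly SupervisorFraction times maxConcurrency (no integer rounding) and that every running supervisor keeps a fixed number of subprocess containers running.

```python
def steady_state_utilization(supervisor_fraction: float,
                             children_per_supervisor: int) -> float:
    """Fraction of maxConcurrency in use once the supervisor cap is reached.

    Assumed model (hypothetical, for illustration only): supervisors are
    capped at supervisor_fraction * maxConcurrency, and each supervisor
    keeps exactly children_per_supervisor containers running.
    """
    if not 0.0 <= supervisor_fraction <= 1.0:
        raise ValueError("supervisor_fraction must be between 0 and 1")
    if children_per_supervisor < 0:
        raise ValueError("children_per_supervisor must not be negative")
    # Each supervisor occupies one slot itself plus one slot per child.
    return min(1.0, supervisor_fraction * (1 + children_per_supervisor))


def min_saturating_fraction(children_per_supervisor: int) -> float:
    """Smallest supervisor fraction that can fill maxConcurrency in this model."""
    if children_per_supervisor < 0:
        raise ValueError("children_per_supervisor must not be negative")
    return 1.0 / (1 + children_per_supervisor)


if __name__ == "__main__":
    # Values taken from the description: 0.30 default, one subprocess each.
    print(f"{steady_state_utilization(0.30, 1):.2f}")
    # Proposed default of 0.50 with the same workload.
    print(f"{steady_state_utilization(0.50, 1):.2f}")
    # Fraction needed so one-subprocess workflows can saturate capacity.
    print(f"{min_saturating_fraction(1):.2f}")
```

Under these assumptions, a fraction of 0.30 with one subprocess per workflow tops out at 0.60 of capacity, while 0.50 is the smallest fraction that lets such workflows fill it. With a fraction of 0.45, one subprocess per workflow reaches only 0.90, but two parallel subprocesses are enough to saturate.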

I didn't see this on the scale cluster test, but there I had already adjusted SupervisorFraction to 0.45 and InitialQuotaEstimate to 400, and the synthetic workflow does have a phase where it submits two parallel jobs, which would push the queue up to the maximum.

So I wonder whether, if it didn't have a parallel job phase, it would have been stuck at 400.


Subtasks 1 (0 open, 1 closed)

Task #20895: Review 20894-instances-default | Resolved | Peter Amstutz | 08/24/2023
Actions #1

Updated by Peter Amstutz 9 months ago

  • Status changed from New to In Progress
Actions #2

Updated by Peter Amstutz 9 months ago

  • Description updated (diff)
Actions #3

Updated by Peter Amstutz 9 months ago

Actions #4

Updated by Peter Amstutz 9 months ago

  • Category set to Deployment
Actions #5

Updated by Peter Amstutz 9 months ago

Actions #6

Updated by Peter Amstutz 9 months ago

  • Assigned To set to Peter Amstutz
Actions #7

Updated by Peter Amstutz 9 months ago

Actions #8

Updated by Peter Amstutz 9 months ago

  • Release set to 66
Actions #9

Updated by Lucas Di Pentima 9 months ago

This LGTM, thanks!

Actions #10

Updated by Peter Amstutz 9 months ago

  • Status changed from In Progress to Resolved