Idea #20239

open

Dispatcher calculates final container cost after runner/instance disappears

Added by Brett Smith about 1 year ago. Updated about 1 year ago.

Status: New
Priority: Normal
Assigned To: -
Category: Crunch
Target version:
Start date: 03/21/2023
Due date: 03/21/2023 (about 13 months late)
Story points: -

Description

Normally, container cleanup is handled by lib/dispatchcloud/scheduler, which cancels the container here:

} else if !exited.IsZero() && qUpdated.After(exited) {
        go sch.cancel(uuid, "state=Running after crunch-run exited")

This could stash cost-so-far while updating state to Cancelled. The only hitch is that scheduler doesn't know anything about prices.

(An alternative would be for the worker pool to update the cost itself, in closeRunner() -- but I think it gets complicated quickly when we consider retrying failures, and races between "update cost field" (in worker pool) and "set state to cancelled" (scheduler).)

I'm thinking we need to update the Running() call in scheduler/interfaces.go so that it returns a map[string]RunStatus, where RunStatus is something like:

type RunStatus interface {
        Exited() bool
        ExitTime() time.Time
        Cost() float64
}

...where Cost>0 means the scheduler should use that as the final cost when it sets state=Cancelled. That way, worker pool can implement a concrete runStatus type with a *worker field and a Cost method that does the calculation, instead of computing all of the costs on every call to Running() only to have them unused/ignored nearly all of the time.
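A minimal sketch of the lazy approach described above, to make the "compute only when asked" point concrete. Everything except the RunStatus interface is an assumption for illustration: the worker struct here is a stripped-down stand-in for the real one, the flat pricePerHour replaces real price history, the costCalls counter exists only for the demo, and locking is omitted.

```go
package main

import (
	"fmt"
	"time"
)

// RunStatus mirrors the interface proposed in this ticket.
type RunStatus interface {
	Exited() bool
	ExitTime() time.Time
	Cost() float64
}

// worker is a hypothetical stand-in for the worker pool's per-instance
// state. The real struct has many more fields and needs locking.
type worker struct {
	started      time.Time // when the container started running
	exited       time.Time // zero while crunch-run is still running
	pricePerHour float64   // assumed flat price, for brevity
	costCalls    int       // counts Cost() evaluations (demo only)
}

// runStatus holds only a pointer to the worker, so building one per
// container on every Running() call is cheap.
type runStatus struct {
	wkr *worker
}

func (rs runStatus) Exited() bool        { return !rs.wkr.exited.IsZero() }
func (rs runStatus) ExitTime() time.Time { return rs.wkr.exited }

// Cost does the calculation only when called. It returns 0 while the
// container is still running, which the scheduler reads as "no final cost".
func (rs runStatus) Cost() float64 {
	rs.wkr.costCalls++
	if !rs.Exited() {
		return 0
	}
	return rs.wkr.exited.Sub(rs.wkr.started).Hours() * rs.wkr.pricePerHour
}

// running is a hypothetical sketch of what the pool's Running() could return.
func running(workers map[string]*worker) map[string]RunStatus {
	r := make(map[string]RunStatus, len(workers))
	for uuid, wkr := range workers {
		r[uuid] = runStatus{wkr: wkr}
	}
	return r
}

// demo builds two workers and asks for the cost of the exited one only.
func demo() (cost float64, costCalls int) {
	t0 := time.Date(2023, 3, 21, 12, 0, 0, 0, time.UTC)
	workers := map[string]*worker{
		"ctr-running": {started: t0, pricePerHour: 0.5},
		"ctr-exited":  {started: t0, exited: t0.Add(30 * time.Minute), pricePerHour: 0.5},
	}
	for _, st := range running(workers) {
		if st.Exited() {
			cost = st.Cost()
		}
	}
	return cost, workers["ctr-running"].costCalls + workers["ctr-exited"].costCalls
}

func main() {
	cost, calls := demo()
	fmt.Printf("final cost %.2f after %d Cost() call(s)\n", cost, calls)
}
```

In this sketch the running container never has its cost computed, which is the saving the interface is meant to provide.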

(I'm sneaking in the separate Exited bool here to be more clear/explicit than the current "zero time means not exited" approach. Not strictly necessary, but might as well, if we're going to introduce a new type anyway, right?)
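On the scheduler side, the check quoted earlier might then look something like the sketch below. This is only an illustration: fixedStatus, cancelUpdate, and planCancel are hypothetical names, and the real sch.cancel takes a uuid and a reason rather than a struct.

```go
package main

import (
	"fmt"
	"time"
)

// RunStatus mirrors the interface proposed in this ticket.
type RunStatus interface {
	Exited() bool
	ExitTime() time.Time
	Cost() float64
}

// fixedStatus is a hypothetical RunStatus with precomputed values,
// used here only to exercise planCancel.
type fixedStatus struct {
	exited   bool
	exitTime time.Time
	cost     float64
}

func (s fixedStatus) Exited() bool        { return s.exited }
func (s fixedStatus) ExitTime() time.Time { return s.exitTime }
func (s fixedStatus) Cost() float64       { return s.cost }

// cancelUpdate is a hypothetical description of the update the
// scheduler would send when cancelling a container.
type cancelUpdate struct {
	UUID    string
	State   string
	Reason  string
	Cost    float64
	SetCost bool // false means "leave the existing cost field alone"
}

// planCancel decides whether a container still marked Running should be
// cancelled, and whether a final cost should accompany the state change.
func planCancel(uuid string, qUpdated time.Time, st RunStatus) (cancelUpdate, bool) {
	// Explicit Exited() replaces the "zero time means not exited" check.
	if !st.Exited() || !qUpdated.After(st.ExitTime()) {
		return cancelUpdate{}, false
	}
	upd := cancelUpdate{
		UUID:   uuid,
		State:  "Cancelled",
		Reason: "state=Running after crunch-run exited",
	}
	// Cost>0 means: use this as the final cost.
	if cost := st.Cost(); cost > 0 {
		upd.Cost, upd.SetCost = cost, true
	}
	return upd, true
}

// demoPlan runs planCancel for one exited and one still-running container.
func demoPlan() (upd cancelUpdate, cancelExited bool, cancelRunning bool) {
	exit := time.Date(2023, 3, 21, 12, 0, 0, 0, time.UTC)
	queueSeen := exit.Add(time.Minute)
	upd, cancelExited = planCancel("hypothetical-uuid-1", queueSeen,
		fixedStatus{exited: true, exitTime: exit, cost: 1.25})
	_, cancelRunning = planCancel("hypothetical-uuid-2", queueSeen, fixedStatus{})
	return upd, cancelExited, cancelRunning
}

func main() {
	upd, a, b := demoPlan()
	fmt.Printf("%+v cancelExited=%v cancelRunning=%v\n", upd, a, b)
}
```

Sending the cost in the same update as the state change is what avoids the "update cost field" versus "set state to cancelled" race mentioned above.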

Worker pool can populate that with calculated cost if we
  • move most of the calculateCost() code from lib/crunchrun to lib/cloud, and
  • in lib/worker, use NormalizePriceHistory to combine/remember the entire cloud-provided price history of each worker instead of just sending the latest data points to crunch-run and forgetting them.
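To illustrate the two steps above, here is a rough sketch of remembering a merged price history and charging a run against it. It is not the actual calculateCost or NormalizePriceHistory code: pricePoint, mergeHistory, and costBetween are hypothetical stand-ins, and the sketch assumes prices are hourly, piecewise constant, and that time before the first known data point is charged at that first price.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// pricePoint is a hypothetical stand-in for one cloud-provided data
// point: the hourly price in effect from StartTime onward.
type pricePoint struct {
	StartTime time.Time
	Price     float64 // dollars per hour
}

// mergeHistory combines the remembered history with newly reported
// points, sorts by start time (then price), and drops exact duplicates.
func mergeHistory(history, latest []pricePoint) []pricePoint {
	all := append(append([]pricePoint{}, history...), latest...)
	sort.Slice(all, func(i, j int) bool {
		if !all[i].StartTime.Equal(all[j].StartTime) {
			return all[i].StartTime.Before(all[j].StartTime)
		}
		return all[i].Price < all[j].Price
	})
	merged := all[:0]
	for _, p := range all {
		if n := len(merged); n > 0 && merged[n-1].StartTime.Equal(p.StartTime) && merged[n-1].Price == p.Price {
			continue
		}
		merged = append(merged, p)
	}
	return merged
}

// costBetween integrates the price over [start, end). history must be
// sorted by StartTime ascending, as returned by mergeHistory.
func costBetween(history []pricePoint, start, end time.Time) float64 {
	if len(history) == 0 || !end.After(start) {
		return 0
	}
	cost := 0.0
	for i, p := range history {
		segStart := p.StartTime
		if i == 0 || segStart.Before(start) {
			// Assumption: the first price also covers any earlier time.
			if segStart.After(start) || segStart.Before(start) {
				segStart = start
			}
		}
		segEnd := end
		if i+1 < len(history) && history[i+1].StartTime.Before(end) {
			segEnd = history[i+1].StartTime
		}
		if segEnd.After(segStart) {
			cost += segEnd.Sub(segStart).Hours() * p.Price
		}
	}
	return cost
}

// demoCost merges two overlapping price reports and charges a one-hour
// run that straddles a price change.
func demoCost() (points int, cost float64) {
	t0 := time.Date(2023, 3, 21, 12, 0, 0, 0, time.UTC)
	history := []pricePoint{{t0, 0.10}}
	latest := []pricePoint{{t0, 0.10}, {t0.Add(time.Hour), 0.20}}
	merged := mergeHistory(history, latest)
	return len(merged), costBetween(merged, t0.Add(30*time.Minute), t0.Add(90*time.Minute))
}

func main() {
	n, cost := demoCost()
	fmt.Printf("%d price points, cost %.4f\n", n, cost)
}
```

Keeping the full merged history on the worker is what lets the pool compute a final cost after the instance, and crunch-run with it, is gone.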

Related issues

Follows Arvados - Bug #19967: crunch-run periodically updates container cost (Resolved, Brett Smith, 03/20/2023)

Updated by Brett Smith about 1 year ago

  • Follows Bug #19967: crunch-run periodically updates container cost added