Project

General

Profile

Actions

Idea #3898

closed

[API] Job model has a single state attribute that can be updated and read instead of paying attention to running and success flags.

Added by Tom Clegg over 9 years ago. Updated over 9 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Radhika Chippada
Category:
-
Target version:
Start date:
09/19/2014
Due date:
Story points:
0.5

Description

Before validation:
  • If state has changed to Running, set started_at to now
  • If state has changed to Failed or Complete, set finished_at to now
  • If state has changed to Cancelled, set cancelled_at to now
  • If state has changed at all, set running and success (see below)
  • If state has not changed, and any of {running, success, cancelled_at} have changed, update state accordingly
  • If state is nil, update state according to running, success, cancelled_at
Validation:
  • If state has changed to Running, and is_locked_by_uuid is nil, return false
Translating from running/success/cancelled_at to state:
running success cancelled_at state
any any not nil Cancelled
any false nil Failed
any true nil Complete
true nil nil Running
false nil nil Queued
Translating from state to running/success:
state running success
Queued false nil
Running true nil
Cancelled false false
Failed false false
Complete false true

Subtasks 4 (0 open4 closed)

Task #3906: Add state columnResolvedRadhika Chippada09/19/2014Actions
Task #3939: review 3898-job-state-attr-fixesResolvedWard Vandewege09/19/2014Actions
Task #3934: Review branch: 3898-job-state-attrResolvedPeter Amstutz09/19/2014Actions
Task #3951: review 3898-job-state-attr-TCResolved09/19/2014Actions
Actions #1

Updated by Tom Clegg over 9 years ago

  • Target version changed from Arvados Future Sprints to 2014-10-08 sprint
Actions #2

Updated by Peter Amstutz over 9 years ago

  • Assigned To set to Peter Amstutz
Actions #3

Updated by Peter Amstutz over 9 years ago

  • Assigned To changed from Peter Amstutz to Radhika Chippada
Actions #4

Updated by Radhika Chippada over 9 years ago

  • Status changed from New to In Progress
  • Allowed States:

Queued, Running, Cancelled, Failed, Complete

  • Use the following to determine state from current success and running attributes:
Running Success State
any False Failed
any True Complete
True nil Running
Queued started_at, is_locked_by_uuid, cancelled_at and success are all nil
Cancelled cancelled_at attribute is set
  • Fail migration if Job cannot enter any of these 5 states.
Actions #5

Updated by Tom Clegg over 9 years ago

  • Description updated (diff)
Actions #6

Updated by Radhika Chippada over 9 years ago

I set started_at, cancelled_at, and finished_at to Time.now only when they are not set. Peter thinks the crunch job would be setting them, and better to set them if they are not already set.

Actions #7

Updated by Radhika Chippada over 9 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 50 to 100

Applied in changeset arvados|commit:3cc699f70a514878b7ce7adbc3c96e73f92fdf82.

Actions #8

Updated by Ward Vandewege over 9 years ago

  • Status changed from Resolved to In Progress
Actions #9

Updated by Radhika Chippada over 9 years ago

Review comments for branch: 3898-job-state-attr-fixes

LGTM. I was also initially concerned about including !is_locked_by_uuid in checking the status when state is null. This seems to have been problematic and I do not see any harm in omitting this in this check.

Thanks for addressing this.

Actions #10

Updated by Tom Clegg over 9 years ago

Nitpicks about this comment
  • +        # race condition for jobs that have just been grabbed by crunch-dispatch but haven't been marked as started yet...
  • It's crunch-job that does the grabbing, not crunch-dispatch
  • crunch-job sets started_at and is_locked_by_uuid in the same update transaction, so is this race really possible?

I think the bigger problem to fix here is that Job is now trying to manipulate the old state fields in interesting ways, even when clients are not using the new state field. This seems likely to create/expose new bugs in code paths that we are just about to replace, which seems a dubious use of time...

Actions #11

Updated by Anonymous over 9 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 75 to 100

Applied in changeset arvados|commit:d6478bb286c0bb1e7b8af67fb4800db792200022.

Actions

Also available in: Atom PDF