Bug #9362

closed

[CWL] arvados-cwl-runner should reuse the most recent run

Added by Jiayong Li almost 8 years ago. Updated over 3 years ago.

Status:
Closed
Priority:
Normal
Assigned To:
-
Category:
-
Target version:
-
Story points:
-

Description

Right now, arvados-cwl-runner --enable-reuse reuses the oldest matching run. I think it should reuse the most recent run instead.

To be more precise, consider the following scenario, where the tools are all deterministic. In run 1, job A completed, but upon examining the output I concluded that the compute nodes had made an error. In run 2, I used --disable-reuse and got a reasonable result for job A. Now, in all future pipelines that include job A, I would like to reuse the result of job A from run 2, but by default arvados-cwl-runner reuses the result from run 1 instead.

I understand that letting users specify which run to reuse would be quite complicated, but I think the simpler solution is to reuse the most recent run instead of the oldest one. The rationale is that if an earlier run was successful, one wouldn't want to run it again, so the existence of a later run suggests the earlier one was flawed.

Here's a concrete example of the above behavior. I ran snap_freebayes_hu34D5B9. The pipeline https://workbench.f48sn.arvadosapi.com/pipeline_instances/f48sn-d1hrv-vi7dgz6phq60fm8 had a bad run (explained in #9361). So I reran the pipeline with --disable-reuse and had a successful run https://workbench.f48sn.arvadosapi.com/pipeline_instances/f48sn-d1hrv-of9pj0lw5h830m5

Now I'm running snap_gatk_hu34D5B9. Naturally, I turned on --enable-reuse to reuse the successful alignments in f48sn-d1hrv-of9pj0lw5h830m5, but it reused the bad alignments in f48sn-d1hrv-vi7dgz6phq60fm8 instead:
https://workbench.f48sn.arvadosapi.com/pipeline_instances/f48sn-d1hrv-soqcqpjh26gntc4

The commands I was using are

arvados-cwl-runner --debug --enable-reuse --local --wait --project-uuid f48sn-j7d0g-fnuojiyi5vnwigu main-snap_freebayes_hu34D5B9.cwl main-snap_freebayes_hu34D5B9-samples.json

and
arvados-cwl-runner --debug --enable-reuse --local --wait --project-uuid f48sn-j7d0g-fnuojiyi5vnwigu main-snap_gatk_hu34D5B9.cwl main-snap_gatk_hu34D5B9-samples.json


Related issues

Related to Arvados - Bug #9361: [CWL] arvados-cwl-runner unknown issue of job reuse (Resolved)
Actions #1

Updated by Jiayong Li almost 8 years ago

  • Subject changed from [CWL] arvados-cwl-runner reuse the most recent run to [CWL] arvados-cwl-runner should reuse the most recent run
Actions #2

Updated by Jiayong Li almost 8 years ago

  • Description updated (diff)
Actions #3

Updated by Jiayong Li almost 8 years ago

  • Description updated (diff)
Actions #4

Updated by Jiayong Li almost 8 years ago

  • Description updated (diff)
Actions #5

Updated by Jiayong Li almost 8 years ago

  • Description updated (diff)
Actions #6

Updated by Tom Morris over 7 years ago

  • Assigned To set to Tom Morris
  • Target version set to Arvados Future Sprints
Actions #7

Updated by Peter Amstutz almost 4 years ago

  • Assigned To deleted (Tom Morris)
  • Status changed from New to Closed

The policy is that it reuses the earliest run, because that is stable. If it reused the latest run, it could change every time. If it is reusing a bad run, an admin can put the bad container in "cancelled" state to prevent it from being reused.
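To make the trade-off concrete, here is a minimal sketch of the two selection policies discussed in this ticket. It is not the actual Arvados implementation: the `Run` record, the state strings, and the `pick_reusable` helper are hypothetical names used only for illustration. It shows why "earliest" gives a stable answer as new runs are added, and how marking a bad run as cancelled removes it from consideration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Sequence


@dataclass(frozen=True)
class Run:
    """Hypothetical record of one finished run of the same deterministic job."""
    uuid: str
    state: str            # assumed values: "Complete" or "Cancelled"
    finished_at: datetime


def pick_reusable(runs: Sequence[Run], policy: str = "earliest") -> Optional[Run]:
    """Pick a run to reuse among matching runs (illustrative only).

    Only runs in the "Complete" state are candidates, so a run that an
    admin has marked "Cancelled" is never selected.
    """
    candidates = [r for r in runs if r.state == "Complete"]
    if not candidates:
        return None
    if policy == "earliest":
        # Stable: adding newer runs never changes the answer.
        return min(candidates, key=lambda r: r.finished_at)
    if policy == "latest":
        # Unstable: every new successful run changes the answer.
        return max(candidates, key=lambda r: r.finished_at)
    raise ValueError(f"unknown policy: {policy!r}")


if __name__ == "__main__":
    bad = Run("run-1", "Complete", datetime(2016, 6, 1))
    good = Run("run-2", "Complete", datetime(2016, 6, 2))
    print(pick_reusable([bad, good]).uuid)        # earliest policy picks run-1

    # After the bad run is marked cancelled, the earliest policy picks run-2.
    bad_cancelled = Run("run-1", "Cancelled", datetime(2016, 6, 1))
    print(pick_reusable([bad_cancelled, good]).uuid)
```

Under the "earliest" policy, the reporter's scenario is resolved by cancelling the bad run rather than by changing the selection rule.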

Actions #8

Updated by Ward Vandewege over 3 years ago

  • Target version deleted (Arvados Future Sprints)