Project

General

Profile

Actions

Bug #20531

open

When getting logs from child job to print error, distinguish between job couldn't start and non-zero exit

Added by Peter Amstutz 10 days ago. Updated about 4 hours ago.

Status:
In Progress
Priority:
Normal
Assigned To:
Category:
CWL
Target version:
Start date:
05/30/2023
Due date:
% Done:

0%

Estimated time:
(Total: 0.00 h)
Story points:
0.5
Release relationship:
Auto

Description

  • When the job failed with a non-zero exit code, just print stdout/stderr, no crunch-run
  • When the job failed to start, only print crunch-run (we won't have stdout/stderr anyway)
  • When the job failed from a signal (exit code > 127), print both

Subtasks 1 (1 open0 closed)

Task #20563: Review 20531-cwl-log-tailIn ProgressBrett Smith05/30/2023

Actions
Actions #1

Updated by Peter Amstutz 10 days ago

  • Description updated (diff)
Actions #4

Updated by Peter Amstutz 7 days ago

  • Story points set to 0.5
  • Target version changed from To be groomed to Development 2023-06-07
  • Assigned To set to Peter Amstutz
Actions #5

Updated by Peter Amstutz 7 days ago

  • Status changed from New to In Progress
  • Subject changed from When getting logs from child job to print error, distinguish between job couldn't start and non-zero exit to When getting logs from child job to print error, distinguish between job couldn't start and non-zero exit
Actions #7

Updated by Peter Amstutz about 10 hours ago

  • Release set to 64
Actions #8

Updated by Brett Smith about 10 hours ago

Peter Amstutz wrote in #note-6:

20531-cwl-log-tail @ 3c0b0c7d34d015fc6d4838cd7fa00a729c1898bb

The code is fine. Can you please add a lot more comments explaining the issue that's being fixed by the urllib3 requirement? This kind of stuff gets lost to the sands of time really quickly, so it would help future us a lot if we could look at this in a year or five and know whether the issue is still relevant. I took a quick skim of the requirements for both cwl-utils and cwltool and didn't see a direct urllib3 requirement, so while I don't doubt there's a real issue it's not even clear to me right now what it is. Thanks.

Actions #9

Updated by Peter Amstutz about 6 hours ago

Brett Smith wrote in #note-8:

Peter Amstutz wrote in #note-6:

20531-cwl-log-tail @ 3c0b0c7d34d015fc6d4838cd7fa00a729c1898bb

The code is fine. Can you please add a lot more comments explaining the issue that's being fixed by the urllib3 requirement? This kind of stuff gets lost to the sands of time really quickly, so it would help future us a lot if we could look at this in a year or five and know whether the issue is still relevant. I took a quick skim of the requirements for both cwl-utils and cwltool and didn't see a direct urllib3 requirement, so while I don't doubt there's a real issue it's not even clear to me right now what it is. Thanks.

The basic problem is that installing with a version of pip with the old resolver, common situation is that package X has an open ended dependency on Z, and package Y has a closed dependency on Z, but if package X is installed first, it won't downgrade to the version required by Y, so at runtime you get a dependency conflict between Y and Z.

pip's newer resolver is smarter, but lots of end users are on older distros with older version of pip.

So this was attempting to solve one of those situations by lifting the version constraint to the top-level package.

I think the specific situation involved cwl-utils having a constraint 'urllib3 < 2' but something else installed 'requests' with an open-ended version spec and as a result a version a later version of 'urllib3' was installed first.

Turns out as of this writing this was just fixed in upstream in the last 8 hours:

https://github.com/common-workflow-language/cwl-utils/commit/7af859de43d3983d45d8bb432d7ae6b715265efe

So I think the original change is no longer needed and can be reverted.

Meanwhile, a new version of cwltest was released which breaks compatibility.

https://github.com/common-workflow-language/cwltest/releases/tag/2.3.20230527113600

So I had to put in a version pin to hopefully fix testing, see note.

Actions #11

Updated by Brett Smith about 4 hours ago

Peter Amstutz wrote in #note-10:

20531-cwl-log-tail @ 72b8fdfb21636030f2e110eb5850beb5f08bb930

Thanks for explaining, that "fixed earlier today" thing helps me feel better about why I couldn't find the connection. What's in the branch looks good to me, but it looks like something might've changed recently to cause that test failure in Jenkins?

Actions

Also available in: Atom PDF