Story #7593

[CWL] Modernize CWL SDK

Added by Brett Smith about 6 years ago. Updated almost 6 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
-
Target version:
Start date:
10/16/2015
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
1.0

Subtasks

Task #7671: Review 7593-arvados-cwl-runner in arvados-devResolvedWard Vandewege

Task #7619: documentation updateResolvedPeter Amstutz

Task #7657: Review 7593-cwl-crunchrunnerResolvedPeter Amstutz

Task #7644: Update to latest cwltool & use crunchrunnerResolvedPeter Amstutz


Related issues

Related to Arvados - Bug #7654: [SDK] Python websockets client sometimes hangs on shutdownResolved11/19/2015

Associated revisions

Revision 6207681a
Added by Peter Amstutz almost 6 years ago

Merge branch '7593-cwl-crunchrunner' closes #7593

Revision acc98260
Added by Peter Amstutz almost 6 years ago

Merge branch '7593-arvados-cwl-runner' refs #7593

Revision acc98260
Added by Peter Amstutz almost 6 years ago

Merge branch '7593-arvados-cwl-runner' refs #7593

History

#1 Updated by Brett Smith about 6 years ago

  • Assigned To set to Peter Amstutz

#2 Updated by Peter Amstutz about 6 years ago

  • Story points set to 1.0

#3 Updated by Peter Amstutz almost 6 years ago

  • Status changed from New to In Progress

#4 Updated by Peter Amstutz almost 6 years ago

Not sure why, but I'm getting this intermittently:

Exception in thread WebSocketClient:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "build/bdist.linux-x86_64/egg/ws4py/websocket.py", line 427, in run
    if not self.once():
  File "build/bdist.linux-x86_64/egg/ws4py/websocket.py", line 300, in once
    b = self.sock.recv(self.reading_buffer_size)
AttributeError: 'NoneType' object has no attribute 'recv'

#5 Updated by Peter Amstutz almost 6 years ago

Two other problems:

a) cwl-runner sometimes fails to terminate even after reporting success (websockets problem?)

b) Sometimes arv-mount gets wedged during shutdown (could also be websockets?)

#6 Updated by Peter Amstutz almost 6 years ago

I believe the above problems are caused by #7654

#7 Updated by Brett Smith almost 6 years ago

  • Target version changed from 2015-10-28 sprint to 2015-11-11 sprint

#8 Updated by Brett Smith almost 6 years ago

Reviewing 2411624, and this is good. There's one small thing that needs to be addressed before you merge: the CWL SDK's dependency on arvados-python-client needs its version bumped, since it uses the new fnPattern argument of uploadfiles.

A couple of implementation points I was wondering about:

  • Why is the patternsegments[0] '.' case handled inside the filename iteration loop? Doesn't this mean that each time a collection has a non-matching filename, matches for the remaining segments will be added to ret? Or is that intended behavior for some reason I'm missing?
  • The way ArvCwlRunner.uploaded gets accessed directly by multiple other instances seems a little dangerous. Would it be a little safer, and prevent future trouble, if ArvCwlRunner grew two methods? One to record a new uploaded file, and another to return a new dictionary with all those records so far. I think that would make it much easier for future developers to do the right thing.

A few style points to take or leave as you see fit:

  • if len(patternsegments) 0 - PEP 8 prefers if not patternsegments.
  • I learned this myself recently: all the standard logger methods (.error, .warning, etc.) take an optional exc_info keyword. If that's truthy, the current exception's full traceback will be included in the log. Maybe that would be too verbose for your use case, but I wanted to let you know about it if it would be helpful for lines like logger.error("Got error %s" % str(e)). (BTW, calling str after using %s is redundant.)
  • Just FYI, dictionaries have a built-in copy() method. You don't need to pull in a whole module just to do that.

Thanks.

#9 Updated by Peter Amstutz almost 6 years ago

Brett Smith wrote:

Reviewing 2411624, and this is good. There's one small thing that needs to be addressed before you merge: the CWL SDK's dependency on arvados-python-client needs its version bumped, since it uses the new fnPattern argument of uploadfiles.

Fixed.

A couple of implementation points I was wondering about:

  • Why is the patternsegments[0] '.' case handled inside the filename iteration loop? Doesn't this mean that each time a collection has a non-matching filename, matches for the remaining segments will be added to ret? Or is that intended behavior for some reason I'm missing?

That's to enable a pattern like "./*.py" to match "foo.py", the pattern is split on '/' so that becomes a pattern ['.', '*.py'] matched againsted "foo.py" which means it tries to match "." against "foo.py" and fails. Now it shifts past '.' as essentially a no-op.

  • The way ArvCwlRunner.uploaded gets accessed directly by multiple other instances seems a little dangerous. Would it be a little safer, and prevent future trouble, if ArvCwlRunner grew two methods? One to record a new uploaded file, and another to return a new dictionary with all those records so far. I think that would make it much easier for future developers to do the right thing.

Done.

A few style points to take or leave as you see fit:

  • if len(patternsegments) 0 - PEP 8 prefers if not patternsegments.

Fixed.

  • I learned this myself recently: all the standard logger methods (.error, .warning, etc.) take an optional exc_info keyword. If that's truthy, the current exception's full traceback will be included in the log. Maybe that would be too verbose for your use case, but I wanted to let you know about it if it would be helpful for lines like logger.error("Got error %s" % str(e)). (BTW, calling str after using %s is redundant.)

Thanks, I have already used that in other places but not here.

  • Just FYI, dictionaries have a built-in copy() method. You don't need to pull in a whole module just to do that.

Fixed, thanks.

Now at 4c2e89e

I'm going to interpret your comment that it's okay to go ahead and merge so that it is at available for the CWL workshop & codefest.

#10 Updated by Peter Amstutz almost 6 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 25 to 100

Applied in changeset arvados|commit:6207681aa301ad12d164aab24f52a210945af04b.

#11 Updated by Peter Amstutz almost 6 years ago

  • Status changed from Resolved to In Progress

#12 Updated by Ward Vandewege almost 6 years ago

Reviewing 7593-arvados-cwl-runner on arvados-dev:

Just one question: this doesn't build a .deb for arvados-cwl-runner. Is that on purpose?

The python packaging parts LGTM.

#13 Updated by Peter Amstutz almost 6 years ago

Ward Vandewege wrote:

Reviewing 7593-arvados-cwl-runner on arvados-dev:

Just one question: this doesn't build a .deb for arvados-cwl-runner. Is that on purpose?

The future is still a little fuzzy on how this eventually gets deployed to production, so I don't know if we need OS packages yet. The goal was to get it into PyPi.

The python packaging parts LGTM.

I'll merge then, thanks.

#14 Updated by Peter Amstutz almost 6 years ago

  • Status changed from In Progress to Resolved

Also available in: Atom PDF