Idea #9369
closed[Documentation] Update documentation & tutorials for running CWL on Arvados
Updated by Brett Smith over 8 years ago
- Target version changed from 2016-06-22 sprint to 2016-07-06 sprint
Updated by Radhika Chippada over 8 years ago
- Target version changed from 2016-07-06 sprint to 2016-07-20 sprint
Updated by Brett Smith over 8 years ago
Reviewing 51b150b.
I pushed 3f69ccb to the branch with small copyedits throughout. Just the kind of stuff that's easier to commit than describe. Beyond that:
I don't think we should add the install instructions to the Python SDK page. I don't think readers would look for it here, because the fact that arvados-cwl-runner is written in Python should be an implementation detail (and I bet that will change in the future). If it's going to get added to a current page, I would advocate we add it to the CLI SDK install guide. No, that currently doesn't cover any Python installation, but that's a bug anyway.
arvados-cwl-runner
needs access to Docker. This isn't mentioned. This also puts us in the usual awkward position we have demoing Docker-based tools, where all the data is on qr1hi and we generally encourage people to run there, but since users don't get Docker access for security reasons they can't run the tools. I don't know how to square this circle.
In arv-copy --src cloud --dst settings 2463fa9efeb75e099685528b3b9071e0+438
, suggest changing cloud
to qr1hi
because that's how it's set up in the arv-copy documentation.
bwa-mem-template.yml
is executable but doesn't have a shebang line.
It's a small thing but we should probably change the 3d0ga
UUID prefix throughout to qr1hi
, for consistency with the rest of the documentation if nothing else.
Thanks.
Updated by Brett Smith over 8 years ago
Running the suggested cwltool
command gave me this error on shell.qr1hi:
Got workflow error Traceback (most recent call last): File "/data/scratch/brett/cwl/local/lib/python2.7/site-packages/cwltool/main.py", line 205, in single_job_executor for r in jobiter: File "/data/scratch/brett/cwl/local/lib/python2.7/site-packages/cwltool/draft2tool.py", line 290, in job j.command_line = flatten(map(builder.generate_arg, builder.bindings)) File "/data/scratch/brett/cwl/local/lib/python2.7/site-packages/cwltool/builder.py", line 162, in generate_arg value = self.do_eval(binding["do_eval"], context=value) File "/data/scratch/brett/cwl/local/lib/python2.7/site-packages/cwltool/builder.py", line 204, in do_eval timeout=self.timeout) File "/data/scratch/brett/cwl/local/lib/python2.7/site-packages/cwltool/expression.py", line 146, in do_eval timeout=timeout) File "/data/scratch/brett/cwl/local/lib/python2.7/site-packages/cwltool/sandboxjs.py", line 173, in interpolate e = execjs(scan[w[0]+1:w[1]], jslib, timeout=timeout) File "/data/scratch/brett/cwl/local/lib/python2.7/site-packages/cwltool/sandboxjs.py", line 79, in execjs raise JavascriptException(u"Long-running script killed after %s seconds.\nscript was:\n%s\n" % (timeout, fn_linenum())) JavascriptException: Long-running script killed after 20 seconds. script was: 0001 "use strict"; 0002 var inputs = { 0003 "reference": { 0004 "path": "/var/lib/cwl/job469616780_bwa-mem/19.fasta.bwt", 0005 "containerfs": true, 0006 "class": "File", 0007 "secondaryFiles": [ 0008 { 0009 "path": "/var/lib/cwl/job469616780_bwa-mem/19.fasta.ann", 0010 "containerfs": true, 0011 "class": "File" 0012 }, 0013 { 0014 "path": "/var/lib/cwl/job469616780_bwa-mem/19.fasta.amb", 0015 "containerfs": true, 0016 "class": "File" 0017 }, 0018 { 0019 "path": "/var/lib/cwl/job469616780_bwa-mem/19.fasta.pac", 0020 "containerfs": true, 0021 "class": "File" 0022 }, 0023 { 0024 "path": "/var/lib/cwl/job469616780_bwa-mem/19.fasta.sa", 0025 "containerfs": true, 0026 "class": "File" 0027 } 0028 ] 0029 }, 0030 "read_p1": { 0031 "path": "/var/lib/cwl/job469616780_bwa-mem/HWI-ST1027_129_D0THKACXX.1_1.fastq", 0032 "containerfs": true, 0033 "class": "File" 0034 }, 0035 "read_p2": { 0036 "path": "/var/lib/cwl/job469616780_bwa-mem/HWI-ST1027_129_D0THKACXX.1_2.fastq", 0037 "containerfs": true, 0038 "class": "File" 0039 }, 0040 "sample_id": "HWI-ST1027_129", 0041 "group_id": "arvados_tutorial", 0042 "PL": "illumina" 0043 }; 0044 var self = { 0045 "path": "/var/lib/cwl/job469616780_bwa-mem/19.fasta.bwt", 0046 "containerfs": true, 0047 "class": "File", 0048 "secondaryFiles": [ 0049 { 0050 "path": "/var/lib/cwl/job469616780_bwa-mem/19.fasta.ann", 0051 "containerfs": true, 0052 "class": "File" 0053 }, 0054 { 0055 "path": "/var/lib/cwl/job469616780_bwa-mem/19.fasta.amb", 0056 "containerfs": true, 0057 "class": "File" 0058 }, 0059 { 0060 "path": "/var/lib/cwl/job469616780_bwa-mem/19.fasta.pac", 0061 "containerfs": true, 0062 "class": "File" 0063 }, 0064 { 0065 "path": "/var/lib/cwl/job469616780_bwa-mem/19.fasta.sa", 0066 "containerfs": true, 0067 "class": "File" 0068 } 0069 ] 0070 }; 0071 var runtime = { 0072 "outdirSize": 1024, 0073 "ram": 1024, 0074 "tmpdirSize": 1024, 0075 "cores": 1, 0076 "tmpdir": "/tmp", 0077 "outdir": "/var/spool/cwl" 0078 }; 0079 (function(){return ((self.path.match(/(.*)\.[^.]+$/)[1]));})()
Updated by Peter Amstutz over 8 years ago
Brett Smith wrote:
Reviewing 51b150b.
I pushed 3f69ccb to the branch with small copyedits throughout. Just the kind of stuff that's easier to commit than describe. Beyond that:
I don't think we should add the install instructions to the Python SDK page. I don't think readers would look for it here, because the fact that arvados-cwl-runner is written in Python should be an implementation detail (and I bet that will change in the future). If it's going to get added to a current page, I would advocate we add it to the CLI SDK install guide. No, that currently doesn't cover any Python installation, but that's a bug anyway.
I moved the install instructions to the CLI page and also added some instructions about the Python SDK. Note that the Python SDK page also discusses mentions deb/rpm packages, but the CLI page does not.
arvados-cwl-runner
needs access to Docker. This isn't mentioned. This also puts us in the usual awkward position we have demoing Docker-based tools, where all the data is on qr1hi and we generally encourage people to run there, but since users don't get Docker access for security reasons they can't run the tools. I don't know how to square this circle.
I added a paragraph discussing it.
In
arv-copy --src cloud --dst settings 2463fa9efeb75e099685528b3b9071e0+438
, suggest changingcloud
toqr1hi
because that's how it's set up in the arv-copy documentation.
Done.
bwa-mem-template.yml
is executable but doesn't have a shebang line.
That wasn't intended to be executable, removed the x bit.
It's a small thing but we should probably change the
3d0ga
UUID prefix throughout toqr1hi
, for consistency with the rest of the documentation if nothing else.
Fixed.
JavascriptException: Long-running script killed after 20 seconds.
I added a section about what to do about that.
Updated by Brett Smith over 8 years ago
Pushed another round of commits in f17319c. Beyond that, all my comments are on the install section now.
- The CWL instructions are still on the Python SDK install page, and this feels a little incongruous to me. I get that the CWL stuff is in princple an SDK written in Python, so from that perspective it makes sense. But literally nothing else in the Python SDK documentation in this branch discusses the CWL SDK; and I don't think we're really expecting anyone to use the library as an SDK right now, just to use the included tools. As long as both of those remain true, I think it'd be more sensible from readers' perspective to leave the CWL SDK out of the Python SDK documentation completely.
- Why do the gem/pip shell prompts specify working in the Arvados source tree? (None of the shell in this section really follows the usual documentation convention for cd'ing around, but in general I'm fine to leave that as a separate issue, as long as the new stuff either follows convention correctly or just doesn't include paths like the rest of the page.)
- The CLI install instructions say
sudo -i python setup.py install
. This is dangerous because it's the kind of invocation that can cause setuptools to overwrite system packages. Per Paul Tagliamonte, who maintains a lot of Python Debian packages, "Use pip without sudo always. Don't tell people to use sudo."
The corresponding Python SDK install instructions don't use sudo. Neither should this. - The source instructions implicitly expect users to
cd
back to their home directory between installing each SDK component. It seems like it would be more user-friendly to make this more copy-paste friendly, by having commands likecd ../python
andcd ../cwl
.
Peter Amstutz wrote:
I moved the install instructions to the CLI page and also added some instructions about the Python SDK. Note that the Python SDK page also discusses mentions deb/rpm packages, but the CLI page does not.
This can be addressed separately, but is there any reason to believe this is anything besides an historical accident or oversight?
Updated by Peter Amstutz over 8 years ago
Brett Smith wrote:
Pushed another round of commits in f17319c. Beyond that, all my comments are on the install section now.
- The CWL instructions are still on the Python SDK install page, and this feels a little incongruous to me. I get that the CWL stuff is in princple an SDK written in Python, so from that perspective it makes sense. But literally nothing else in the Python SDK documentation in this branch discusses the CWL SDK; and I don't think we're really expecting anyone to use the library as an SDK right now, just to use the included tools. As long as both of those remain true, I think it'd be more sensible from readers' perspective to leave the CWL SDK out of the Python SDK documentation completely.
That was an oversight, I took out the other references but left one in. Fixed.
- Why do the gem/pip shell prompts specify working in the Arvados source tree? (None of the shell in this section really follows the usual documentation convention for cd'ing around, but in general I'm fine to leave that as a separate issue, as long as the new stuff either follows convention correctly or just doesn't include paths like the rest of the page.)
Copy and paste error, I think. Fixed.
- The CLI install instructions say
sudo -i python setup.py install
. This is dangerous because it's the kind of invocation that can cause setuptools to overwrite system packages. Per Paul Tagliamonte, who maintains a lot of Python Debian packages, "Use pip without sudo always. Don't tell people to use sudo."
I was copying the gem instructions which used sudo -i
(for consistency) but per our conversation on engineering I took it out.
The corresponding Python SDK install instructions don't use sudo. Neither should this.
- The source instructions implicitly expect users to
cd
back to their home directory between installing each SDK component. It seems like it would be more user-friendly to make this more copy-paste friendly, by having commands likecd ../python
andcd ../cwl
.
Fixed.
Peter Amstutz wrote:
I moved the install instructions to the CLI page and also added some instructions about the Python SDK. Note that the Python SDK page also discusses mentions deb/rpm packages, but the CLI page does not.
This can be addressed separately, but is there any reason to believe this is anything besides an historical accident or oversight?
No, I'm pretty sure it's just historical accident/oversight.
Updated by Brett Smith over 8 years ago
Pushed one last sentence structure copyedit at de4d6cb. That's good to merge. Thanks.
Updated by Peter Amstutz over 8 years ago
- Status changed from New to Resolved
- % Done changed from 50 to 100
Applied in changeset arvados|commit:931236cc8c9bea9c796aed23c1636abce3ee4534.