https://dev.arvados.org/https://dev.arvados.org/favicon.ico?15576888422016-08-02T18:28:55ZArvadosArvados - Idea #9684: [Crunch2] [API] Add Workflow resourcehttps://dev.arvados.org/issues/9684?journal_id=415532016-08-02T18:28:55ZTom Morristfmorris@veritasgenetics.com
<ul><li><strong>Target version</strong> set to <i>Arvados Future Sprints</i></li><li><strong>Story points</strong> set to <i>2.0</i></li></ul> Arvados - Idea #9684: [Crunch2] [API] Add Workflow resourcehttps://dev.arvados.org/issues/9684?journal_id=416292016-08-03T16:58:30ZBrett Smithbrett.smith@curii.com
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/41629/diff?detail_id=40327">diff</a>)</li></ul> Arvados - Idea #9684: [Crunch2] [API] Add Workflow resourcehttps://dev.arvados.org/issues/9684?journal_id=416302016-08-03T16:58:52ZBrett Smithbrett.smith@curii.com
<ul></ul><p>Updated with a little background from discussion yesterday, plus the specific names of the CWL fields we should be looking at.</p> Arvados - Idea #9684: [Crunch2] [API] Add Workflow resourcehttps://dev.arvados.org/issues/9684?journal_id=416322016-08-03T17:43:41ZPeter Amstutzpeter.amstutz@curii.com
<ul></ul><p>Suggest slightly different design:</p>
<p>Add "inputs" field which is the inputs section copied from the CWL file.<br />The "workflow" field is a keep file reference to the workflow file in the form <code>abcd+123/workflow.cwl</code></p> Arvados - Idea #9684: [Crunch2] [API] Add Workflow resourcehttps://dev.arvados.org/issues/9684?journal_id=416442016-08-03T19:07:17ZRadhika Chippadaradhika@curoverse.com
<ul><li><strong>Target version</strong> changed from <i>Arvados Future Sprints</i> to <i>2016-08-17 sprint</i></li></ul> Arvados - Idea #9684: [Crunch2] [API] Add Workflow resourcehttps://dev.arvados.org/issues/9684?journal_id=416512016-08-03T19:30:05ZRadhika Chippadaradhika@curoverse.com
<ul><li><strong>Assigned To</strong> set to <i>Radhika Chippada</i></li></ul> Arvados - Idea #9684: [Crunch2] [API] Add Workflow resourcehttps://dev.arvados.org/issues/9684?journal_id=417732016-08-08T20:03:16ZRadhika Chippadaradhika@curoverse.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>In Progress</i></li></ul> Arvados - Idea #9684: [Crunch2] [API] Add Workflow resourcehttps://dev.arvados.org/issues/9684?journal_id=418772016-08-15T14:31:16ZTom Cleggtom@curii.com
<ul></ul><p>at <a class="changeset" title="9684: add workflow resource to api server" href="https://dev.arvados.org/projects/arvados/repository/arvados/revisions/3723f697b61ce60858455473b3a5464a2da65bfb">3723f69</a> on 9684-workflows</p>
How are we handling YAML security problems?
<ul>
<li><a class="external" href="http://ruby-doc.org/stdlib-2.1.9/libdoc/yaml/rdoc/YAML.html">http://ruby-doc.org/stdlib-2.1.9/libdoc/yaml/rdoc/YAML.html</a>: "Do not use YAML to load untrusted data" </li>
<li><a class="external" href="https://github.com/dtao/safe_yaml">https://github.com/dtao/safe_yaml</a></li>
</ul>
<p>Is there some reason for the special treatment of whitespace-only strings? <code>" \t ".blank?</code> is true, but <code>" \t "</code> is not valid YAML according to YAML.load, so this exemption seems to let invalid YAML into the database. Can we allow nil, and let YAML.load decide everything else?</p>
<p>I don't think after_save is the right hook for the name/description sync -- the current code doesn't write the synchronized name/description to the database unless the caller hits "save" a second time. Suggest inserting "w.reload" after each "w.update_attributes!" in the unit test to expose this.</p>
In set_name_and_description:
<ul>
<li>Catching all exceptions and reporting them to the client isn't right. Instead, please check for error cases and return appropriate messages. (I can't tell which exceptions we're supposed to expect here...)</li>
<li><code>changes = self.changes</code> is mysterious. Why not just use <code>self.changes.include?(a)</code>?</li>
<li><code>need_save = false</code> is superfluous.</li>
<li><code>self.workflow_was.blank?</code> should be <code>.nil?</code></li>
<li>I suppose it works as <code>[]</code>, but I think <code>old_wf = {}</code> would be a bit clearer, given that we expect the YAML to express a hash when it is present.</li>
</ul>
<p>The "valid yaml" fixture should use some more realistic YAML, e.g., "name: foo\ndescription: bar\n"</p>
<p>Please add a unit test that creates a workflow with YAML that encodes something other than a hash, e.g., "foo" or "- foo\n- bar\n"</p> Arvados - Idea #9684: [Crunch2] [API] Add Workflow resourcehttps://dev.arvados.org/issues/9684?journal_id=419222016-08-16T03:29:00ZRadhika Chippadaradhika@curoverse.com
<ul></ul><blockquote>
<p>How are we handling YAML security problems ...</p>
</blockquote>
<p>Added 'safe_yaml' gem. According to the safe_yaml documentation, the default mode is 'safe' and hence didn't do anything else</p>
<blockquote>
<p>Is there some reason for the special treatment of whitespace-only strings? " \t ".blank? is true, but " \t " is not valid YAML according to YAML.load, so this exemption seems to let invalid YAML into the database. Can we allow nil, and let YAML.load decide everything else?</p>
</blockquote>
<p>Changed it accordingly</p>
<blockquote>
<p>I don't think after_save is the right hook for the name/description sync -- the current code doesn't write the synchronized name/description to the database unless the caller hits "save" a second time. Suggest inserting "w.reload" after each "w.update_attributes!" in the unit test to expose this.</p>
</blockquote>
<p>You are right. After verifying the test failures with w.reload, changed from after_save to before_save</p>
<blockquote>
<p>In set_name_and_description:</p>
</blockquote>
<blockquote><blockquote>
<p>Catching all exceptions and reporting them to the client isn't right. Instead, please check for error cases and return appropriate messages. (I can't tell which exceptions we're supposed to expect here...)</p>
</blockquote></blockquote>
<p>Since it appears that workflow_was yaml validation is the only possible exception in this case, updated to cache only this error</p>
<blockquote><blockquote>
<p>changes = self.changes is mysterious. Why not just use self.changes.include?(a)?</p>
</blockquote></blockquote>
<p>Updated</p>
<blockquote><blockquote>
<p>need_save = false is superfluous.</p>
</blockquote></blockquote>
<p>Yes. Removed</p>
<blockquote><blockquote>
<p>self.workflow_was.blank? should be .nil?</p>
</blockquote></blockquote>
<p>Updated</p>
<blockquote><blockquote>
<p>I suppose it works as [], but I think old_wf = {} would be a bit clearer, given that we expect the YAML to express a hash when it is present.</p>
</blockquote></blockquote>
<p>Updated</p>
<blockquote>
<p>The "valid yaml" fixture should use some more realistic YAML, e.g., "name: foo\ndescription: bar\n"</p>
</blockquote>
<p>Updated</p>
<blockquote>
<p>Please add a unit test that creates a workflow with YAML that encodes something other than a hash, e.g., "foo" or "- foo\n- bar\n"</p>
</blockquote>
<p>Added this test. Also added additional test assertions to check name and description when workflow yaml is unset.</p> Arvados - Idea #9684: [Crunch2] [API] Add Workflow resourcehttps://dev.arvados.org/issues/9684?journal_id=419712016-08-16T20:48:37ZTom Cleggtom@curii.com
<ul></ul><p>Hmm, does YAML.load actually raise ActiveRecord::RecordInvalid? Whatever it raises, the "rescue" should only cover the YAML.load line, not the whole function.</p>
<p>Also, if the old YAML is invalid we should probably just leave name and description alone and return true. Raising an error seems nice because it acknowledges the error condition, but it also makes it impossible to fix the broken item: "can't update record because something is wrong with the old version"...</p>
<p>The "is not valid yaml" error message probably shouldn't include the whole (non)YAML string -- that could be rather large. It would be nice to include the error provided by the YAML parser, though.</p>
<p>Errors should refer to the client-provided parameter name, not the name of the function doing the check. E.g.:<br /><pre><code class="diff syntaxhl"><span class="gd">-errors.add :validate_workflow, "#{...} is not valid yaml"
</span><span class="gi">+errors.add :workflow, "is not valid yaml: #{ ...error detail... }"
</span></code></pre></p> Arvados - Idea #9684: [Crunch2] [API] Add Workflow resourcehttps://dev.arvados.org/issues/9684?journal_id=419742016-08-16T21:11:11ZRadhika Chippadaradhika@curoverse.com
<ul></ul><p>Addressed those comments at <a class="changeset" title="9684: workflow yaml error logging" href="https://dev.arvados.org/projects/arvados/repository/arvados/revisions/35806ff1bb3106bcf5874f2b3956fb785acc852a">35806ff</a></p> Arvados - Idea #9684: [Crunch2] [API] Add Workflow resourcehttps://dev.arvados.org/issues/9684?journal_id=419822016-08-17T01:52:01ZRadhika Chippadaradhika@curoverse.com
<ul></ul><p>Tom: before we wrap this up, can you also please take a look at my note-6 on <a class="issue tracker-6 status-3 priority-4 priority-default closed parent" title="Idea: [Crunch2][Workbench] Extend pipeline template chooser to browse CWL workflows (Resolved)" href="https://dev.arvados.org/issues/9767#note-6">#9767-6</a> and let me know if we need to add any more info to the workflow objects?</p> Arvados - Idea #9684: [Crunch2] [API] Add Workflow resourcehttps://dev.arvados.org/issues/9684?journal_id=420712016-08-17T20:20:19ZTom Cleggtom@curii.com
<ul></ul><p>LGTM, thanks!</p> Arvados - Idea #9684: [Crunch2] [API] Add Workflow resourcehttps://dev.arvados.org/issues/9684?journal_id=420822016-08-18T11:55:06ZRadhika Chippadaradhika@curoverse.com
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Resolved</i></li></ul><p>Applied in changeset arvados|commit:2755c8e5060fc5f84c91e089b43d5019c84cd93b.</p> Arvados - Idea #9684: [Crunch2] [API] Add Workflow resourcehttps://dev.arvados.org/issues/9684?journal_id=420862016-08-18T13:49:20ZRadhika Chippadaradhika@curoverse.com
<ul></ul><p>The addition of safe_yaml gem requires the addition of an initializer to set deserialize_symbols: <a class="external" href="https://github.com/dtao/safe_yaml">https://github.com/dtao/safe_yaml</a></p>
<pre>SafeYAML::OPTIONS[:deserialize_symbols] = true</pre> Arvados - Idea #9684: [Crunch2] [API] Add Workflow resourcehttps://dev.arvados.org/issues/9684?journal_id=420962016-08-18T18:30:25ZRadhika Chippadaradhika@curoverse.com
<ul></ul><p>Deleted the initializer and instead update arvados_model -> recursive_stringify to convert ":foo" to "foo". Also, need an extra check to omit this for "::1" which was used by create_superuser_token.rb as the created_by_ip_address. (It appears that it might be desirable if recursive_stringify and has_symbols? in arvados_model.rb omitted checking a "value" of a "key" when the value is not map instead.)</p>