https://dev.arvados.org/https://dev.arvados.org/favicon.ico?15576888422021-09-01T15:27:54ZArvadosArvados - Bug #18076: Properly handle cached user records that don't exist anymore on federation scenarioshttps://dev.arvados.org/issues/18076?journal_id=966842021-09-01T15:27:54ZPeter Amstutzpeter.amstutz@curii.com
<ul><li><strong>Target version</strong> set to <i>2021-09-15 sprint</i></li></ul> Arvados - Bug #18076: Properly handle cached user records that don't exist anymore on federation scenarioshttps://dev.arvados.org/issues/18076?journal_id=966852021-09-01T15:31:54ZPeter Amstutzpeter.amstutz@curii.com
<ul><li><strong>Assigned To</strong> set to <i>Lucas Di Pentima</i></li></ul> Arvados - Bug #18076: Properly handle cached user records that don't exist anymore on federation scenarioshttps://dev.arvados.org/issues/18076?journal_id=967022021-09-01T16:26:35ZLucas Di Pentimalucas.dipentima@curii.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>In Progress</i></li></ul> Arvados - Bug #18076: Properly handle cached user records that don't exist anymore on federation scenarioshttps://dev.arvados.org/issues/18076?journal_id=967132021-09-02T13:43:27ZLucas Di Pentimalucas.dipentima@curii.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-1 status-3 priority-4 priority-default closed parent" href="/issues/18094">Bug #18094</a>: Remove unused Users#update_uuid endpoint</i> added</li></ul> Arvados - Bug #18076: Properly handle cached user records that don't exist anymore on federation scenarioshttps://dev.arvados.org/issues/18076?journal_id=967252021-09-02T19:26:39ZLucas Di Pentimalucas.dipentima@curii.com
<ul></ul><p>Updates at <a class="changeset" title="18076: Fixes the bug, expands the test with additional checks. If there's a stale cached user re..." href="https://dev.arvados.org/projects/arvados/repository/arvados/revisions/7aa1380f8dbca657b347df513baca513c10650ba">7aa1380</a> - branch <code>18076-stale-cached-users-handling</code><br />Test run: <a class="external" href="https://ci.arvados.org/job/developer-run-tests/2670/"<a href="https://ci.arvados.org/job/developer-run-tests/2670/">developer-run-tests: #2670 <img src="https://ci.arvados.org/buildStatus/icon?job=developer-run-tests&build=2670" alt="" /></a></a></p>
<p>If there's a stale cached user record creating the <code>username</code> field collision, set it to <code>nil</code> before retrying the update. There're two scenarios this is covering:</p>
<ol>
<li>The stale user record belongs to an existing user on LoginCluster. Another user took its username, so the new username is coming in the <code>/arvados/v1/users/batch_update</code> operation -- it's ok to have it set to <code>nil</code> temporarily.</li>
<li>The stale user record doesn't exist on LoginCluster anymore, so resetting it to <code>nil</code> isn't harmful and avoids future collisions.</li>
</ol> Arvados - Bug #18076: Properly handle cached user records that don't exist anymore on federation scenarioshttps://dev.arvados.org/issues/18076?journal_id=967282021-09-02T19:49:36ZTom Cleggtom@curii.com
<ul></ul><p>The fix looks great.</p>
Just some nits about the test:
<ul>
<li><code>Limit: math.MaxInt64</code> can also be spelled <code>Limit: -1</code> (probably easier to read in logs)</li>
<li>The test first checks whether <em>all</em> clusters use z1111 as their login cluster, but all it really needs to be effective is for z1111 and z3333 to use z1111 as login cluster. (I figure eventually we'll add another cluster to our integration test suite with a different config, and at that point it will be nice to have fewer test failures to fix.)</li>
<li><code>for userNr := range []int{0, 1}</code> is sort of misleading in that userNr is the array index, not the array element -- <code>[]int{9,3}</code> would have the same result. Idiomatic Go is to say <code>for userNr := 0; userNr < 2; userNr++</code> if you want a sequence of numbers, or <code>for _, userNr := range ...</code> if you want specific int values. This is a very nitpicky nitpick though!</li>
</ul>
<p>LGTM, thanks!</p> Arvados - Bug #18076: Properly handle cached user records that don't exist anymore on federation scenarioshttps://dev.arvados.org/issues/18076?journal_id=967302021-09-02T20:10:24ZLucas Di Pentimalucas.dipentima@curii.com
<ul></ul><p>Thanks for the suggestions! Updated at <a class="changeset" title="18076: Improves test. Arvados-DCO-1.1-Signed-off-by: Lucas Di Pentima <lucas.dipentima@curii.com>" href="https://dev.arvados.org/projects/arvados/repository/arvados/revisions/c534e74c5c83f546c65e901f9c93ae2fdfae8d08">c534e74c5</a> and merged.</p> Arvados - Bug #18076: Properly handle cached user records that don't exist anymore on federation scenarioshttps://dev.arvados.org/issues/18076?journal_id=967432021-09-03T14:43:11ZLucas Di Pentimalucas.dipentima@curii.com
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Resolved</i></li></ul><p>Applied in changeset <a class="changeset" title="Merge branch '18076-stale-cached-users-handling' into main. Closes #18076 Arvados-DCO-1.1-Signed..." href="https://dev.arvados.org/projects/arvados/repository/arvados/revisions/726f65153228eb81aa526df078f4fca6045e1779">arvados|726f65153228eb81aa526df078f4fca6045e1779</a>.</p> Arvados - Bug #18076: Properly handle cached user records that don't exist anymore on federation scenarioshttps://dev.arvados.org/issues/18076?journal_id=968002021-09-06T13:44:03ZLucas Di Pentimalucas.dipentima@curii.com
<ul><li><strong>Status</strong> changed from <i>Resolved</i> to <i>In Progress</i></li></ul> Arvados - Bug #18076: Properly handle cached user records that don't exist anymore on federation scenarioshttps://dev.arvados.org/issues/18076?journal_id=968012021-09-06T14:04:38ZLucas Di Pentimalucas.dipentima@curii.com
<ul></ul><p>Once Jenkins deployed this on ce8i5, I started to see the following:</p>
<pre>
$ arv user list
Error: error updating local user records: //railsapi.internal/arvados/v1/users/batch_update: 422 Unprocessable Entity: #<ActiveRecord::RecordNotSaved: Failed to save the record> (req-vr2et8tnawgndjcxocwb)
</pre>
<p>...and in ce8i5's logs:</p>
<pre>
[req-vr2et8tnawgndjcxocwb] cached username 'ward2' collision with user '<a href="https://arvadosapi.com/ce8i5-tpzed-249xu4osx4us5lg">ce8i5-tpzed-249xu4osx4us5lg</a>' - resetting
#<ActiveRecord::RecordNotSaved: Failed to save the record>
Error 1630934673+e96dc935: 422
{"method":"PATCH","path":"/arvados/v1/users/batch_update","format":"html","controller":"Arvados::V1::UsersController","action":"batch_update","status":422,"duration":74.54,"view":0.32,"db":22.33,"request_id":null,"client_ipaddr":"127.0.0.1","client_auth":"<a href="https://arvadosapi.com/tordo-gj3su-000000000000000">tordo-gj3su-000000000000000</a>","exception":"#\u003cActiveRecord::RecordNotSaved: Failed to save the record\u003e","exception_backtrace":"/var/www/arvados-api/shared/vendor_bundle/ruby/2.5.0/gems/activerecord-5.2.6/lib/active_record/persistence.rb:308:in `save!'\n/var/www/arvados-api/shared/vendor_bundle/ruby/2.5.0/gems/activerecord-5.2.6/lib/active_record/validations.rb:52:in `save!'\n/var/www/arvados-api/shared/vendor_bundle/ruby/2.5.0/gems/activereco[...]
</pre>
<p>From reading the Rails docs, the <code>ActiveRecord::RecordNotSaved</code> exception is raised when a <code>before_*</code> callback fails when calling <code>save!()</code>, so I thought there was some issue with resetting the user's username... and when trying to do so manually, I got:</p>
<pre>
$ arv user update -b --uuid <a href="https://arvadosapi.com/ce8i5-tpzed-249xu4osx4us5lg">ce8i5-tpzed-249xu4osx4us5lg</a> --user='{"username": null}'
Error: //railsapi.internal/arvados/v1/users/ce8i5-tpzed-249xu4osx4us5lg: 422 Unprocessable Entity: Username can't be unset when the user owns repositories (req-wt8kut96pkj91l187pjn)
</pre>
<p>So I guess this is what's triggering the problem on the users model:</p>
<pre><code class="ruby syntaxhl"> <span class="n">before_update</span> <span class="ss">:verify_repositories_empty</span><span class="p">,</span> <span class="ss">:if</span> <span class="o">=></span> <span class="no">Proc</span><span class="p">.</span><span class="nf">new</span> <span class="p">{</span>
<span class="n">username</span><span class="p">.</span><span class="nf">nil?</span> <span class="n">and</span> <span class="n">username_changed?</span>
<span class="p">}</span>
</code></pre> Arvados - Bug #18076: Properly handle cached user records that don't exist anymore on federation scenarioshttps://dev.arvados.org/issues/18076?journal_id=968022021-09-06T14:21:39ZLucas Di Pentimalucas.dipentima@curii.com
<ul></ul><p>My approach would be to set the conflicting username to something random (maybe a random suffix?) instead of <code>nil</code>, as it seems that this wouldn't cause any issues with preexisting repositories.</p> Arvados - Bug #18076: Properly handle cached user records that don't exist anymore on federation scenarioshttps://dev.arvados.org/issues/18076?journal_id=968042021-09-06T17:46:36ZTom Cleggtom@curii.com
<ul></ul><p>Ugh! Yeah, a random string/suffix sounds like a decent workaround.</p> Arvados - Bug #18076: Properly handle cached user records that don't exist anymore on federation scenarioshttps://dev.arvados.org/issues/18076?journal_id=968092021-09-06T19:47:53ZLucas Di Pentimalucas.dipentima@curii.com
<ul></ul><p>Fix at <a class="changeset" title="18076: Fixes the bug by assigning a different (and random) username. If the renamed user does ex..." href="https://dev.arvados.org/projects/arvados/repository/arvados/revisions/7edaf06056ef65fad632955bf647f6c36024c1bc">7edaf0605</a> - branch <code>18076-user-cache-with-repository</code><br />Test run: <a class="external" href="https://ci.arvados.org/job/developer-run-tests/2676/"<a href="https://ci.arvados.org/job/developer-run-tests/2676/">developer-run-tests: #2676 <img src="https://ci.arvados.org/buildStatus/icon?job=developer-run-tests&build=2676" alt="" /></a></a><br />WB1 integration test re-run: <a class="external" href="https://ci.arvados.org/job/developer-run-tests-apps-workbench-integration/2837/"<a href="https://ci.arvados.org/job/developer-run-tests-apps-workbench-integration/2837/">developer-run-tests-apps-workbench-integration: #2837 <img src="https://ci.arvados.org/buildStatus/icon?job=developer-run-tests-apps-workbench-integration&build=2837" alt="" /></a></a></p>
<ul>
<li>Sets a "<code>conflictNNNNNNN</code>" random suffix to the conflicting user's username instead of <code>nil</code>.</li>
<li>Expands the previous test.</li>
</ul> Arvados - Bug #18076: Properly handle cached user records that don't exist anymore on federation scenarioshttps://dev.arvados.org/issues/18076?journal_id=968102021-09-07T13:32:49ZTom Cleggtom@curii.com
<ul></ul><p><a class="changeset" title="18076: Fixes the bug by assigning a different (and random) username. If the renamed user does ex..." href="https://dev.arvados.org/projects/arvados/repository/arvados/revisions/7edaf06056ef65fad632955bf647f6c36024c1bc">7edaf0605</a> LGTM, thanks!</p> Arvados - Bug #18076: Properly handle cached user records that don't exist anymore on federation scenarioshttps://dev.arvados.org/issues/18076?journal_id=968162021-09-07T14:01:28ZLucas Di Pentimalucas.dipentima@curii.com
<ul><li><strong>% Done</strong> changed from <i>50</i> to <i>100</i></li><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Resolved</i></li></ul><p>Applied in changeset arvados-private:commit:arvados|f860ff683634d73edd63df32c284aba42f1bbf0c.</p> Arvados - Bug #18076: Properly handle cached user records that don't exist anymore on federation scenarioshttps://dev.arvados.org/issues/18076?journal_id=985922021-11-16T16:30:40ZPeter Amstutzpeter.amstutz@curii.com
<ul><li><strong>Release</strong> set to <i>42</i></li></ul>