https://dev.arvados.org/
https://dev.arvados.org/favicon.ico?1557688842
2017-02-28T19:30:42Z
Arvados
Arvados - Bug #11168: [API] Use JSON instead of YAML for serialized fields in database
https://dev.arvados.org/issues/11168?journal_id=48770
2017-02-28T19:30:42Z
Tom Morris
tfmorris@veritasgenetics.com
<ul><li><strong>Target version</strong> set to <i>2017-03-15 sprint</i></li><li><strong>Story points</strong> set to <i>2.0</i></li></ul>
https://dev.arvados.org/issues/11168?journal_id=48777
2017-02-28T19:58:06Z
Javier Bértoli
jbertoli@curii.com
<ul><li><strong>Subject</strong> changed from <i>[API] Use JSON instead of YAML for serialized fields in database</i> to <i>[API] Use JSON instead of YAML for serialized fields in databasethis </i></li></ul><p>This sounds like a nice improvement. That PG page has some warnings that we should probably pay careful attention to:</p>
<ul>
<li>difference in usage of <strong>json/jsonb</strong>, and which one is better depending on the JSON content we want to store.</li>
<li>"...it is best to avoid mixing Unicode escapes in JSON with a non-UTF8 database encoding, if possible." China's default encoding is <strong>UTF-16</strong>, not UTF-8, so I don't know if/how that will have any impact on this change, but I think it is something to take into consideration.</li>
</ul>
https://dev.arvados.org/issues/11168?journal_id=48870
2017-03-01T20:25:48Z
Tom Clegg
tom@curii.com
<ul><li><strong>Subject</strong> changed from <i>[API] Use JSON instead of YAML for serialized fields in databasethis </i> to <i>[API] Use JSON instead of YAML for serialized fields in database</i></li></ul>
https://dev.arvados.org/issues/11168?journal_id=48900
2017-03-01T20:47:07Z
Tom Clegg
tom@curii.com
<ul><li><strong>Assigned To</strong> set to <i>Tom Clegg</i></li></ul>
https://dev.arvados.org/issues/11168?journal_id=48921
2017-03-01T21:06:41Z
Tom Clegg
tom@curii.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>In Progress</i></li></ul>
https://dev.arvados.org/issues/11168?journal_id=48922
2017-03-01T21:24:18Z
Tom Clegg
tom@curii.com
<p>11168-serialize-json @ <a class="changeset" title="11168: Change db serialize from YAML to JSON." href="https://dev.arvados.org/projects/arvados/repository/arvados/revisions/d7c84d69bb62d61bc671b2d5e0ad4ed42dbeb7c0">d7c84d69bb62d61bc671b2d5e0ad4ed42dbeb7c0</a></p>
https://dev.arvados.org/issues/11168?journal_id=48923
2017-03-01T23:00:51Z
Tom Clegg
tom@curii.com
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/48923/diff?detail_id=47115">diff</a>)</li></ul>
https://dev.arvados.org/issues/11168?journal_id=48924
2017-03-01T23:04:53Z
Tom Clegg
tom@curii.com
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/48924/diff?detail_id=47116">diff</a>)</li></ul>
https://dev.arvados.org/issues/11168?journal_id=48928
2017-03-02T15:42:00Z
Peter Amstutz
peter.amstutz@curii.com
<p>11168-serialize-json @ <a class="changeset" title="11168: Prohibit down-migration to YAML-only codebase." href="https://dev.arvados.org/projects/arvados/repository/arvados/revisions/766ddd6d958826049c2811f6d058480246e423a6">766ddd6</a></p>
<p>I notice that a couple of places have been switched to use SafeJSON, but they still have <code>require 'oj'</code>. The specific instances are eventbus.rb and websocket_test.rb. <code>eventbus.rb</code> catches <code>Oj::Error</code>, <code>websocket_test.rb</code> doesn't appear to have any remaining instances of Oj.</p>
<p>I believe the change to <code>Job.sorted_hash_digest</code> may prevent job reuse unless we check for the hashes of both the YAML and JSON serializations.</p>
<p>Should <code>deep_sort_hash</code> happen in <code>where_serialized</code> ? It doesn't particularly make sense to query the string value of a serialized hashed column without sorting it first.</p>
<p>In create_superuser_token_test, the test "existing token has limited scope":</p>
<pre>
- update_all(scopes: ["GET /"])
+ update_all(scopes: SafeJSON.dump(["GET /"]))
</pre>
<p>Why/how did this work before, and why does it need to be manually serialized now?</p>
<p>I did some manual verification:</p>
<ol>
<li>Looked at "properties" column of "logs" table in psql</li>
<li>The earliest log item was previously serialized to postgres as YAML</li>
<li>The latest log item is serialized to postgres as JSON</li>
<li>Both earliest and latest log records can be accessed via API and are reported as JSON.</li>
</ol>
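<p>The behaviour verified above (old rows stored as YAML, new rows stored as JSON, both readable) can be sketched roughly as follows. This is a minimal illustration, not the code on the branch: <code>load_serialized</code> is a hypothetical helper name, and it assumes legacy rows start with the YAML document marker <code>---</code> and contain only plain strings, numbers, arrays and hashes.</p>

```ruby
require 'json'
require 'yaml'

# Hypothetical helper (not the Arvados implementation): pick a decoder
# based on the first characters of the stored string.
def load_serialized(s)
  return nil if s.nil?
  if s.start_with?('{', '[')
    # New-style row: JSON object or array.
    JSON.parse(s)
  elsif s.start_with?('---')
    # Legacy row: YAML document. safe_load refuses arbitrary Ruby objects.
    YAML.safe_load(s)
  else
    raise ArgumentError, "invalid serialized data #{s[0..5].inspect}"
  end
end

old_row = "---\nfoo: bar\n"   # shape of a value written by a YAML serializer
new_row = '{"foo":"bar"}'     # shape of a value written by a JSON serializer

old_value = load_serialized(old_row)
new_value = load_serialized(new_row)
```

<p>Both calls return the same hash, which is why an API client would not see a difference between the earliest and latest log records.</p>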
https://dev.arvados.org/issues/11168?journal_id=48946
2017-03-03T05:43:25Z
Tom Clegg
tom@curii.com
<blockquote>
<p>I notice that a couple of places have been switched to use SafeJSON, but they still have <code>require 'oj'</code>. The specific instances are eventbus.rb and websocket_test.rb. <code>eventbus.rb</code> catches <code>Oj::Error</code>, <code>websocket_test.rb</code> doesn't appear to have any remaining instances of Oj.</p>
</blockquote>
<p>Indeed. Removed import from websocket_test.rb, thanks.</p>
<blockquote>
<p>I believe the change to <code>Job.sorted_hash_digest</code> may prevent job reuse unless we check for the hashes of both the YAML and JSON serializations.</p>
</blockquote>
<p>Ah, good catch. Both old and new are JSON, but Oj.dump(h) serializes {"foo":"bar"} as {":foo":"bar"} so changing to compat mode without a migration would break reuse of jobs saved by old versions. I suppose "symbol" mode is fine for this as long as we keep doing it. Reverted.</p>
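<p>To illustrate why the key format matters for reuse: a digest is computed over the serialized string, so any change in how keys are written changes the digest. The sketch below uses hand-written stand-in strings rather than real Oj output, and MD5 is an assumption made only for the example.</p>

```ruby
require 'digest'

# Hand-written stand-ins for the two serializations described above;
# these strings are not produced by Oj here.
symbol_mode_json = '{":foo":"bar"}'
compat_mode_json = '{"foo":"bar"}'

# Assumption for illustration: the digest is an MD5 over the JSON text.
old_digest = Digest::MD5.hexdigest(symbol_mode_json)
new_digest = Digest::MD5.hexdigest(compat_mode_json)

# A job saved with old_digest would not be found by a lookup on new_digest.
digests_differ = old_digest != new_digest
```

<p>Since the two strings differ, the digests differ, so records saved under the old format would no longer match without a migration.</p>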
<blockquote>
<p>Should <code>deep_sort_hash</code> happen in <code>where_serialized</code> ? It doesn't particularly make sense to query the string value of a serialized hashed column without sorting it first.</p>
</blockquote>
<p>Yes, good point. Moved into where_serialized.</p>
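<p>A rough sketch of why sorting has to happen before the comparison: two hashes with the same content but different key order serialize to different strings. <code>deep_sort</code> below is a hypothetical stand-in written for this example, not the project's <code>deep_sort_hash</code>.</p>

```ruby
require 'json'

# Hypothetical stand-in for a deep sort: recursively order hash keys so
# that equal hashes always serialize to the same string.
def deep_sort(x)
  case x
  when Hash
    x.sort_by { |k, _| k.to_s }.map { |k, v| [k, deep_sort(v)] }.to_h
  when Array
    x.map { |v| deep_sort(v) }
  else
    x
  end
end

a = { 'b' => 1, 'a' => { 'd' => 2, 'c' => 3 } }
b = { 'a' => { 'c' => 3, 'd' => 2 }, 'b' => 1 }

unsorted_match = JSON.generate(a) == JSON.generate(b)
sorted_match   = JSON.generate(deep_sort(a)) == JSON.generate(deep_sort(b))
```

<p>Here <code>unsorted_match</code> is false and <code>sorted_match</code> is true, which is the reason a string comparison against a serialized column is only meaningful after sorting.</p>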
<blockquote>
<p>In create_superuser_token_test, the test "existing token has limited scope":</p>
<p>update_all(scopes: ["GET /"])</p>
<p>Why/how did this work before, and why does it need to be manually serialized now?</p>
</blockquote>
<p>Well, it turns out it didn't work all that well before:</p>
<pre>
======================================================================
CreateSuperUserTokenTest#test_existing_token_has_limited_scope
----------------------------------------------------------------------
ApiClientAuthorization Load (0.3ms) SELECT "api_client_authorizations".* FROM "api_client_authorizations" WHERE "api_client_authorizations"."id" = $1 LIMIT 1 [["id", 279786541]]
SQL (0.4ms) UPDATE "api_client_authorizations" SET "scopes" = 'GET /' WHERE "api_client_authorizations"."user_id" = 476014017
</pre>
<p>But the purpose of this statement was just to sabotage the test fixture so actual≠desired scopes, and invalid≠desired, so the test passed.</p>
<p>Now it does what it looks like it does:</p>
<pre>
SQL (0.4ms) UPDATE "api_client_authorizations" SET "scopes" = '["GET /"]' WHERE "api_client_authorizations"."user_id" = 476014017
</pre>
<p>(update_all() is just a DB query, it bypasses model logic.)</p>
<p>now at <a class="changeset" title="11168: Always deep-sort before comparing in where_serialized." href="https://dev.arvados.org/projects/arvados/repository/arvados/revisions/594e00f9311da95f73843f55b6e1c7c3ad55d8df">594e00f9311da95f73843f55b6e1c7c3ad55d8df</a></p>
https://dev.arvados.org/issues/11168?journal_id=48953
2017-03-03T16:30:58Z
Peter Amstutz
peter.amstutz@curii.com
<p>LGTM @ <a class="changeset" title="11168: Always deep-sort before comparing in where_serialized." href="https://dev.arvados.org/projects/arvados/repository/arvados/revisions/594e00f9311da95f73843f55b6e1c7c3ad55d8df">594e00f</a></p>
https://dev.arvados.org/issues/11168?journal_id=48957
2017-03-03T16:45:59Z
Peter Amstutz
peter.amstutz@curii.com
<p>Hold on, now I'm having trouble starting the API server in arvbox. I don't know if something is just corrupted/confused or there's a real problem here:</p>
<pre>
2017-03-03_16:43:49.18377 Job Load (0.6ms) SELECT "jobs".* FROM "jobs" WHERE (state = 'Queued') ORDER BY priority desc, created_at
2017-03-03_16:43:49.18440 (0.5ms) SELECT COUNT(*) FROM "pipeline_instances" WHERE (state = 'RunningOnServer')
2017-03-03_16:43:49.57545 ApiClientAuthorization Load (0.7ms) SELECT "api_client_authorizations".* FROM "api_client_authorizations" WHERE (api_token='4ao313k81hkmc1812j4eo25uf5p3wlmc78edkkkpqnrhf8bwvb' and (expires_at is null or expires_at > CURRENT_TIMESTAMP)) LIMIT 1
2017-03-03_16:43:49.57617 App 4431 stderr: [ 2017-03-03 16:43:49.5759 4545/0x0055cfdc0e9b40(Worker 1) utils.rb:87 ]: *** Exception RuntimeError in Rack application object (invalid serialized data "\"[\\\"al") (process 4545, thread 0x0055cfdc0e9b40(Worker 1)):
2017-03-03_16:43:49.57619 App 4431 stderr: from /usr/src/arvados/services/api/lib/serializers.rb:32:in `load'
2017-03-03_16:43:49.57619 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activerecord-3.2.22.5/lib/active_record/attribute_methods/serialization.rb:24:in `unserialize'
2017-03-03_16:43:49.57619 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activerecord-3.2.22.5/lib/active_record/attribute_methods/serialization.rb:15:in `unserialized_value'
2017-03-03_16:43:49.57620 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activerecord-3.2.22.5/lib/active_record/attribute_methods/read.rb:84:in `__temp__'
2017-03-03_16:43:49.57620 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activerecord-3.2.22.5/lib/active_record/attribute_methods/read.rb:46:in `type_cast_attribute'
2017-03-03_16:43:49.57620 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activerecord-3.2.22.5/lib/active_record/attribute_methods/read.rb:127:in `read_attribute'
2017-03-03_16:43:49.57621 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activerecord-3.2.22.5/lib/active_record/attribute_methods.rb:185:in `block in attributes'
2017-03-03_16:43:49.57621 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activerecord-3.2.22.5/lib/active_record/attribute_methods.rb:185:in `each'
2017-03-03_16:43:49.57621 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activerecord-3.2.22.5/lib/active_record/attribute_methods.rb:185:in `attributes'
2017-03-03_16:43:49.57621 App 4431 stderr: from /usr/src/arvados/services/api/app/models/arvados_model.rb:498:in `block in convert_serialized_symbols_to_strings'
2017-03-03_16:43:49.57622 App 4431 stderr: from /usr/src/arvados/services/api/app/models/arvados_model.rb:497:in `each'
2017-03-03_16:43:49.57622 App 4431 stderr: from /usr/src/arvados/services/api/app/models/arvados_model.rb:497:in `convert_serialized_symbols_to_strings'
2017-03-03_16:43:49.57622 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activesupport-3.2.22.5/lib/active_support/callbacks.rb:405:in `_run__1338818612019012701__find__4105550916150344441__callbacks'
2017-03-03_16:43:49.57623 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activesupport-3.2.22.5/lib/active_support/callbacks.rb:405:in `__run_callback'
2017-03-03_16:43:49.57623 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activesupport-3.2.22.5/lib/active_support/callbacks.rb:385:in `_run_find_callbacks'
2017-03-03_16:43:49.57623 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activesupport-3.2.22.5/lib/active_support/callbacks.rb:81:in `run_callbacks'
2017-03-03_16:43:49.57623 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activerecord-3.2.22.5/lib/active_record/base.rb:523:in `init_with'
2017-03-03_16:43:49.57624 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activerecord-3.2.22.5/lib/active_record/inheritance.rb:68:in `instantiate'
2017-03-03_16:43:49.57624 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activerecord-3.2.22.5/lib/active_record/querying.rb:38:in `block (2 levels) in find_by_sql'
2017-03-03_16:43:49.57624 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activerecord-3.2.22.5/lib/active_record/querying.rb:38:in `collect!'
2017-03-03_16:43:49.57625 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activerecord-3.2.22.5/lib/active_record/querying.rb:38:in `block in find_by_sql'
2017-03-03_16:43:49.57626 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activerecord-3.2.22.5/lib/active_record/explain.rb:41:in `logging_query_plan'
2017-03-03_16:43:49.57626 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activerecord-3.2.22.5/lib/active_record/querying.rb:37:in `find_by_sql'
2017-03-03_16:43:49.57626 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activerecord-3.2.22.5/lib/active_record/relation.rb:171:in `exec_queries'
2017-03-03_16:43:49.57626 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activerecord-3.2.22.5/lib/active_record/relation.rb:160:in `block in to_a'
2017-03-03_16:43:49.57626 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activerecord-3.2.22.5/lib/active_record/explain.rb:41:in `logging_query_plan'
2017-03-03_16:43:49.57627 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activerecord-3.2.22.5/lib/active_record/relation.rb:159:in `to_a'
2017-03-03_16:43:49.57627 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activerecord-3.2.22.5/lib/active_record/relation/finder_methods.rb:381:in `find_first'
2017-03-03_16:43:49.57627 App 4431 stderr: from /var/lib/gems/ruby/2.1.0/gems/activerecord-3.2.22.5/lib/active_record/relation/finder_methods.rb:122:in `first'
2017-03-03_16:43:49.57628 App 4431 stderr: from /usr/src/arvados/services/api/app/middlewares/arvados_api_token.rb:39:in `call'
</pre>
https://dev.arvados.org/issues/11168?journal_id=48958
2017-03-03T16:59:15Z
Peter Amstutz
peter.amstutz@curii.com
<pre>
arvados_development=# SELECT "api_client_authorizations".scopes FROM "api_client_authorizations" WHERE (api_token='4ao313k81hkmc1812j4eo25uf5p3wlmc78edkkkpqnrhf8bwvb' and (expires_at is null or expires_at > CURRENT_TIMESTAMP)) LIMIT 1;
scopes
-------------
"[\"all\"]"
(1 row)
</pre>
<p>Hmm?</p>
https://dev.arvados.org/issues/11168?journal_id=48959
2017-03-03T17:02:24Z
Peter Amstutz
peter.amstutz@curii.com
<pre>
arvados_development=# SELECT id, scopes FROM "api_client_authorizations";
id | scopes
----+-------------
1 | "[\"all\"]"
2 | ["all"]
</pre>
https://dev.arvados.org/issues/11168?journal_id=48960
2017-03-03T18:22:17Z
Peter Amstutz
peter.amstutz@curii.com
<pre>
id | api_token | api_client_id | user_id | created_by_ip_address | last_used_by_ip_address | last_used_at | expires_at | created_at | updated_at | default_owner_uuid | scopes | uuid
----+----------------------------------------------------+---------------+---------+-----------------------+-------------------------+----------------------------+------------+----------------------------+----------------------------+--------------------+-------------+-----------------------------
1 | 4ao313k81hkmc1812j4eo25uf5p3wlmc78edkkkpqnrhf8bwvb | 1 | 1 | ::1 | 192.168.5.3 | 2017-03-03 16:43:31.951122 | | 2017-02-24 16:08:41.941389 | 2017-03-03 16:43:32.971805 | | "[\"all\"]" | <a href="https://arvadosapi.com/34t0i-gj3su-28nah1lzkhpalqb">34t0i-gj3su-28nah1lzkhpalqb</a>
2 | mdvgi7g61ec6gshnwlhi92icny27map7k5o3nghnz18j82l92 | 2 | 3 | 192.168.5.1 | 192.168.5.2 | 2017-03-02 15:27:32.650515 | | 2017-03-02 15:21:51.147115 | 2017-03-02 15:27:32.651099 | | ["all"] | <a href="https://arvadosapi.com/34t0i-gj3su-u4yxlmib581etwd">34t0i-gj3su-u4yxlmib581etwd</a>
(2 rows)
</pre>
https://dev.arvados.org/issues/11168?journal_id=48963
2017-03-03T21:23:34Z
Tom Clegg
tom@curii.com
<p><a class="changeset" title="11168: Double-decode serialized fields if database was mangled by downgraded API server." href="https://dev.arvados.org/projects/arvados/repository/arvados/revisions/07e4083ea451913b988d77e8e4c926da8ad844a4">07e4083ea451913b988d77e8e4c926da8ad844a4</a></p>
<p>"Double-decode serialized fields if database was mangled by downgraded API server."</p>
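<p>The idea of double-decoding can be sketched as follows. This is an illustration under assumptions, not the contents of the changeset: <code>load_maybe_double_encoded</code> is a hypothetical name, and the sketch relies on a JSON library that accepts a bare string as a top-level value (the <code>json</code> gem 2.x bundled with current Ruby does).</p>

```ruby
require 'json'

# Hypothetical helper: decode once; if the result is itself a String that
# looks like a JSON array or object, it was encoded twice, so decode again.
def load_maybe_double_encoded(s)
  val = JSON.parse(s)
  val = JSON.parse(val) if val.is_a?(String) && val.start_with?('{', '[')
  val
end

mangled = '"[\"all\"]"'   # same shape as the scopes value in row 1 above
clean   = '["all"]'       # same shape as the scopes value in row 2 above

from_mangled = load_maybe_double_encoded(mangled)
from_clean   = load_maybe_double_encoded(clean)
```

<p>Both rows then decode to the same array, so a value that was JSON-encoded a second time by a downgraded server stays readable.</p>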
https://dev.arvados.org/issues/11168?journal_id=48966
2017-03-03T21:50:05Z
Tom Clegg
tom@curii.com
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Resolved</i></li></ul><p>Applied in changeset arvados|commit:660a6143ecf1e777f33bd84183ba9e821e1d7a8e.</p>