Bug #11547

[API] Excessive memory and time consumed by API and dispatch when container fields (like "mounts") are large

Added by Peter Amstutz over 1 year ago. Updated 5 months ago.

Status:
New
Priority:
Normal
Assigned To:
-
Category:
-
Target version:
Start date:
Due date:
% Done:

0%

Estimated time:
Story points:
-

Description

arvados-cwl-runner created a container request that was about 75 MB of JSON. This caused huge memory bloat in the API server, paralyzed requests to both the containers and container_requests table (details on #11546), and resulted in memory bloat of other services as well, such as crunch-dispatch-slurm:

$ ps axv | grep crunch-dispatch-slurm
  PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
43366 ?        Sl    34:03   2248     0 3676584 2918600 10.1 /usr/bin/crunch-dispatch-slurm

crunch-dispatch-slurm does not use the "mounts" field, so it should not load it (or any other potentially large field it doesn't use).

The API server should use a more efficient procedure to achieve atomic "lock" and "unlock" (currently this process involves an unnecessary ActiveRecord load/save cycle).

If the API is still very slow after these changes, determine what makes it slow, and add some limits with suitable defaults.

In any case, arvados-cwl-runner should not produce huge container requests.

current (27 KB) current Peter Amstutz, 04/25/2017 08:38 PM
huge.py (525 Bytes) huge.py Peter Amstutz, 04/27/2017 08:37 PM

Related issues

Related to Arvados - Bug #11548: workbench needs to handle API server timeouts betterNew

Related to Arvados - Bug #11546: Expensive container lock/unlockResolved2017-05-17

Related to Arvados - Bug #11549: [CWL] Avoid mounting individual filesResolved2017-04-27

History

#1 Updated by Peter Amstutz over 1 year ago

#2 Updated by Ward Vandewege over 1 year ago

Stopping and starting:

2017-04-25_20:41:22.96814 2017/04/25 20:41:22 Error running "/usr/bin/squeue" ["squeue" "--all" "--format=%j"]: fork/exec /usr/bin/squeue: cannot allocate memory "" 
2017-04-25_20:41:23.94909 2017/04/25 20:41:23 Error running "/usr/bin/squeue" ["squeue" "--all" "--format=%j"]: fork/exec /usr/bin/squeue: cannot allocate memory "" 
2017-04-25_20:41:28.64481 2017/04/25 20:41:28 Error getting list of containers: "unexpected EOF" 
2017-04-25_20:45:31.94756 2017/04/25 20:45:31 Error running "/usr/bin/squeue" ["squeue" "--all" "--format=%j"]: fork/exec /usr/bin/squeue: cannot allocate memory "" 
2017-04-25_20:45:33.94237 2017/04/25 20:45:33 Error running "/usr/bin/squeue" ["squeue" "--all" "--format=%j"]: fork/exec /usr/bin/squeue: cannot allocate memory "" 
2017-04-25_20:45:34.94765 2017/04/25 20:45:34 Error running "/usr/bin/squeue" ["squeue" "--all" "--format=%j"]: fork/exec /usr/bin/squeue: cannot allocate memory "" 
2017-04-25_20:45:37.96405 2017/04/25 20:45:37 Error running "/usr/bin/squeue" ["squeue" "--all" "--format=%j"]: fork/exec /usr/bin/squeue: cannot allocate memory "" 
2017-04-25_20:45:40.97709 2017/04/25 20:45:40 Submitting container qr1hi-dz642-itcctbtq2z1lb8t to slurm
2017-04-25_20:45:40.97732 2017/04/25 20:45:40 exec sbatch ["sbatch" "--share" "--workdir=/tmp" "--job-name=qr1hi-dz642-itcctbtq2z1lb8t" "--mem-per-cpu=3750" "--cpus-per-task=16"]
2017-04-25_20:46:09.03406 Stopping crunch-dispatch-slurm
2017-04-25_20:47:37.27458 Starting crunch-dispatch-slurm from /etc/sv/crunch-dispatch-slurm
2017-04-25_20:47:38.40428 2017/04/25 20:47:38 Start monitoring container qr1hi-dz642-khl9hd88hk88a7m
$ date; ps axv | grep crunch-dispatch-slurm
Tue Apr 25 20:48:36 UTC 2017
51812 ?        Sl     0:03      0  2362 860705 315600  1.0 /usr/bin/crunch-dispatch-slurm

#3 Updated by Ward Vandewege over 1 year ago

Looks like things started going haywire around here:

2017-04-25_18:14:20.89864 2017/04/25 18:14:20 sbatch succeeded: "Submitted batch job 21300" 
2017-04-25_18:14:20.89867 2017/04/25 18:14:20 Start monitoring container qr1hi-dz642-khl9hd88hk88a7m
2017-04-25_18:32:22.91051 2017/04/25 18:32:22 Submitting container qr1hi-dz642-itcctbtq2z1lb8t to slurm
2017-04-25_18:32:22.91082 2017/04/25 18:32:22 exec sbatch ["sbatch" "--share" "--workdir=/tmp" "--job-name=qr1hi-dz642-itcctbtq2z1lb8t" "--mem-per-cpu=3750" "--cpus-per-task=16"]
2017-04-25_18:33:09.22483 2017/04/25 18:33:09 debug: runner is handling updates slowly, discarded previous update for qr1hi-dz642-itcctbtq2z1lb8t
2017-04-25_18:33:53.11340 2017/04/25 18:33:53 debug: runner is handling updates slowly, discarded previous update for qr1hi-dz642-itcctbtq2z1lb8t
2017-04-25_18:34:34.72911 2017/04/25 18:34:34 debug: runner is handling updates slowly, discarded previous update for qr1hi-dz642-itcctbtq2z1lb8t
2017-04-25_18:34:49.55992 2017/04/25 18:34:49 Error submitting container qr1hi-dz642-itcctbtq2z1lb8t to slurm: sbatch ["sbatch" "--share" "--workdir=/tmp" "--job-name=qr1hi-dz642-itcctbtq2z1lb8t" "--mem-per-cpu=3750" "--cpus-per-task=16"] failed: exit status 1 (stderr: "sbatch: error: Batch job submission failed: Requested node configuration is not available\n")
2017-04-25_18:35:18.63901 2017/04/25 18:35:18 debug: runner is handling updates slowly, discarded previous update for qr1hi-dz642-itcctbtq2z1lb8t
2017-04-25_18:39:03.88534 2017/04/25 18:39:03 Error running "/usr/bin/squeue" ["squeue" "--all" "--format=%j"]: fork/exec /usr/bin/squeue: cannot allocate memory "" 
2017-04-25_18:39:04.88547 2017/04/25 18:39:04 Error running "/usr/bin/squeue" ["squeue" "--all" "--format=%j"]: fork/exec /usr/bin/squeue: cannot allocate memory "" 
2017-04-25_18:39:18.88524 2017/04/25 18:39:18 Error running "/usr/bin/squeue" ["squeue" "--all" "--format=%j"]: fork/exec /usr/bin/squeue: cannot allocate memory "" 
2017-04-25_18:39:19.88518 2017/04/25 18:39:19 Error running "/usr/bin/squeue" ["squeue" "--all" "--format=%j"]: fork/exec /usr/bin/squeue: cannot allocate memory "" 

#4 Updated by Ward Vandewege over 1 year ago

Stopped all runit services. Restarted nginx. Restarted all runit services minus crunch-dispatch-slurm.

A few hours later, workbench still timing out on the dashboard page. Other pages work fine. See also bug #11548.

Looks like we have some sort of serious memory leak in the API server now:

qr1hi:/var/www/arvados-api/shared/log# passenger-status 
Version : 5.1.1
Date    : 2017-04-26 00:03:23 +0000
Instance: pnzh97Yf (nginx/1.10.2 Phusion_Passenger/5.1.1)

----------- General information -----------
Max pool size : 128
App groups    : 1
Processes     : 10
Requests in top-level queue : 0

----------- Application groups -----------
/var/www/qr1hi.arvadosapi.com/current/public:
  App root: /var/www/qr1hi.arvadosapi.com/current
  Requests in queue: 0
  * PID: 18986   Sessions: 0       Processed: 2181    Uptime: 1h 57m 21s
    CPU: 47%     Memory  : 10278M   Last used: 2s ago
  * PID: 18995   Sessions: 0       Processed: 3694    Uptime: 1h 57m 21s
    CPU: 10%     Memory  : 260M    Last used: 14s ago
  * PID: 19445   Sessions: 0       Processed: 993     Uptime: 1h 56m 37s
    CPU: 3%      Memory  : 241M    Last used: 1m 18s a
  * PID: 48077   Sessions: 0       Processed: 534     Uptime: 1h 28m 17s
    CPU: 1%      Memory  : 221M    Last used: 1m 21s a
  * PID: 48089   Sessions: 0       Processed: 151     Uptime: 1h 28m 16s
    CPU: 0%      Memory  : 157M    Last used: 1m 21s a
  * PID: 48097   Sessions: 0       Processed: 47      Uptime: 1h 28m 16s
    CPU: 0%      Memory  : 195M    Last used: 5m 21s a
  * PID: 48639   Sessions: 0       Processed: 29      Uptime: 1h 27m 51s
    CPU: 0%      Memory  : 62M     Last used: 13m 21s 
  * PID: 49052   Sessions: 0       Processed: 6       Uptime: 1h 27m 21s
    CPU: 0%      Memory  : 45M     Last used: 19m 21s 
  * PID: 49073   Sessions: 0       Processed: 2       Uptime: 1h 27m 20s
    CPU: 0%      Memory  : 41M     Last used: 19m 21s 
  * PID: 50689   Sessions: 0       Processed: 0       Uptime: 19m 16s
    CPU: 0%      Memory  : 77M     Last used: 19m 16s ago

qr1hi:/var/www/arvados-api/shared/log# free
             total       used       free     shared    buffers     cached
Mem:      28811760   15024744   13787016     156080      46392     994344
-/+ buffers/cache:   13984008   14827752
Swap:            0          0          0

Looking further, perhaps this has to do with container requests. Postgres slow query log:

2017-04-26 00:31:52 UTC LOG:  duration: 10064.070 ms  execute <unnamed>: SELECT  "container_requests".* FROM "container_requests"  WHERE (container_requests.requesting_container_uuid='qr1hi-dz642-khl9hd88hk88a7m') ORDER BY container_requests.modified_at desc, container_requests.uuid LIMIT 1000 OFFSET 0

Okay. Not that slow, but still slower than it should be. Explain analyze.... huh that looks ok:

arvados_production=# explain analyze SELECT container_requests.* FROM container_requests  WHERE (container_requests.requesting_container_uuid='qr1hi-dz642-khl9hd88hk88a7m') ORDER BY container_requests.modified_at desc, container_requests.uuid LIMIT 1000 OFFSET 0;
                                                          QUERY PLAN                                                          
------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=23.13..23.16 rows=12 width=1507) (actual time=0.207..0.245 rows=15 loops=1)
   ->  Sort  (cost=23.13..23.16 rows=12 width=1507) (actual time=0.168..0.184 rows=15 loops=1)
         Sort Key: modified_at, uuid
         Sort Method: quicksort  Memory: 40kB
         ->  Seq Scan on container_requests  (cost=0.00..22.91 rows=12 width=1507) (actual time=0.090..0.130 rows=15 loops=1)
               Filter: ((requesting_container_uuid)::text = 'qr1hi-dz642-khl9hd88hk88a7m'::text)
               Rows Removed by Filter: 141
 Total runtime: 0.313 ms
(8 rows)
arvados_production=# select count(id) from container_requests;
 count 
-------
   156
(1 row)

How can this possibly be so slow in the production logs? Let's poke at one:

arvados_production=# SELECT container_requests.* FROM container_requests  WHERE (container_requests.requesting_container_uuid='qr1hi-dz642-khl9hd88hk88a7m') ORDER BY container_requests.modified_at desc, container_requests.uuid limit 1 \g /tmp/sample_cr.sql

# ls -laFh /tmp/sample_cr.sql
-rw-rw-r-- 1 postgres postgres 217M Apr 26 00:40 /tmp/sample_cr.sql

What the... No wonder we have sluggish Postgres queries. No wonder we see Ruby memory use explosions.

arvados_production=# select id,length(description) as description, length(properties) as properties, length(mounts) as mounts,length(runtime_constraints) as runtime_constraints, length(environment) as environment, length(command) as command, length(filters) as filters, length(scheduling_parameters) as scheduling_parameters from container_requests order by id \g /tmp/lengths

$ cat /tmp/lengths
 id  | description | properties |  mounts  | runtime_constraints | environment | command | filters | scheduling_parameters
-----+-------------+------------+----------+---------------------+-------------+---------+---------+-----------------------
   1 |             |          7 |      838 |                  57 |          42 |     425 |         |                     7
   2 |             |          7 |    68955 |                  65 |           7 |     193 |         |                     7
   3 |             |          2 |      805 |                  55 |          41 |     319 |         |                     2
   4 |             |          2 |      805 |                  55 |          41 |     319 |         |                     2
   5 |             |          2 |      805 |                  55 |          41 |     319 |         |                     2
   6 |             |          2 |      745 |                  55 |          41 |     307 |         |                     2
   7 |             |          2 |      341 |                  55 |          41 |     312 |         |                     2
   8 |             |          2 |      337 |                  55 |          41 |     122 |         |                     2
   9 |             |          2 |      337 |                  55 |          41 |     127 |         |                     2
  10 |             |          2 |      310 |                  55 |          41 |      17 |         |                     2
  11 |             |          2 |      247 |                  55 |          41 |      61 |         |                     2
  12 |             |          2 |      273 |                  55 |          41 |      61 |         |                     2
  13 |             |          2 |      244 |                  55 |          41 |      61 |         |                     2
  14 |             |          2 |      563 |                  66 |           2 |     176 |         |                     2
  15 |             |          2 |      554 |                  66 |           2 |     176 |         |                     2
  16 |             |          2 |      556 |                  66 |           2 |     176 |         |                     2
...
 120 |             |          2 |     1314 |                  55 |          41 |     100 |         |                     2
 121 |             |          2 |     1107 |                  55 |          41 |      43 |         |                     2
 122 |             |          2 |      301 |                  55 |          41 |      88 |         |                     2
 123 |             |          2 |      318 |                  55 |          41 |      22 |         |                     2
 124 |             |          2 |      246 |                  55 |          41 |      35 |         |                     2
 125 |             |          2 |      774 |                  55 |          41 |      90 |         |                     2
 126 |             |          2 |       91 |                  55 |          41 |      12 |         |                     2
 127 |             |          2 |       33 |                  55 |          41 |     128 |         |                     2
 128 |             |          2 |       33 |                  55 |          41 |     128 |         |                     2
 129 |             |          2 |      874 |                  66 |           2 |     176 |         |                     2
 130 |             |          2 |       95 |                  55 |          41 |      12 |         |                     2
 131 |             |          2 |      314 |                  55 |          41 |      25 |         |                     2
 132 |             |          2 |     1249 |                  55 |          41 |     169 |         |                     2
 133 |             |          2 |      829 |                  66 |           2 |     176 |         |                     2
 134 |             |          2 |      550 |                  66 |           2 |     176 |         |                     2
 135 |             |          2 |       86 |                  55 |          69 |      35 |         |                     2
 136 |             |          2 |      869 |                  66 |           2 |     176 |         |                     2
 137 |             |         22 |      556 |                  66 |           2 |     176 |         |                     2
 138 |             |          2 |    88634 |                  39 |           2 |     176 |         |                     2
 139 |             |          2 |    88601 |                  39 |           2 |     176 |         |                     2
 140 |             |          2 |    83766 |                  39 |           2 |     176 |         |                     2
 141 |             |          2 |    83766 |                  39 |           2 |     176 |         |                     2
 142 |             |          2 | 75637943 |                  59 |          41 |     256 |         |                     2
 143 |             |          2 | 75637943 |                  59 |          41 |     256 |         |                     2
 144 |             |          2 | 75637943 |                  59 |          41 |     256 |         |                     2
 145 |             |          2 | 75637943 |                  59 |          41 |     256 |         |                     2
 146 |             |          2 | 75637943 |                  59 |          41 |     256 |         |                     2
 147 |             |          2 | 75637943 |                  59 |          41 |     256 |         |                     2
 148 |             |          2 | 75637943 |                  59 |          41 |     256 |         |                     2
 149 |             |          2 | 75637943 |                  59 |          41 |     256 |         |                     2
 150 |             |          2 | 75637943 |                  59 |          41 |     256 |         |                     2
 151 |             |          2 | 75637943 |                  59 |          41 |     256 |         |                     2
 154 |             |          2 | 75637943 |                  59 |          41 |     256 |         |                     2
 155 |             |          2 | 75637943 |                  59 |          41 |     256 |         |                     2
 156 |             |          2 | 75637943 |                  59 |          41 |     256 |         |                     2
 157 |             |          2 | 75637943 |                  59 |          41 |     256 |         |                     2
 158 |             |          2 | 75637943 |                  59 |          41 |     256 |         |                     2
(156 rows)

#5 Updated by Ward Vandewege over 1 year ago

I deleted the offending container requests. Arv threw lots of errors at me but eventually did the job, and now things are A-OK again. Workbench fully functional.

wardv@shell.qr1hi:~$ cat del-cr.sh 
arv container_request delete --uuid qr1hi-xvhdp-amnlx2o1ij72384 
arv container_request delete --uuid qr1hi-xvhdp-0v9iioehkktegq7
arv container_request delete --uuid qr1hi-xvhdp-gdoro474x6mlzf0
arv container_request delete --uuid qr1hi-xvhdp-ggar6xs8wdq6z5f
arv container_request delete --uuid qr1hi-xvhdp-s77gna8fborksau
arv container_request delete --uuid qr1hi-xvhdp-d6e0tw34awdey0n
arv container_request delete --uuid qr1hi-xvhdp-mx9r8k3kqlnputi
arv container_request delete --uuid qr1hi-xvhdp-tddgn6gtbnacvtu
arv container_request delete --uuid qr1hi-xvhdp-uogsswwgfkk03b8
arv container_request delete --uuid qr1hi-xvhdp-sp0tiax8g97zssv
arv container_request delete --uuid qr1hi-xvhdp-05cdb5pu829hjki
arv container_request delete --uuid qr1hi-xvhdp-5x4ub0qf9plzira
arv container_request delete --uuid qr1hi-xvhdp-7h7093oksr65sum
arv container_request delete --uuid qr1hi-xvhdp-gfzayezhzrrj45l
arv container_request delete --uuid qr1hi-xvhdp-99farw9kcikc38s
wardv@shell.qr1hi:~$ sh del-cr.sh 
Error: Path not found
Error: Path not found
Error: Path not found
/usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:158:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:136:in `readuntil'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:146:in `readline'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:40:in `read_status_line'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:29:in `read_new'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1437:in `block in transport_request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `catch'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `transport_request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1407:in `request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1400:in `block in request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:853:in `start'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1398:in `request'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:80:in `perform_request'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:38:in `block in call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:85:in `with_net_http_connection'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:33:in `call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/response.rb:8:in `call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client/request.rb:163:in `send'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:648:in `block (2 levels) in execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:645:in `block in execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:636:in `execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:679:in `execute'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:71:in `execute'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:709:in `<top (required)>'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `load'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `<main>'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `eval'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `<main>'

/usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:158:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:136:in `readuntil'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:146:in `readline'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:40:in `read_status_line'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:29:in `read_new'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1437:in `block in transport_request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `catch'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `transport_request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1407:in `request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1400:in `block in request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:853:in `start'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1398:in `request'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:80:in `perform_request'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:38:in `block in call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:85:in `with_net_http_connection'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:33:in `call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/response.rb:8:in `call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client/request.rb:163:in `send'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:648:in `block (2 levels) in execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:645:in `block in execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:636:in `execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:679:in `execute'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:71:in `execute'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:709:in `<top (required)>'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `load'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `<main>'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `eval'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `<main>'
Error: Path not found

Error: Path not found

/usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:158:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:136:in `readuntil'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:146:in `readline'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:40:in `read_status_line'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:29:in `read_new'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1437:in `block in transport_request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `catch'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `transport_request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1407:in `request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1400:in `block in request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:853:in `start'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1398:in `request'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:80:in `perform_request'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:38:in `block in call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:85:in `with_net_http_connection'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:33:in `call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/response.rb:8:in `call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client/request.rb:163:in `send'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:648:in `block (2 levels) in execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:645:in `block in execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:636:in `execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:679:in `execute'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:71:in `execute'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:709:in `<top (required)>'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `load'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `<main>'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `eval'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `<main>'

/usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:158:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:136:in `readuntil'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:146:in `readline'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:40:in `read_status_line'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:29:in `read_new'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1437:in `block in transport_request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `catch'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `transport_request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1407:in `request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1400:in `block in request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:853:in `start'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1398:in `request'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:80:in `perform_request'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:38:in `block in call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:85:in `with_net_http_connection'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:33:in `call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/response.rb:8:in `call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client/request.rb:163:in `send'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:648:in `block (2 levels) in execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:645:in `block in execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:636:in `execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:679:in `execute'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:71:in `execute'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:709:in `<top (required)>'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `load'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `<main>'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `eval'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `<main>'
Error: Path not found

/usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:158:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:136:in `readuntil'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:146:in `readline'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:40:in `read_status_line'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:29:in `read_new'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1437:in `block in transport_request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `catch'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `transport_request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1407:in `request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1400:in `block in request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:853:in `start'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1398:in `request'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:80:in `perform_request'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:38:in `block in call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:85:in `with_net_http_connection'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:33:in `call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/response.rb:8:in `call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client/request.rb:163:in `send'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:648:in `block (2 levels) in execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:645:in `block in execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:636:in `execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:679:in `execute'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:71:in `execute'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:709:in `<top (required)>'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `load'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `<main>'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `eval'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `<main>'
/usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:158:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:136:in `readuntil'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:146:in `readline'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:40:in `read_status_line'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:29:in `read_new'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1437:in `block in transport_request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `catch'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `transport_request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1407:in `request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1400:in `block in request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:853:in `start'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1398:in `request'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:80:in `perform_request'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:38:in `block in call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:85:in `with_net_http_connection'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:33:in `call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/response.rb:8:in `call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client/request.rb:163:in `send'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:648:in `block (2 levels) in execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:645:in `block in execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:636:in `execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:679:in `execute'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:71:in `execute'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:709:in `<top (required)>'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `load'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `<main>'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `eval'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `<main>'
/usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:158:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:136:in `readuntil'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:146:in `readline'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:40:in `read_status_line'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:29:in `read_new'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1437:in `block in transport_request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `catch'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `transport_request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1407:in `request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1400:in `block in request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:853:in `start'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1398:in `request'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:80:in `perform_request'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:38:in `block in call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:85:in `with_net_http_connection'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:33:in `call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/response.rb:8:in `call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client/request.rb:163:in `send'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:648:in `block (2 levels) in execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:645:in `block in execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:636:in `execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:679:in `execute'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:71:in `execute'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:709:in `<top (required)>'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `load'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `<main>'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `eval'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `<main>'
/usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:158:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:136:in `readuntil'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:146:in `readline'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:40:in `read_status_line'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:29:in `read_new'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1437:in `block in transport_request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `catch'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `transport_request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1407:in `request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1400:in `block in request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:853:in `start'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1398:in `request'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:80:in `perform_request'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:38:in `block in call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:85:in `with_net_http_connection'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:33:in `call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/response.rb:8:in `call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client/request.rb:163:in `send'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:648:in `block (2 levels) in execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:645:in `block in execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:636:in `execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:679:in `execute'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:71:in `execute'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:709:in `<top (required)>'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `load'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `<main>'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `eval'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `<main>'
/usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:158:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:136:in `readuntil'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:146:in `readline'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:40:in `read_status_line'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:29:in `read_new'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1437:in `block in transport_request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `catch'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `transport_request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1407:in `request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1400:in `block in request'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:853:in `start'
    from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1398:in `request'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:80:in `perform_request'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:38:in `block in call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:85:in `with_net_http_connection'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:33:in `call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/response.rb:8:in `call'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client/request.rb:163:in `send'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:648:in `block (2 levels) in execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:645:in `block in execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:636:in `execute!'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:679:in `execute'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:71:in `execute'
    from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:709:in `<top (required)>'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `load'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `<main>'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `eval'
    from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `<main>'

#6 Updated by Peter Amstutz over 1 year ago

  • Subject changed from crunch-dispatch-slurm using huge ram to [API] Reject huge serialized fields
  • Description updated (diff)

#7 Updated by Peter Amstutz over 1 year ago

  • Description updated (diff)

#8 Updated by Peter Amstutz over 1 year ago

  • Description updated (diff)

#9 Updated by Tom Clegg over 1 year ago

AFAIK apiserver regularly handles >75 MB serialized "properties" fields in log records, so I'm not convinced that the size of the serialized data is the real problem here.

#10 Updated by Peter Amstutz over 1 year ago

Suggested test case to avoid container-specific issues:

Try creating a collection record with a complex "properties" field (e.g. 100,000 dict entries which are themselves objects) and see how that affects the request profile.

#11 Updated by Tom Clegg over 1 year ago

Peter Amstutz wrote:

Try creating a collection record with a complex "properties" field (e.g. 100,000 dict entries which are themselves objects) and see how that affects the request profile.

Perhaps add to source:sdk/python/tests/test_benchmark_collections.py

#12 Updated by Tom Clegg over 1 year ago

  • Subject changed from [API] Reject huge serialized fields to [API] Excessive memory and time consumed by API and dispatch when container fields (like "mounts") are large

#13 Updated by Tom Clegg over 1 year ago

  • Description updated (diff)

#14 Updated by Peter Amstutz over 1 year ago

First time is time to create record, second time is time to fetch record.

Properties consisting of one 10 MB string:

$ python huge.py 
payload about 10088844
14.6757330894
34t0i-7a9it-ko3a9mc8w2k7xrq
1.81077194214

Properties consisting of about 8 MB of JSON, as a dict with 100,000 objects:

$ python huge.py 
payload about 8088854
54.548582077
34t0i-7a9it-b6t3vd2x5xqi8gc
19.4000740051

So there's about a 10x performance difference between a single huge string, and a complex serialized record. (Likely also a memory usage difference on API server, but I didn't benchmarked that). This is consistent with the performance issues seen with the huge "mounts" entry.

#15 Updated by Tom Clegg over 1 year ago

Indeed. Given that result, I don't think a column size limit seems like a good strategy for avoiding slowness.

Seems like the priorities should be
  1. Fix #11546 so lock/unlock don't block other queries happening at the same time
  2. Avoid sending the mounts field (and other fields not used by dispatch) in lock/unlock responses (perhaps using the "select" parameter)
  3. Avoid loading the mounts field (and other fields not used by dispatch) when dispatch uses get/index APIs

With those changes, we'd just have the unavoidable costs of the client submitting the large mounts hash and crunch-run retrieving it on the worker node, instead of an explosion of unnecessary serialize/transfer/deserialize sequences.

#16 Updated by Peter Amstutz over 1 year ago

Instead of setting a column size limit, we can scan the JSON serialization to get an object count:

def count_js_objects n
  objcount = 0
  instr = false
  skip = false

  n.each_char do |c|
    if skip
      skip = false
    else
      case c
      when "{", "[" 
        objcount += 1
      when '\\'
        skip = true
      when '"'
        if instr
          objcount += 1
        end
        instr = !instr
      end
    end
  end
  objcount
end

#17 Updated by Tom Morris over 1 year ago

  • Target version set to Arvados Future Sprints

#18 Updated by Joshua Randall 6 months ago

It looks like we are currently suffering this very same issue. Some of our container requests are very large, and our api server instance recently started crashing repeatedly due to memory exhaustion.

We initially tried lowering the number of workers, but there seems to be no upper limit to the amount of memory a single worker can take, as they just get larger and larger until the OOM is invoked.

Our workaround is to run a cron job once per minute that kills "fat passengers":

root@arvados-master-eglyx:/# cat /etc/cron.d/arvados_api_fat_passenger_killer
* * * * * root /var/www/arvados-api/current/script/arvados_api_fat_passenger_killer.sh
root@arvados-master-eglyx:/# cat /var/www/arvados-api/current/script/arvados_api_fat_passenger_killer.sh
#!/bin/bash

export PASSENGER_INSTANCE_REGISTRY_DIR=/var/run/nginx/passenger

date=$(date)
echo "arvados_api_fat_passenger_killer running at ${date}" >> /var/log/arvados_api_fat_passenger_killer.log
fat_passenger_pids=$(/usr/sbin/passenger-memory-stats | grep 'Passenger AppPreloader: /var/www/arvados-api/current' | grep -v grep | awk '$4>5000 {print $4, $1}' | sort -rg | awk '{print $2}') 2> /dev/null
for fat_passenger_pid in ${fat_passenger_pids}; do
    echo "killing ${fat_passenger_pid}" 
    kill -SIGUSR1 ${fat_passenger_pid}
done >> /var/log/arvados_api_fat_passenger_killer.log

#19 Updated by Peter Amstutz 6 months ago

Are the container requests large because they have a very large workflow, or because they have a very large number of keep mounts?

#20 Updated by Joshua Randall 6 months ago

We have both kinds. For example, our a-c-r runner containers have large workflow inputs that is mostly in cwl.input.json and can take ~45s to retrieve:

mercury@arvados-shell-eglyx:~$ ls -l eglyx-xvhdp-d52pwdy7flpucnm.json
-rw-r--r-- 1 mercury hgi 21758914 Jun 12 13:56 eglyx-xvhdp-d52pwdy7flpucnm.json
mercury@arvados-shell-eglyx:~$ for mkey in $(cat eglyx-xvhdp-d52pwdy7flpucnm.json | jq -r '.mounts | keys | join(" ")'); do echo ${mkey}; (cat eglyx-xvhdp-d52pwdy7flpucnm.json | jq '.mounts."'${mkey}'"' | wc); done
/var/lib/cwl/cwl.input.json
 410322  673941 23385057
/var/lib/cwl/workflow.json
   3130    7849  113053
/var/spool/cwl
      4       6      47
stdout
      4       6      65

But others have a (very) large number of mounts and I cannot get you an example of one because we can rarely if ever receive them from the API server before it either times out or crashes due to memory exhaustion.

#21 Updated by Joshua Randall 6 months ago

N.B. we have updated our fat passenger killer after discovering that advice regarding doing passenger instance graceful restarts with SIGUSR1 is out of date. As of passenger 5, that no longer causes a graceful restart, and the advice appears to be to use `passenger-config detach-process` instead. This change has resolved some bad gateway errors we were seeing with the previous approach:

# cat /var/www/arvados-api/current/script/arvados_api_fat_passenger_killer.sh
#!/bin/bash

export PASSENGER_INSTANCE_REGISTRY_DIR=/var/run/nginx/passenger

date=$(date)
echo "arvados_api_fat_passenger_killer running at ${date}" >> /var/log/arvados_api_fat_passenger_killer.log
fat_passenger_pids=$(/usr/sbin/passenger-memory-stats | grep 'Passenger AppPreloader: /var/www/arvados-api/current' | grep -v grep | awk '$4>5000 {print $4, $1}' | sort -rg | awk '{print $2}') 2> /dev/null
for fat_passenger_pid in ${fat_passenger_pids}; do
    echo "detaching passenger process ${fat_passenger_pid}" 
    passenger-config detach-process ${fat_passenger_pid}
done >> /var/log/arvados_api_fat_passenger_killer.log

#22 Updated by Joshua Randall 6 months ago

Another example.

From passenger.log - I believe this was crunch attempting to update the container to say it has failed:

{"method":"PUT","path":"/arvados/v1/containers/eglyx-dz642-jrft0qq1n81158b","format":"html","controller":"Arvados::V1::ContainersController","action":"update","status":200,"duration":9006160.12,"view":0.44,"db":5749338.9,"request_id":"req-d8ly8c41pp35sa80flpc","client_ipaddr":"172.27.16.250","client_auth":"eglyx-gj3su-xo6zhx218uqho7j","params":{"container":"{\"exit_code\":1,\"log\":\"ec8db9aa40ff0b393bcdb9e65cd144a0+433\",\"output\":\"5551f5d7ec57a8aa9758d14a976e87e9+57\",\"state\":\"Complete\"}"},"@timestamp":"2018-06-13T23:17:35.509049624Z","@version":"1","message":"[200] PUT /arvados/v1/containers/eglyx-dz642-jrft0qq1n81158b (Arvados::V1::ContainersController#update)"}

If I am reading that correctly, that appears to be 95m querying the database and an additional 54m processing within the API server.

arvados_api_production=# select uuid, char_length(mounts) from containers where uuid = 'eglyx-dz642-jrft0qq1n81158b';
            uuid             | char_length
-----------------------------+-------------
 eglyx-dz642-jrft0qq1n81158b |   108682927
(1 row)

This particular one appears to be another runner container (so, with only 4 entries in mounts). The size of the entry is down to a 108.6MB cwl.input.json

#23 Updated by Tom Clegg 6 months ago

There are several queries in container/container_request code that retrieve the entire object but only need a few fields like priority, state, and requesting_container_uuid. Adding select() to a few places might go a long way.

#24 Updated by Tom Clegg 6 months ago

ArvadosModel#log_start_state spends significant time (and surely memory) doing Marshal.load(Marshal.dump(attributes)) + Marshal.load(Marshal.dump(logged_attributes)) before each update, which is especially wasteful in transactions that don't end up modifying anything (e.g., update fails) and on systems where audit logs are not needed.

KindAndEtag#etag calls attributes.inspect, which seems to be much slower than other alternatives like Marshal.dump(attributes) and Oj.dump(attributes). (One data point: changing this to Marshal.dump() reduced collections#get(9tee4-4zz18-fbfj9efjpq282cz) request duration from 6s to 5s.)

#25 Updated by Peter Amstutz 5 months ago

We recently found a fixed an issue where the version of the Oj (json marshaller) gem we were using had a bug that resulted in excessive memory usage. Perhaps is a root cause of this issue as well?

Also available in: Atom PDF