Bug #11547
open[API] Excessive memory and time consumed by API and dispatch when container fields (like "mounts") are large
Description
arvados-cwl-runner created a container request that was about 75 MB of JSON. This caused huge memory bloat in the API server, paralyzed requests to both the containers and container_requests table (details on #11546), and resulted in memory bloat of other services as well, such as crunch-dispatch-slurm:
$ ps axv | grep crunch-dispatch-slurm PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND 43366 ? Sl 34:03 2248 0 3676584 2918600 10.1 /usr/bin/crunch-dispatch-slurm
crunch-dispatch-slurm does not use the "mounts" field, so it should not load it (or any other potentially large field it doesn't use).
The API server should use a more efficient procedure to achieve atomic "lock" and "unlock" (currently this process involves an unnecessary ActiveRecord load/save cycle).
If the API is still very slow after these changes, determine what makes it slow, and add some limits with suitable defaults.
In any case, arvados-cwl-runner should not produce huge container requests.
Files
Related issues
Updated by Peter Amstutz over 7 years ago
Updated by Ward Vandewege over 7 years ago
Stopping and starting:
2017-04-25_20:41:22.96814 2017/04/25 20:41:22 Error running "/usr/bin/squeue" ["squeue" "--all" "--format=%j"]: fork/exec /usr/bin/squeue: cannot allocate memory "" 2017-04-25_20:41:23.94909 2017/04/25 20:41:23 Error running "/usr/bin/squeue" ["squeue" "--all" "--format=%j"]: fork/exec /usr/bin/squeue: cannot allocate memory "" 2017-04-25_20:41:28.64481 2017/04/25 20:41:28 Error getting list of containers: "unexpected EOF" 2017-04-25_20:45:31.94756 2017/04/25 20:45:31 Error running "/usr/bin/squeue" ["squeue" "--all" "--format=%j"]: fork/exec /usr/bin/squeue: cannot allocate memory "" 2017-04-25_20:45:33.94237 2017/04/25 20:45:33 Error running "/usr/bin/squeue" ["squeue" "--all" "--format=%j"]: fork/exec /usr/bin/squeue: cannot allocate memory "" 2017-04-25_20:45:34.94765 2017/04/25 20:45:34 Error running "/usr/bin/squeue" ["squeue" "--all" "--format=%j"]: fork/exec /usr/bin/squeue: cannot allocate memory "" 2017-04-25_20:45:37.96405 2017/04/25 20:45:37 Error running "/usr/bin/squeue" ["squeue" "--all" "--format=%j"]: fork/exec /usr/bin/squeue: cannot allocate memory "" 2017-04-25_20:45:40.97709 2017/04/25 20:45:40 Submitting container qr1hi-dz642-itcctbtq2z1lb8t to slurm 2017-04-25_20:45:40.97732 2017/04/25 20:45:40 exec sbatch ["sbatch" "--share" "--workdir=/tmp" "--job-name=qr1hi-dz642-itcctbtq2z1lb8t" "--mem-per-cpu=3750" "--cpus-per-task=16"] 2017-04-25_20:46:09.03406 Stopping crunch-dispatch-slurm 2017-04-25_20:47:37.27458 Starting crunch-dispatch-slurm from /etc/sv/crunch-dispatch-slurm 2017-04-25_20:47:38.40428 2017/04/25 20:47:38 Start monitoring container qr1hi-dz642-khl9hd88hk88a7m
$ date; ps axv | grep crunch-dispatch-slurm Tue Apr 25 20:48:36 UTC 2017 51812 ? Sl 0:03 0 2362 860705 315600 1.0 /usr/bin/crunch-dispatch-slurm
Updated by Ward Vandewege over 7 years ago
Looks like things started going haywire around here:
2017-04-25_18:14:20.89864 2017/04/25 18:14:20 sbatch succeeded: "Submitted batch job 21300" 2017-04-25_18:14:20.89867 2017/04/25 18:14:20 Start monitoring container qr1hi-dz642-khl9hd88hk88a7m 2017-04-25_18:32:22.91051 2017/04/25 18:32:22 Submitting container qr1hi-dz642-itcctbtq2z1lb8t to slurm 2017-04-25_18:32:22.91082 2017/04/25 18:32:22 exec sbatch ["sbatch" "--share" "--workdir=/tmp" "--job-name=qr1hi-dz642-itcctbtq2z1lb8t" "--mem-per-cpu=3750" "--cpus-per-task=16"] 2017-04-25_18:33:09.22483 2017/04/25 18:33:09 debug: runner is handling updates slowly, discarded previous update for qr1hi-dz642-itcctbtq2z1lb8t 2017-04-25_18:33:53.11340 2017/04/25 18:33:53 debug: runner is handling updates slowly, discarded previous update for qr1hi-dz642-itcctbtq2z1lb8t 2017-04-25_18:34:34.72911 2017/04/25 18:34:34 debug: runner is handling updates slowly, discarded previous update for qr1hi-dz642-itcctbtq2z1lb8t 2017-04-25_18:34:49.55992 2017/04/25 18:34:49 Error submitting container qr1hi-dz642-itcctbtq2z1lb8t to slurm: sbatch ["sbatch" "--share" "--workdir=/tmp" "--job-name=qr1hi-dz642-itcctbtq2z1lb8t" "--mem-per-cpu=3750" "--cpus-per-task=16"] failed: exit status 1 (stderr: "sbatch: error: Batch job submission failed: Requested node configuration is not available\n") 2017-04-25_18:35:18.63901 2017/04/25 18:35:18 debug: runner is handling updates slowly, discarded previous update for qr1hi-dz642-itcctbtq2z1lb8t 2017-04-25_18:39:03.88534 2017/04/25 18:39:03 Error running "/usr/bin/squeue" ["squeue" "--all" "--format=%j"]: fork/exec /usr/bin/squeue: cannot allocate memory "" 2017-04-25_18:39:04.88547 2017/04/25 18:39:04 Error running "/usr/bin/squeue" ["squeue" "--all" "--format=%j"]: fork/exec /usr/bin/squeue: cannot allocate memory "" 2017-04-25_18:39:18.88524 2017/04/25 18:39:18 Error running "/usr/bin/squeue" ["squeue" "--all" "--format=%j"]: fork/exec /usr/bin/squeue: cannot allocate memory "" 2017-04-25_18:39:19.88518 2017/04/25 18:39:19 Error running "/usr/bin/squeue" ["squeue" "--all" "--format=%j"]: fork/exec /usr/bin/squeue: cannot allocate memory ""
Updated by Ward Vandewege over 7 years ago
Stopped all runit services. Restarted nginx. Restarted all runit services minus crunch-dispatch-slurm.
A few hours later, workbench still timing out on the dashboard page. Other pages work fine. See also bug #11548.
Looks like we have some sort of serious memory leak in the API server now:
qr1hi:/var/www/arvados-api/shared/log# passenger-status Version : 5.1.1 Date : 2017-04-26 00:03:23 +0000 Instance: pnzh97Yf (nginx/1.10.2 Phusion_Passenger/5.1.1) ----------- General information ----------- Max pool size : 128 App groups : 1 Processes : 10 Requests in top-level queue : 0 ----------- Application groups ----------- /var/www/qr1hi.arvadosapi.com/current/public: App root: /var/www/qr1hi.arvadosapi.com/current Requests in queue: 0 * PID: 18986 Sessions: 0 Processed: 2181 Uptime: 1h 57m 21s CPU: 47% Memory : 10278M Last used: 2s ago * PID: 18995 Sessions: 0 Processed: 3694 Uptime: 1h 57m 21s CPU: 10% Memory : 260M Last used: 14s ago * PID: 19445 Sessions: 0 Processed: 993 Uptime: 1h 56m 37s CPU: 3% Memory : 241M Last used: 1m 18s a * PID: 48077 Sessions: 0 Processed: 534 Uptime: 1h 28m 17s CPU: 1% Memory : 221M Last used: 1m 21s a * PID: 48089 Sessions: 0 Processed: 151 Uptime: 1h 28m 16s CPU: 0% Memory : 157M Last used: 1m 21s a * PID: 48097 Sessions: 0 Processed: 47 Uptime: 1h 28m 16s CPU: 0% Memory : 195M Last used: 5m 21s a * PID: 48639 Sessions: 0 Processed: 29 Uptime: 1h 27m 51s CPU: 0% Memory : 62M Last used: 13m 21s * PID: 49052 Sessions: 0 Processed: 6 Uptime: 1h 27m 21s CPU: 0% Memory : 45M Last used: 19m 21s * PID: 49073 Sessions: 0 Processed: 2 Uptime: 1h 27m 20s CPU: 0% Memory : 41M Last used: 19m 21s * PID: 50689 Sessions: 0 Processed: 0 Uptime: 19m 16s CPU: 0% Memory : 77M Last used: 19m 16s ago qr1hi:/var/www/arvados-api/shared/log# free total used free shared buffers cached Mem: 28811760 15024744 13787016 156080 46392 994344 -/+ buffers/cache: 13984008 14827752 Swap: 0 0 0
Looking further, perhaps this has to do with container requests. Postgres slow query log:
2017-04-26 00:31:52 UTC LOG: duration: 10064.070 ms execute <unnamed>: SELECT "container_requests".* FROM "container_requests" WHERE (container_requests.requesting_container_uuid='qr1hi-dz642-khl9hd88hk88a7m') ORDER BY container_requests.modified_at desc, container_requests.uuid LIMIT 1000 OFFSET 0
Okay. Not that slow, but still slower than it should be. Explain analyze.... huh that looks ok:
arvados_production=# explain analyze SELECT container_requests.* FROM container_requests WHERE (container_requests.requesting_container_uuid='qr1hi-dz642-khl9hd88hk88a7m') ORDER BY container_requests.modified_at desc, container_requests.uuid LIMIT 1000 OFFSET 0; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------ Limit (cost=23.13..23.16 rows=12 width=1507) (actual time=0.207..0.245 rows=15 loops=1) -> Sort (cost=23.13..23.16 rows=12 width=1507) (actual time=0.168..0.184 rows=15 loops=1) Sort Key: modified_at, uuid Sort Method: quicksort Memory: 40kB -> Seq Scan on container_requests (cost=0.00..22.91 rows=12 width=1507) (actual time=0.090..0.130 rows=15 loops=1) Filter: ((requesting_container_uuid)::text = 'qr1hi-dz642-khl9hd88hk88a7m'::text) Rows Removed by Filter: 141 Total runtime: 0.313 ms (8 rows) arvados_production=# select count(id) from container_requests; count ------- 156 (1 row)
How can this possibly be so slow in the production logs? Let's poke at one:
arvados_production=# SELECT container_requests.* FROM container_requests WHERE (container_requests.requesting_container_uuid='qr1hi-dz642-khl9hd88hk88a7m') ORDER BY container_requests.modified_at desc, container_requests.uuid limit 1 \g /tmp/sample_cr.sql # ls -laFh /tmp/sample_cr.sql -rw-rw-r-- 1 postgres postgres 217M Apr 26 00:40 /tmp/sample_cr.sql
What the... No wonder we have sluggish Postgres queries. No wonder we see Ruby memory use explosions.
arvados_production=# select id,length(description) as description, length(properties) as properties, length(mounts) as mounts,length(runtime_constraints) as runtime_constraints, length(environment) as environment, length(command) as command, length(filters) as filters, length(scheduling_parameters) as scheduling_parameters from container_requests order by id \g /tmp/lengths $ cat /tmp/lengths id | description | properties | mounts | runtime_constraints | environment | command | filters | scheduling_parameters -----+-------------+------------+----------+---------------------+-------------+---------+---------+----------------------- 1 | | 7 | 838 | 57 | 42 | 425 | | 7 2 | | 7 | 68955 | 65 | 7 | 193 | | 7 3 | | 2 | 805 | 55 | 41 | 319 | | 2 4 | | 2 | 805 | 55 | 41 | 319 | | 2 5 | | 2 | 805 | 55 | 41 | 319 | | 2 6 | | 2 | 745 | 55 | 41 | 307 | | 2 7 | | 2 | 341 | 55 | 41 | 312 | | 2 8 | | 2 | 337 | 55 | 41 | 122 | | 2 9 | | 2 | 337 | 55 | 41 | 127 | | 2 10 | | 2 | 310 | 55 | 41 | 17 | | 2 11 | | 2 | 247 | 55 | 41 | 61 | | 2 12 | | 2 | 273 | 55 | 41 | 61 | | 2 13 | | 2 | 244 | 55 | 41 | 61 | | 2 14 | | 2 | 563 | 66 | 2 | 176 | | 2 15 | | 2 | 554 | 66 | 2 | 176 | | 2 16 | | 2 | 556 | 66 | 2 | 176 | | 2 ... 120 | | 2 | 1314 | 55 | 41 | 100 | | 2 121 | | 2 | 1107 | 55 | 41 | 43 | | 2 122 | | 2 | 301 | 55 | 41 | 88 | | 2 123 | | 2 | 318 | 55 | 41 | 22 | | 2 124 | | 2 | 246 | 55 | 41 | 35 | | 2 125 | | 2 | 774 | 55 | 41 | 90 | | 2 126 | | 2 | 91 | 55 | 41 | 12 | | 2 127 | | 2 | 33 | 55 | 41 | 128 | | 2 128 | | 2 | 33 | 55 | 41 | 128 | | 2 129 | | 2 | 874 | 66 | 2 | 176 | | 2 130 | | 2 | 95 | 55 | 41 | 12 | | 2 131 | | 2 | 314 | 55 | 41 | 25 | | 2 132 | | 2 | 1249 | 55 | 41 | 169 | | 2 133 | | 2 | 829 | 66 | 2 | 176 | | 2 134 | | 2 | 550 | 66 | 2 | 176 | | 2 135 | | 2 | 86 | 55 | 69 | 35 | | 2 136 | | 2 | 869 | 66 | 2 | 176 | | 2 137 | | 22 | 556 | 66 | 2 | 176 | | 2 138 | | 2 | 88634 | 39 | 2 | 176 | | 2 139 | | 2 | 88601 | 39 | 2 | 176 | | 2 140 | | 2 | 83766 | 39 | 2 | 176 | | 2 141 | | 2 | 83766 | 39 | 2 | 176 | | 2 142 | | 2 | 75637943 | 59 | 41 | 256 | | 2 143 | | 2 | 75637943 | 59 | 41 | 256 | | 2 144 | | 2 | 75637943 | 59 | 41 | 256 | | 2 145 | | 2 | 75637943 | 59 | 41 | 256 | | 2 146 | | 2 | 75637943 | 59 | 41 | 256 | | 2 147 | | 2 | 75637943 | 59 | 41 | 256 | | 2 148 | | 2 | 75637943 | 59 | 41 | 256 | | 2 149 | | 2 | 75637943 | 59 | 41 | 256 | | 2 150 | | 2 | 75637943 | 59 | 41 | 256 | | 2 151 | | 2 | 75637943 | 59 | 41 | 256 | | 2 154 | | 2 | 75637943 | 59 | 41 | 256 | | 2 155 | | 2 | 75637943 | 59 | 41 | 256 | | 2 156 | | 2 | 75637943 | 59 | 41 | 256 | | 2 157 | | 2 | 75637943 | 59 | 41 | 256 | | 2 158 | | 2 | 75637943 | 59 | 41 | 256 | | 2 (156 rows)
Updated by Ward Vandewege over 7 years ago
I deleted the offending container requests. Arv threw lots of errors at me but eventually did the job, and now things are A-OK again. Workbench fully functional.
wardv@shell.qr1hi:~$ cat del-cr.sh arv container_request delete --uuid qr1hi-xvhdp-amnlx2o1ij72384 arv container_request delete --uuid qr1hi-xvhdp-0v9iioehkktegq7 arv container_request delete --uuid qr1hi-xvhdp-gdoro474x6mlzf0 arv container_request delete --uuid qr1hi-xvhdp-ggar6xs8wdq6z5f arv container_request delete --uuid qr1hi-xvhdp-s77gna8fborksau arv container_request delete --uuid qr1hi-xvhdp-d6e0tw34awdey0n arv container_request delete --uuid qr1hi-xvhdp-mx9r8k3kqlnputi arv container_request delete --uuid qr1hi-xvhdp-tddgn6gtbnacvtu arv container_request delete --uuid qr1hi-xvhdp-uogsswwgfkk03b8 arv container_request delete --uuid qr1hi-xvhdp-sp0tiax8g97zssv arv container_request delete --uuid qr1hi-xvhdp-05cdb5pu829hjki arv container_request delete --uuid qr1hi-xvhdp-5x4ub0qf9plzira arv container_request delete --uuid qr1hi-xvhdp-7h7093oksr65sum arv container_request delete --uuid qr1hi-xvhdp-gfzayezhzrrj45l arv container_request delete --uuid qr1hi-xvhdp-99farw9kcikc38s wardv@shell.qr1hi:~$ sh del-cr.sh Error: Path not found Error: Path not found Error: Path not found /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:158:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError) from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:136:in `readuntil' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:146:in `readline' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:40:in `read_status_line' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:29:in `read_new' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1437:in `block in transport_request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `catch' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `transport_request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1407:in `request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1400:in `block in request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:853:in `start' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1398:in `request' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:80:in `perform_request' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:38:in `block in call' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:85:in `with_net_http_connection' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:33:in `call' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/response.rb:8:in `call' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client/request.rb:163:in `send' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:648:in `block (2 levels) in execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:645:in `block in execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:636:in `execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:679:in `execute' from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:71:in `execute' from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:709:in `<top (required)>' from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `load' from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `<main>' from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `eval' from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `<main>' /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:158:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError) from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:136:in `readuntil' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:146:in `readline' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:40:in `read_status_line' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:29:in `read_new' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1437:in `block in transport_request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `catch' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `transport_request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1407:in `request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1400:in `block in request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:853:in `start' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1398:in `request' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:80:in `perform_request' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:38:in `block in call' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:85:in `with_net_http_connection' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:33:in `call' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/response.rb:8:in `call' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client/request.rb:163:in `send' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:648:in `block (2 levels) in execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:645:in `block in execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:636:in `execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:679:in `execute' from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:71:in `execute' from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:709:in `<top (required)>' from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `load' from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `<main>' from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `eval' from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `<main>' Error: Path not found Error: Path not found /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:158:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError) from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:136:in `readuntil' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:146:in `readline' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:40:in `read_status_line' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:29:in `read_new' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1437:in `block in transport_request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `catch' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `transport_request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1407:in `request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1400:in `block in request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:853:in `start' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1398:in `request' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:80:in `perform_request' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:38:in `block in call' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:85:in `with_net_http_connection' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:33:in `call' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/response.rb:8:in `call' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client/request.rb:163:in `send' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:648:in `block (2 levels) in execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:645:in `block in execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:636:in `execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:679:in `execute' from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:71:in `execute' from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:709:in `<top (required)>' from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `load' from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `<main>' from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `eval' from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `<main>' /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:158:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError) from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:136:in `readuntil' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:146:in `readline' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:40:in `read_status_line' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:29:in `read_new' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1437:in `block in transport_request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `catch' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `transport_request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1407:in `request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1400:in `block in request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:853:in `start' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1398:in `request' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:80:in `perform_request' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:38:in `block in call' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:85:in `with_net_http_connection' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:33:in `call' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/response.rb:8:in `call' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client/request.rb:163:in `send' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:648:in `block (2 levels) in execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:645:in `block in execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:636:in `execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:679:in `execute' from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:71:in `execute' from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:709:in `<top (required)>' from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `load' from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `<main>' from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `eval' from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `<main>' Error: Path not found /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:158:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError) from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:136:in `readuntil' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:146:in `readline' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:40:in `read_status_line' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:29:in `read_new' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1437:in `block in transport_request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `catch' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `transport_request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1407:in `request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1400:in `block in request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:853:in `start' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1398:in `request' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:80:in `perform_request' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:38:in `block in call' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:85:in `with_net_http_connection' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:33:in `call' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/response.rb:8:in `call' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client/request.rb:163:in `send' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:648:in `block (2 levels) in execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:645:in `block in execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:636:in `execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:679:in `execute' from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:71:in `execute' from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:709:in `<top (required)>' from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `load' from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `<main>' from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `eval' from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `<main>' /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:158:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError) from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:136:in `readuntil' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:146:in `readline' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:40:in `read_status_line' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:29:in `read_new' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1437:in `block in transport_request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `catch' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `transport_request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1407:in `request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1400:in `block in request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:853:in `start' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1398:in `request' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:80:in `perform_request' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:38:in `block in call' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:85:in `with_net_http_connection' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:33:in `call' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/response.rb:8:in `call' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client/request.rb:163:in `send' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:648:in `block (2 levels) in execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:645:in `block in execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:636:in `execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:679:in `execute' from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:71:in `execute' from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:709:in `<top (required)>' from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `load' from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `<main>' from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `eval' from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `<main>' /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:158:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError) from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:136:in `readuntil' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:146:in `readline' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:40:in `read_status_line' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:29:in `read_new' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1437:in `block in transport_request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `catch' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `transport_request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1407:in `request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1400:in `block in request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:853:in `start' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1398:in `request' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:80:in `perform_request' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:38:in `block in call' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:85:in `with_net_http_connection' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:33:in `call' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/response.rb:8:in `call' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client/request.rb:163:in `send' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:648:in `block (2 levels) in execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:645:in `block in execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:636:in `execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:679:in `execute' from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:71:in `execute' from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:709:in `<top (required)>' from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `load' from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `<main>' from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `eval' from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `<main>' /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:158:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError) from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:136:in `readuntil' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:146:in `readline' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:40:in `read_status_line' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:29:in `read_new' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1437:in `block in transport_request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `catch' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `transport_request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1407:in `request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1400:in `block in request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:853:in `start' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1398:in `request' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:80:in `perform_request' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:38:in `block in call' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:85:in `with_net_http_connection' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:33:in `call' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/response.rb:8:in `call' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client/request.rb:163:in `send' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:648:in `block (2 levels) in execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:645:in `block in execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:636:in `execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:679:in `execute' from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:71:in `execute' from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:709:in `<top (required)>' from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `load' from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `<main>' from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `eval' from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `<main>' /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:158:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError) from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:136:in `readuntil' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:146:in `readline' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:40:in `read_status_line' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http/response.rb:29:in `read_new' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1437:in `block in transport_request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `catch' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1434:in `transport_request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1407:in `request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1400:in `block in request' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:853:in `start' from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/http.rb:1398:in `request' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:80:in `perform_request' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:38:in `block in call' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:85:in `with_net_http_connection' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/adapter/net_http.rb:33:in `call' from /usr/local/rvm/gems/ruby-2.3.3/gems/faraday-0.12.0.1/lib/faraday/response.rb:8:in `call' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client/request.rb:163:in `send' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:648:in `block (2 levels) in execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:645:in `block in execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable/retry.rb:27:in `perform' from /usr/local/rvm/gems/ruby-2.3.3/gems/retriable-1.4.1/lib/retriable.rb:15:in `retriable' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:636:in `execute!' from /usr/local/rvm/gems/ruby-2.3.3/gems/google-api-client-0.8.7/lib/google/api_client.rb:679:in `execute' from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:71:in `execute' from /usr/local/rvm/gems/ruby-2.3.3/gems/arvados-cli-0.1.20170417140106/bin/arv:709:in `<top (required)>' from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `load' from /usr/local/rvm/gems/ruby-2.3.3/bin/arv:22:in `<main>' from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `eval' from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `<main>'
Updated by Peter Amstutz over 7 years ago
- Subject changed from crunch-dispatch-slurm using huge ram to [API] Reject huge serialized fields
- Description updated (diff)
Updated by Tom Clegg over 7 years ago
AFAIK apiserver regularly handles >75 MB serialized "properties" fields in log records, so I'm not convinced that the size of the serialized data is the real problem here.
Updated by Peter Amstutz over 7 years ago
Suggested test case to avoid container-specific issues:
Try creating a collection record with a complex "properties" field (e.g. 100,000 dict entries which are themselves objects) and see how that affects the request profile.
Updated by Tom Clegg over 7 years ago
Peter Amstutz wrote:
Try creating a collection record with a complex "properties" field (e.g. 100,000 dict entries which are themselves objects) and see how that affects the request profile.
Perhaps add to source:sdk/python/tests/test_benchmark_collections.py
Updated by Tom Clegg over 7 years ago
- Subject changed from [API] Reject huge serialized fields to [API] Excessive memory and time consumed by API and dispatch when container fields (like "mounts") are large
Updated by Peter Amstutz over 7 years ago
First time is time to create record, second time is time to fetch record.
Properties consisting of one 10 MB string:
$ python huge.py payload about 10088844 14.6757330894 34t0i-7a9it-ko3a9mc8w2k7xrq 1.81077194214
Properties consisting of about 8 MB of JSON, as a dict with 100,000 objects:
$ python huge.py payload about 8088854 54.548582077 34t0i-7a9it-b6t3vd2x5xqi8gc 19.4000740051
So there's about a 10x performance difference between a single huge string, and a complex serialized record. (Likely also a memory usage difference on API server, but I didn't benchmarked that). This is consistent with the performance issues seen with the huge "mounts" entry.
Updated by Tom Clegg over 7 years ago
Indeed. Given that result, I don't think a column size limit seems like a good strategy for avoiding slowness.
Seems like the priorities should be- Fix #11546 so lock/unlock don't block other queries happening at the same time
- Avoid sending the mounts field (and other fields not used by dispatch) in lock/unlock responses (perhaps using the "select" parameter)
- Avoid loading the mounts field (and other fields not used by dispatch) when dispatch uses get/index APIs
With those changes, we'd just have the unavoidable costs of the client submitting the large mounts hash and crunch-run retrieving it on the worker node, instead of an explosion of unnecessary serialize/transfer/deserialize sequences.
Updated by Peter Amstutz over 7 years ago
Instead of setting a column size limit, we can scan the JSON serialization to get an object count:
def count_js_objects n objcount = 0 instr = false skip = false n.each_char do |c| if skip skip = false else case c when "{", "[" objcount += 1 when '\\' skip = true when '"' if instr objcount += 1 end instr = !instr end end end objcount end
Updated by Tom Morris about 7 years ago
- Target version set to Arvados Future Sprints
Updated by Joshua Randall over 6 years ago
It looks like we are currently suffering this very same issue. Some of our container requests are very large, and our api server instance recently started crashing repeatedly due to memory exhaustion.
We initially tried lowering the number of workers, but there seems to be no upper limit to the amount of memory a single worker can take, as they just get larger and larger until the OOM is invoked.
Our workaround is to run a cron job once per minute that kills "fat passengers":
root@arvados-master-eglyx:/# cat /etc/cron.d/arvados_api_fat_passenger_killer * * * * * root /var/www/arvados-api/current/script/arvados_api_fat_passenger_killer.sh root@arvados-master-eglyx:/# cat /var/www/arvados-api/current/script/arvados_api_fat_passenger_killer.sh #!/bin/bash export PASSENGER_INSTANCE_REGISTRY_DIR=/var/run/nginx/passenger date=$(date) echo "arvados_api_fat_passenger_killer running at ${date}" >> /var/log/arvados_api_fat_passenger_killer.log fat_passenger_pids=$(/usr/sbin/passenger-memory-stats | grep 'Passenger AppPreloader: /var/www/arvados-api/current' | grep -v grep | awk '$4>5000 {print $4, $1}' | sort -rg | awk '{print $2}') 2> /dev/null for fat_passenger_pid in ${fat_passenger_pids}; do echo "killing ${fat_passenger_pid}" kill -SIGUSR1 ${fat_passenger_pid} done >> /var/log/arvados_api_fat_passenger_killer.log
Updated by Peter Amstutz over 6 years ago
Are the container requests large because they have a very large workflow, or because they have a very large number of keep mounts?
Updated by Joshua Randall over 6 years ago
We have both kinds. For example, our a-c-r runner containers have large workflow inputs that is mostly in cwl.input.json and can take ~45s to retrieve:
mercury@arvados-shell-eglyx:~$ ls -l eglyx-xvhdp-d52pwdy7flpucnm.json -rw-r--r-- 1 mercury hgi 21758914 Jun 12 13:56 eglyx-xvhdp-d52pwdy7flpucnm.json mercury@arvados-shell-eglyx:~$ for mkey in $(cat eglyx-xvhdp-d52pwdy7flpucnm.json | jq -r '.mounts | keys | join(" ")'); do echo ${mkey}; (cat eglyx-xvhdp-d52pwdy7flpucnm.json | jq '.mounts."'${mkey}'"' | wc); done /var/lib/cwl/cwl.input.json 410322 673941 23385057 /var/lib/cwl/workflow.json 3130 7849 113053 /var/spool/cwl 4 6 47 stdout 4 6 65
But others have a (very) large number of mounts and I cannot get you an example of one because we can rarely if ever receive them from the API server before it either times out or crashes due to memory exhaustion.
Updated by Joshua Randall over 6 years ago
N.B. we have updated our fat passenger killer after discovering that advice regarding doing passenger instance graceful restarts with SIGUSR1 is out of date. As of passenger 5, that no longer causes a graceful restart, and the advice appears to be to use `passenger-config detach-process` instead. This change has resolved some bad gateway errors we were seeing with the previous approach:
# cat /var/www/arvados-api/current/script/arvados_api_fat_passenger_killer.sh #!/bin/bash export PASSENGER_INSTANCE_REGISTRY_DIR=/var/run/nginx/passenger date=$(date) echo "arvados_api_fat_passenger_killer running at ${date}" >> /var/log/arvados_api_fat_passenger_killer.log fat_passenger_pids=$(/usr/sbin/passenger-memory-stats | grep 'Passenger AppPreloader: /var/www/arvados-api/current' | grep -v grep | awk '$4>5000 {print $4, $1}' | sort -rg | awk '{print $2}') 2> /dev/null for fat_passenger_pid in ${fat_passenger_pids}; do echo "detaching passenger process ${fat_passenger_pid}" passenger-config detach-process ${fat_passenger_pid} done >> /var/log/arvados_api_fat_passenger_killer.log
Updated by Joshua Randall over 6 years ago
Another example.
From passenger.log - I believe this was crunch attempting to update the container to say it has failed:
{"method":"PUT","path":"/arvados/v1/containers/eglyx-dz642-jrft0qq1n81158b","format":"html","controller":"Arvados::V1::ContainersController","action":"update","status":200,"duration":9006160.12,"view":0.44,"db":5749338.9,"request_id":"req-d8ly8c41pp35sa80flpc","client_ipaddr":"172.27.16.250","client_auth":"eglyx-gj3su-xo6zhx218uqho7j","params":{"container":"{\"exit_code\":1,\"log\":\"ec8db9aa40ff0b393bcdb9e65cd144a0+433\",\"output\":\"5551f5d7ec57a8aa9758d14a976e87e9+57\",\"state\":\"Complete\"}"},"@timestamp":"2018-06-13T23:17:35.509049624Z","@version":"1","message":"[200] PUT /arvados/v1/containers/eglyx-dz642-jrft0qq1n81158b (Arvados::V1::ContainersController#update)"}
If I am reading that correctly, that appears to be 95m querying the database and an additional 54m processing within the API server.
arvados_api_production=# select uuid, char_length(mounts) from containers where uuid = 'eglyx-dz642-jrft0qq1n81158b'; uuid | char_length -----------------------------+------------- eglyx-dz642-jrft0qq1n81158b | 108682927 (1 row)
This particular one appears to be another runner container (so, with only 4 entries in mounts). The size of the entry is down to a 108.6MB cwl.input.json
Updated by Tom Clegg over 6 years ago
There are several queries in container/container_request code that retrieve the entire object but only need a few fields like priority, state, and requesting_container_uuid. Adding select() to a few places might go a long way.
Updated by Tom Clegg over 6 years ago
ArvadosModel#log_start_state spends significant time (and surely memory) doing Marshal.load(Marshal.dump(attributes)) + Marshal.load(Marshal.dump(logged_attributes)) before each update, which is especially wasteful in transactions that don't end up modifying anything (e.g., update fails) and on systems where audit logs are not needed.
KindAndEtag#etag calls attributes.inspect, which seems to be much slower than other alternatives like Marshal.dump(attributes) and Oj.dump(attributes). (One data point: changing this to Marshal.dump() reduced collections#get(9tee4-4zz18-fbfj9efjpq282cz) request duration from 6s to 5s.)
Updated by Peter Amstutz over 6 years ago
We recently found a fixed an issue where the version of the Oj (json marshaller) gem we were using had a bug that resulted in excessive memory usage. Perhaps is a root cause of this issue as well?
Updated by Peter Amstutz over 3 years ago
- Target version deleted (
Arvados Future Sprints)