Bug #17589
closed[arv-mount] hangs on ls keep/home/, slowly increasing memory
Description
This is happening on Tordo (AWS test cluster), at version 2.2.0~dev20210426141415-1. Tordo is in a LoginCluster federation with ce8i5 (Azure test cluster) as the login cluster.
As a superuser, on the tordo shell node, I mounted keep with `arv-mount keep --debug --foreground`. I then ran `ls keep/home`. Workbench2 says that there are 52074 items in my home project.
The `ls` command hangs indefinitely, with `arv-mount` doing seemingly nothing. The output from arv-mount is just this:
unique: 10, opcode: LOOKUP (1), nodeid: 1, insize: 45, pid: 29513 2021-04-27 01:35:09 arvados.arvados_fuse[29440] DEBUG: arv-mount lookup: parent_inode 1 name 'home' inode 5 unique: 10, success, outsize: 144 unique: 11, opcode: LOOKUP (1), nodeid: 1, insize: 45, pid: 29513 2021-04-27 01:35:09 arvados.arvados_fuse[29440] DEBUG: arv-mount lookup: parent_inode 1 name 'home' inode 5 unique: 11, success, outsize: 144 unique: 12, opcode: OPENDIR (27), nodeid: 5, insize: 48, pid: 29513 2021-04-27 01:35:09 arvados.arvados_fuse[29440] DEBUG: arv-mount opendir: inode 5 2021-04-27 01:35:09 arvados.arvados_fuse[29440] DEBUG: InodeCache touched inode 5 (size 0) (uuid ce8i5-tpzed-xo2k4i24bjzwedw) total now 0 (1 entries) 2021-04-27 01:35:09 arvados.arvados_fuse[29440] DEBUG: InodeCache touched inode 8 (size 0) (uuid ce8i5-tpzed-xo2k4i24bjzwedw) total now 0 (2 entries) (no further output)
arv-mount slowly consumes more and more memory until the OOM killer takes it out.
While in this state, with `ls` hanging, accessing specific collections via `keep/by_id` works just fine.
Files
Related issues
Updated by Ward Vandewege over 3 years ago
- Category set to CWL
- Description updated (diff)
Updated by Ward Vandewege over 3 years ago
The controller logs show a long list of group/contents requests, with an ever-increasing offset.
The manual query indicates that there should be 199695 items available.
ward@shell:~$ arv group contents -u ce8i5-tpzed-xo2k4i24bjzwedw -f '[["uuid","is_a","arvados#collection"]]' ... "items_available":199695, "kind":"arvados#objectList", "limit":100, "offset":0 }
But somehow arv-mount just keeps going, and it keeps getting data back:
... Apr 27 01:51:30 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-wur4oz2c6j9gtne2oyzy","level":"info","msg":"request","remoteAddr":"127.0.0.1:60938","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=248000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T01:51:30.771006945Z"} Apr 27 01:51:34 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-wur4oz2c6j9gtne2oyzy","level":"info","msg":"response","remoteAddr":"127.0.0.1:60938","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=248000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":979327,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T01:51:34.967312364Z","timeToStatus":4.196000,"timeTotal":4.196293,"timeWriteBody":0.000293} Apr 27 01:51:34 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-ndhd2y6sbt70gf8rzr96","level":"info","msg":"request","remoteAddr":"127.0.0.1:60946","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=249000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T01:51:34.986794610Z"} Apr 27 01:51:39 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-ndhd2y6sbt70gf8rzr96","level":"info","msg":"response","remoteAddr":"127.0.0.1:60946","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=249000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":978985,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T01:51:39.069045322Z","timeToStatus":4.081946,"timeTotal":4.082238,"timeWriteBody":0.000292} ...
Trying to reproduce this with the arv cli leads to a different request in the logs:
Apr 27 01:54:06 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-8l8zg5vzbsoqq0hhft2v","level":"info","msg":"request","remoteAddr":"127.0.0.1:32906","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"_profile=true\u0026bypass_federation=false\u0026cluster_id=ce8i5-tpzed-xo2k4i24bjzwedw\u0026cluster_id_given=true\u0026count=exact\u0026distinct=false\u0026filters=%5B%5B%22uuid%22%2C%22is_a%22%2C%22arvados%23collection%22%5D%5D\u0026filters_given=true\u0026help=false\u0026include=\u0026include_old_versions=false\u0026include_trash=false\u0026limit=9223372036854775807\u0026limit_given=true\u0026offset=249000\u0026offset_given=true\u0026order=\u0026recursive=false\u0026uuid=\u0026where=","time":"2021-04-27T01:54:06.414215153Z"} Apr 27 01:54:06 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-8l8zg5vzbsoqq0hhft2v","level":"info","msg":"response","remoteAddr":"127.0.0.1:32906","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"_profile=true\u0026bypass_federation=false\u0026cluster_id=ce8i5-tpzed-xo2k4i24bjzwedw\u0026cluster_id_given=true\u0026count=exact\u0026distinct=false\u0026filters=%5B%5B%22uuid%22%2C%22is_a%22%2C%22arvados%23collection%22%5D%5D\u0026filters_given=true\u0026help=false\u0026include=\u0026include_old_versions=false\u0026include_trash=false\u0026limit=9223372036854775807\u0026limit_given=true\u0026offset=249000\u0026offset_given=true\u0026order=\u0026recursive=false\u0026uuid=\u0026where=","respBytes":109,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T01:54:06.918316347Z","timeToStatus":0.504083,"timeTotal":0.504092,"timeWriteBody":0.000009}
and also different results:
$ arv group contents -u ce8i5-tpzed-xo2k4i24bjzwedw -f '[["uuid","is_a","arvados#collection"]]' -s 249000 -l 9223372036854775807 { "included":[], "items":[], "items_available":199695, "kind":"arvados#objectList", "limit":1000, "offset":249000 }
This result is as expected; there is something about the arv-mount request that makes it misbehave.
Hmm. I recreated the arv mount request with curl, and that showed that the difference in request parameters results in a difference in items available:
curl -G -sH "Authorization: Bearer OMITTED" \ https://tordo.arvadosapi.com/arvados/v1/groups/contents \ --data-urlencode "offset=249000" \ --data-urlencode "uuid=ce8i5-tpzed-xo2k4i24bjzwedw" \ --data-urlencode 'filters=[["uuid","is_a","arvados#collection"]]' \ --data-urlencode "limit=9223372036854775807" \ --data-urlencode "alt=json"
... "trash_at": null, "uuid": "ce8i5-4zz18-hkn8zfxr23dkw2l", "version": 1 } ], "items_available": 410020, "kind": "arvados#objectList", "limit": 1000, "offset": 249000 }
I went back through the logs and noticed that my test from yesterday did stop after asking for the 410000 batch, and the response was small which lines up with it receiving the last 20 records:
Apr 27 02:32:33 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-4bump559x19zfo46cv2i","level":"info","msg":"request","remoteAddr":"127.0.0.1:37416","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=410000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T02:32:33.370600217Z"} Apr 27 02:32:35 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-4bump559x19zfo46cv2i","level":"info","msg":"response","remoteAddr":"127.0.0.1:37416","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=410000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":18864,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T02:32:35.591036450Z","timeToStatus":2.220103,"timeTotal":2.220430,"timeWriteBody":0.000327}
So... maybe the groups endpoint is working as expected. Two questions remain:
a) Why is arv-mount doing a request that returns 410,020 items available, while the roughly equivalent version with `arv` returns 199,695 items?
b) why is arv-mount so terribly slow?
Tackling a)
Let's find out why the arv-mount call returns so many more results. Replicating it with a json request to mimic the `arv` command:
curl -G -sH "Authorization: Bearer OMITTED" \ https://tordo.arvadosapi.com/arvados/v1/groups/contents \ --data-urlencode "offset=249000" \ --data-urlencode "uuid=ce8i5-tpzed-xo2k4i24bjzwedw" \ --data-urlencode 'filters=[["uuid","is_a","arvados#collection"]]' \ --data-urlencode "limit=9223372036854775807" \ --data-urlencode "include_trash=false" \ --data-urlencode "cluster_id=ce8i5-tpzed-xo2k4i24bjzwedw" \ --data-urlencode "distinct=false" \ --data-urlencode "include_old_versions=false" \ --data-urlencode "recursive=false" \ --data-urlencode "alt=json"
Apr 27 13:20:17 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-18suuu9bl5ndjmmd55v9","level":"info","msg":"request","remoteAddr":"127.0.0.1:38400","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=249000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C%22is_a%22%2C%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026include_trash=false\u0026cluster_id=ce8i5-tpzed-xo2k4i24bjzwedw\u0026distinct=false\u0026include_old_versions=false\u0026recursive=false\u0026alt=json","time":"2021-04-27T13:20:17.329271974Z"} Apr 27 13:20:19 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-18suuu9bl5ndjmmd55v9","level":"info","msg":"response","remoteAddr":"127.0.0.1:38400","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=249000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C%22is_a%22%2C%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026include_trash=false\u0026cluster_id=ce8i5-tpzed-xo2k4i24bjzwedw\u0026distinct=false\u0026include_old_versions=false\u0026recursive=false\u0026alt=json","respBytes":978985,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T13:20:19.839597233Z","timeToStatus":2.509989,"timeTotal":2.510312,"timeWriteBody":0.000323}
But that still yields 410,020 items available:
... "trash_at": null, "uuid": "ce8i5-4zz18-hkn8zfxr23dkw2l", "version": 1 } ], "items_available": 410020, "kind": "arvados#objectList", "limit": 1000, "offset": 249000 }
Ah. The difference is that `arv_mount` sends a `uuid` parameter with the uuid of home project, whereas `arv group contents` sends `cluster_id` with that uuid value, and sends an empty `uuid` field:
curl -G -sH "Authorization: Bearer OMITTED" \ https://tordo.arvadosapi.com/arvados/v1/groups/contents \ --data-urlencode "offset=249000" \ --data-urlencode "cluster_id=ce8i5-tpzed-xo2k4i24bjzwedw" \ --data-urlencode 'filters=[["uuid","is_a","arvados#collection"]]' \ --data-urlencode "limit=9223372036854775807" \ --data-urlencode "include_trash=false" \ --data-urlencode "cluster_id=ce8i5-tpzed-xo2k4i24bjzwedw" \ --data-urlencode "distinct=false" \ --data-urlencode "include_old_versions=false" \ --data-urlencode "recursive=false" \ --data-urlencode "alt=json"
... { "included": [], "items": [], "items_available": 199695, "kind": "arvados#objectList", "limit": 1000, "offset": 249000 }
Hrm, it looks like our gems on tordo are out of date:
ward@shell:~$ gem list |grep arv arvados (2.0.4, 1.3.3.20190320201707) arvados-cli (2.0.4) arvados-google-api-client (0.8.7.4) arvados-login-sync (2.0.4)
I fixed that:
shell.tordo:~# gem list|grep arv arvados (2.1.2, 1.3.3.20190320201707) arvados-cli (2.1.2) arvados-google-api-client (0.8.7.4) arvados-login-sync (2.1.2)
(the arvados gem version 1.3.3.20190320201707 is installed because of an outdated dependency in the arvados-login-sync gem, I just pushed a fix for that in 24b0875964b3eff98c12d1c135d8797efcfabfb2).
The updated version of `arv` doesn't make a difference though:
ward@shell:~$ arv group contents -u ce8i5-tpzed-xo2k4i24bjzwedw -f '[["uuid","is_a","arvados#collection"]]' |tail "trash_at":null, "uuid":"tordo-4zz18-13ok9k86ne7rpoo", "version":1 } ], "items_available":199695, "kind":"arvados#objectList", "limit":100, "offset":0 }
Ah, well, duh, that was totally PEBKAC. The cli arguments for `arv` are not very intuitive. `-u` is for cluster_id and `--uuid` is what I want:
ward@shell:~$ arv group contents --uuid ce8i5-tpzed-xo2k4i24bjzwedw -f '[["uuid","is_a","arvados#collection"]]' |tail "trash_at":null, "uuid":"ce8i5-4zz18-qfyy9hyriqbwobm", "version":1 } ], "items_available":410020, "kind":"arvados#objectList", "limit":100, "offset":0 }
Conclusion for a) is that the "approximate" equivalent `arv group contents` command was not equivalent enough. `arv-mount` is doing the right thing.
Tackling b) "why is arv-mount so slow"
It's getting 410,020 items to show. It's requesting 1,000 at a time. It's doing that serially. The controller logs show that the slowness is actually in getting the api server response:
Apr 27 18:04:21 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-zwrn6a8vovar784akvmp","level":"info","msg":"request","remoteAddr":"127.0.0.1:45342","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=115000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:04:21.061118586Z"} Apr 27 18:04:27 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-zwrn6a8vovar784akvmp","level":"info","msg":"response","remoteAddr":"127.0.0.1:45342","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=115000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":978468,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:04:27.404367871Z","timeToStatus":6.342963,"timeTotal":6.343237,"timeWriteBody":0.000274} Apr 27 18:04:27 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-gi4vdr6jjqw2posxa5sc","level":"info","msg":"request","remoteAddr":"127.0.0.1:45656","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=116000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:04:27.425531548Z"} Apr 27 18:04:33 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-gi4vdr6jjqw2posxa5sc","level":"info","msg":"response","remoteAddr":"127.0.0.1:45656","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=116000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":979259,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:04:33.496201723Z","timeToStatus":6.070386,"timeTotal":6.070660,"timeWriteBody":0.000274}
The delay between request and response for the request with offset 115,000 is 6 seconds. The request for offset 116,000 follows in less than a second.
Hmm. It seems to get the list more than once?
Apr 27 18:22:35 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-ro0y1t0a4iaxgpqsaoba","level":"info","msg":"request","remoteAddr":"127.0.0.1:56958","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=409000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:22:35.305997716Z"} Apr 27 18:22:39 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-ro0y1t0a4iaxgpqsaoba","level":"info","msg":"response","remoteAddr":"127.0.0.1:56958","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=409000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":951213,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:22:39.410794483Z","timeToStatus":4.101220,"timeTotal":4.104786,"timeWriteBody":0.003566} Apr 27 18:22:39 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-1oug9ifewazn0k684q0g","level":"info","msg":"request","remoteAddr":"127.0.0.1:56990","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=410000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:22:39.425739879Z"} Apr 27 18:22:41 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-1oug9ifewazn0k684q0g","level":"info","msg":"response","remoteAddr":"127.0.0.1:56990","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=410000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":293739,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:22:41.269223334Z","timeToStatus":1.843310,"timeTotal":1.843468,"timeWriteBody":0.000158} Apr 27 18:23:29 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-8k0i9slizvrbrmw653uv","level":"info","msg":"request","remoteAddr":"127.0.0.1:57458","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/users/ce8i5-tpzed-xo2k4i24bjzwedw","reqQuery":"alt=json","time":"2021-04-27T18:23:29.707199973Z"} Apr 27 18:23:29 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-8k0i9slizvrbrmw653uv","level":"info","msg":"response","remoteAddr":"127.0.0.1:57458","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/users/ce8i5-tpzed-xo2k4i24bjzwedw","reqQuery":"alt=json","respBytes":617,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:23:29.772947870Z","timeToStatus":0.065729,"timeTotal":0.065739,"timeWriteBody":0.000010} Apr 27 18:23:29 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-w2f9j09wxy5tikzsqzeh","level":"info","msg":"request","remoteAddr":"127.0.0.1:57460","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=0\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23group%22%5D%2C+%5B%22groups.group_class%22%2C+%22in%22%2C+%5B%22project%22%2C+%22filter%22%5D%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:23:29.776418908Z"} Apr 27 18:23:29 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-w2f9j09wxy5tikzsqzeh","level":"info","msg":"response","remoteAddr":"127.0.0.1:57460","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=0\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23group%22%5D%2C+%5B%22groups.group_class%22%2C+%22in%22%2C+%5B%22project%22%2C+%22filter%22%5D%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":2619,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:23:29.852613390Z","timeToStatus":0.076160,"timeTotal":0.076186,"timeWriteBody":0.000026} Apr 27 18:23:29 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-n20y8tcbswoteyp5vl31","level":"info","msg":"request","remoteAddr":"127.0.0.1:57462","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=0\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:23:29.855664034Z"} Apr 27 18:23:34 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-n20y8tcbswoteyp5vl31","level":"info","msg":"response","remoteAddr":"127.0.0.1:57462","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=0\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":1035365,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:23:34.883242160Z","timeToStatus":5.027091,"timeTotal":5.027561,"timeWriteBody":0.000470} Apr 27 18:23:34 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-z5v3as9fh354p9e0h44m","level":"info","msg":"request","remoteAddr":"127.0.0.1:57540","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=1000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:23:34.901086368Z"} Apr 27 18:23:38 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-z5v3as9fh354p9e0h44m","level":"info","msg":"response","remoteAddr":"127.0.0.1:57540","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=1000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":1034702,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:23:38.490090750Z","timeToStatus":3.588468,"timeTotal":3.588967,"timeWriteBody":0.000499} Apr 27 18:23:38 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-9v2sx9irye3kt73lawp2","level":"info","msg":"request","remoteAddr":"127.0.0.1:57548","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=2000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:23:38.506781205Z"} Apr 27 18:23:42 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-9v2sx9irye3kt73lawp2","level":"info","msg":"response","remoteAddr":"127.0.0.1:57548","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=2000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":1034106,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:23:42.873203307Z","timeToStatus":4.363532,"timeTotal":4.366400,"timeWriteBody":0.002869}
Wait, and again?
Apr 27 18:56:28 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-5ghnybu6c2338w40vp76","level":"info","msg":"request","remoteAddr":"127.0.0.1:40960","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=410000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:56:28.591156111Z"} Apr 27 18:56:32 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-5ghnybu6c2338w40vp76","level":"info","msg":"response","remoteAddr":"127.0.0.1:40960","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=410000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":508538,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:56:32.610668168Z","timeToStatus":4.019278,"timeTotal":4.019497,"timeWriteBody":0.000219} Apr 27 18:56:32 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-ls908p4sveqsmhbkvzsi","level":"info","msg":"request","remoteAddr":"127.0.0.1:41004","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=410537\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:56:32.624584396Z"} Apr 27 18:56:35 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-ls908p4sveqsmhbkvzsi","level":"info","msg":"response","remoteAddr":"127.0.0.1:41004","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=410537\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":1039,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:56:35.356290294Z","timeToStatus":2.731689,"timeTotal":2.731699,"timeWriteBody":0.000010} Apr 27 18:56:44 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-cb3u4lfpxn1szrj16adt","level":"info","msg":"request","remoteAddr":"127.0.0.1:41156","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/users/ce8i5-tpzed-xo2k4i24bjzwedw","reqQuery":"alt=json","time":"2021-04-27T18:56:44.479388280Z"} Apr 27 18:56:44 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-cb3u4lfpxn1szrj16adt","level":"info","msg":"response","remoteAddr":"127.0.0.1:41156","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/users/ce8i5-tpzed-xo2k4i24bjzwedw","reqQuery":"alt=json","respBytes":617,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:56:44.534287931Z","timeToStatus":0.054884,"timeTotal":0.054893,"timeWriteBody":0.000009} Apr 27 18:56:44 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-97hz7hlguiighxkuo707","level":"info","msg":"request","remoteAddr":"127.0.0.1:41158","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=0\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23group%22%5D%2C+%5B%22groups.group_class%22%2C+%22in%22%2C+%5B%22project%22%2C+%22filter%22%5D%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:56:44.537669320Z"} Apr 27 18:56:44 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-97hz7hlguiighxkuo707","level":"info","msg":"response","remoteAddr":"127.0.0.1:41158","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=0\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23group%22%5D%2C+%5B%22groups.group_class%22%2C+%22in%22%2C+%5B%22project%22%2C+%22filter%22%5D%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":2619,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:56:44.595933349Z","timeToStatus":0.058234,"timeTotal":0.058256,"timeWriteBody":0.000023} Apr 27 18:56:44 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-3b5s0uyy13qh8898c5kd","level":"info","msg":"request","remoteAddr":"127.0.0.1:41160","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=0\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:56:44.598972874Z"} Apr 27 18:56:48 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-3b5s0uyy13qh8898c5kd","level":"info","msg":"response","remoteAddr":"127.0.0.1:41160","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=0\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":1034857,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:56:48.424530662Z","timeToStatus":3.825031,"timeTotal":3.825537,"timeWriteBody":0.000506} Apr 27 18:56:48 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-z8b95n1s1hk0vvkzeyg5","level":"info","msg":"request","remoteAddr":"127.0.0.1:41162","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=1000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:56:48.438946395Z"} Apr 27 18:56:52 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-z8b95n1s1hk0vvkzeyg5","level":"info","msg":"response","remoteAddr":"127.0.0.1:41162","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=1000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":1034602,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:56:52.048478363Z","timeToStatus":3.609040,"timeTotal":3.609521,"timeWriteBody":0.000481}
So there is really a bug here.
Again:
Apr 27 19:27:47 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-xli9lxyliy5spevpeyl8","level":"info","msg":"request","remoteAddr":"127.0.0.1:54620","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=409000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T19:27:47.240876489Z"} Apr 27 19:27:52 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-xli9lxyliy5spevpeyl8","level":"info","msg":"response","remoteAddr":"127.0.0.1:54620","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=409000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":952550,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T19:27:52.033331703Z","timeToStatus":4.792080,"timeTotal":4.792440,"timeWriteBody":0.000360} Apr 27 19:27:52 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-6hiwi1qyvfh34wlvmymp","level":"info","msg":"request","remoteAddr":"127.0.0.1:54640","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=410000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T19:27:52.063428278Z"} Apr 27 19:27:55 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-6hiwi1qyvfh34wlvmymp","level":"info","msg":"response","remoteAddr":"127.0.0.1:54640","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=410000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":641514,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T19:27:55.974670620Z","timeToStatus":3.910995,"timeTotal":3.911229,"timeWriteBody":0.000234} Apr 27 19:28:02 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-1j7nbd1ewfqhlgv2iovi","level":"info","msg":"request","remoteAddr":"127.0.0.1:54694","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/virtual_machines/tordo-2x53u-80c95rykdjur7t2/logins","reqQuery":"","time":"2021-04-27T19:28:02.995816068Z"} Apr 27 19:28:03 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-1j7nbd1ewfqhlgv2iovi","level":"info","msg":"response","remoteAddr":"127.0.0.1:54694","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/virtual_machines/tordo-2x53u-80c95rykdjur7t2/logins","reqQuery":"","respBytes":3607,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T19:28:03.018414151Z","timeToStatus":0.022430,"timeTotal":0.022587,"timeWriteBody":0.000157} Apr 27 19:28:06 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-dc19hj8glmvuftplt4mw","level":"info","msg":"request","remoteAddr":"127.0.0.1:54734","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/users/ce8i5-tpzed-xo2k4i24bjzwedw","reqQuery":"alt=json","time":"2021-04-27T19:28:06.884991519Z"} Apr 27 19:28:06 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-dc19hj8glmvuftplt4mw","level":"info","msg":"response","remoteAddr":"127.0.0.1:54734","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/users/ce8i5-tpzed-xo2k4i24bjzwedw","reqQuery":"alt=json","respBytes":617,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T19:28:06.949804361Z","timeToStatus":0.064797,"timeTotal":0.064806,"timeWriteBody":0.000010} Apr 27 19:28:06 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-azo21kkotxok4g3vcz5y","level":"info","msg":"request","remoteAddr":"127.0.0.1:54736","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=0\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23group%22%5D%2C+%5B%22groups.group_class%22%2C+%22in%22%2C+%5B%22project%22%2C+%22filter%22%5D%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T19:28:06.953170049Z"} Apr 27 19:28:07 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-azo21kkotxok4g3vcz5y","level":"info","msg":"response","remoteAddr":"127.0.0.1:54736","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=0\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23group%22%5D%2C+%5B%22groups.group_class%22%2C+%22in%22%2C+%5B%22project%22%2C+%22filter%22%5D%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":2619,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T19:28:07.011272128Z","timeToStatus":0.058071,"timeTotal":0.058094,"timeWriteBody":0.000023} Apr 27 19:28:07 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-mdcvwa12r64r1zsuvg07","level":"info","msg":"request","remoteAddr":"127.0.0.1:54738","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=0\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T19:28:07.014198930Z"} Apr 27 19:28:10 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-mdcvwa12r64r1zsuvg07","level":"info","msg":"response","remoteAddr":"127.0.0.1:54738","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=0\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":1034420,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T19:28:10.760701354Z","timeToStatus":3.746157,"timeTotal":3.746486,"timeWriteBody":0.000329} Apr 27 19:28:10 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-duri7xdwm0w19ta0uune","level":"info","msg":"request","remoteAddr":"127.0.0.1:54766","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=1000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T19:28:10.784075846Z"} Apr 27 19:28:14 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-duri7xdwm0w19ta0uune","level":"info","msg":"response","remoteAddr":"127.0.0.1:54766","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=1000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":1034240,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T19:28:14.562016992Z","timeToStatus":3.777498,"timeTotal":3.777929,"timeWriteBody":0.000431}
The corresponding arv-mount debug log for that period is attached as 20210427-arv-mount-debug.log
Updated by Ward Vandewege over 3 years ago
- Assigned To set to Ward Vandewege
- Status changed from New to In Progress
Updated by Ward Vandewege over 3 years ago
Updated by Peter Amstutz over 3 years ago
- Target version changed from 2021-04-28 bughunt sprint to 2021-05-12 sprint
Updated by Ward Vandewege over 3 years ago
Trying to track this down. On ce8i5 (where the bulk of the 400k+ collections are), I tried to use the experimental go fuse driver to list `home`.
This caused a api server rails process to misbehave in the group/contents endpoint: at ~100% of a core, it would consume more and more ram until the machine went OOM (or until I killed the process).
The rails log entry was
{"method":"GET","path":"/arvados/v1/groups/contents","format":"html","controller":"Arvados::V1::GroupsController","action":"contents","status":0,"duration":426683.8,"view":0.0,"db":13347.96,"request_id":"req-1ec6zkrd8szj47hjr6k2","client_ipaddr":"127.0.0.1","client_auth":"ce8i5-gj3su-a8lfaeci7rnrfxi","params":{"count":"none","filters":"[[\"uuid\",\"is_a\",\"arvados#collection\"]]","include":"","offset":"0","order":"[\"uuid\"]","uuid":"ce8i5-tpzed-xo2k4i24bjzwedw"},"@timestamp":"2021-05-04T13:11:51.441170640Z","@version":"1","message":"[0] GET /arvados/v1/groups/contents (Arvados::V1::GroupsController#contents)"}
and the corresponding controller entry:
May 04 13:09:44 api.ce8i5.arvadosapi.com arvados-controller[48567]: {"PID":48567,"RequestID":"req-1ec6zkrd8szj47hjr6k2","level":"info","msg":"response","remoteAddr":"127.0.0.1:47330","reqBytes":0,"reqForwardedFor":"52.177.187.26","reqHost":"ce8i5.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/ce8i5-tpzed-xo2k4i24bjzwedw/contents","reqQuery":"count=none\u0026filters=%5B%5B%22uuid%22%2C%22is_a%22%2C%22arvados%23collection%22%5D%5D\u0026order=uuid","respBody":"{\"errors\":[\"Get //railsapi.internalarvados/v1/groups/contents?count=none\\u0026filters=%5B%5B%22uuid%22%2C%22is_a%22%2C%22arvados%23collection%22%5D%5D\\u0026include=\\u0026offset=0\\u0026order=%5B%22uuid%22%5D\\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw: context deadline exceeded\"]}\n","respBytes":275,"respStatus":"Internal Server Error","respStatusCode":500,"time":"2021-05-04T13:09:44.747807296Z","timeToStatus":300.000900,"timeTotal":300.000940,"timeWriteBody":0.000040}
On the client, this caused:
ward@shell:~$ ls ackeep/home/ ls: reading directory 'ackeep/home/': Input/output error ward@shell:~$
In other words: ls didn't hang, and the `arvados-client mount` command didn't hang. Nice. The fuse driver process was still working fine, after this it responded to other requests normally. Very good.
Updated by Peter Amstutz over 3 years ago
- Blocks Idea #17512: Release Arvados 2.2 added
Updated by Peter Amstutz over 3 years ago
This seems wrong:
arv group contents -u ce8i5-tpzed-xo2k4i24bjzwedw -f '[["uuid","is_a","arvados#collection"]]' "items_available":212278, "kind":"arvados#objectList", "limit":100, "offset":0
arv collection list -f '[["owner_uuid", "=", "ce8i5-tpzed-xo2k4i24bjzwedw"]]' "items_available":210564, "kind":"arvados#collectionList", "limit":100, "offset":0
Updated by Peter Amstutz over 3 years ago
Default sorting is different. Hm.
$ arv --format=uuid collection list -f '[["owner_uuid", "=", "ce8i5-tpzed-xo2k4i24bjzwedw"]]' -l 10 tordo-4zz18-imj9n6fr54r0lmn tordo-4zz18-7ww56ff272db5ic tordo-4zz18-2yg8wum4runsoi7 tordo-4zz18-76xymxrcvegoz87 tordo-4zz18-0uankid7xv4wwem tordo-4zz18-sjfzx7pwqfj8jfq tordo-4zz18-6yr13gnvrpapo2s tordo-4zz18-sw0a55lm3qc009k tordo-4zz18-fk1k7siayjg9h5a tordo-4zz18-gzo062wzhhy8igf $ arv --format=uuid group contents -u ce8i5-tpzed-xo2k4i24bjzwedw -f '[["uuid","is_a","arvados#collection"]]' -l 10 tordo-4zz18-dale5t56h78135h tordo-4zz18-swrtkbdkctrk4af tordo-4zz18-ubfy3xu734t7ilc tordo-4zz18-imj9n6fr54r0lmn tordo-4zz18-7ww56ff272db5ic tordo-4zz18-2yg8wum4runsoi7 tordo-4zz18-76xymxrcvegoz87 tordo-4zz18-0uankid7xv4wwem tordo-4zz18-sjfzx7pwqfj8jfq tordo-4zz18-6yr13gnvrpapo2s
Updated by Peter Amstutz over 3 years ago
Nope I made the exact same mistake as you, it is --uuid
not -u
But still getting an unexpected result:
$ arv group contents --uuid ce8i5-tpzed-xo2k4i24bjzwedw -f '[["uuid","is_a","arvados#collection"]]' "items_available":421922, $ arv collection list -f '[["owner_uuid", "=", "ce8i5-tpzed-xo2k4i24bjzwedw"]]' "items_available":210776,
Updated by Peter Amstutz over 3 years ago
For me, using Workbench 2 to get the home project contents also fails.
It took me much longer to figure this out than it should, but what seems to be happening is that we're getting "Oops! Federation!"
$ echo $ARVADOS_API_HOST tordo.arvadosapi.com $ arv --format=uuid group contents --uuid ce8i5-tpzed-xo2k4i24bjzwedw -f '[["uuid","is_a","arvados#collection"]]' ce8i5-4zz18-z7cuqcr29f2nwv0 ce8i5-4zz18-d1fn4x60badlxgs ce8i5-4zz18-f1z6oh1aztmqqia ce8i5-4zz18-mhae76913cys0kr ...
I don't know exactly what controller is doing but presumably it is looking at the ce8i5 user uuid prefix and redirecting the request to ce8i5.
The original problem, where the listing never finishes, could possibly be due to items being added between requests, so the "items_available" target keeps rising.
Possibly useful, I added "arvados.util.keyset_list_all" which uses a different paging strategy than "list_all", which is much more gentle on the database, so large listings may complete faster.
With the "keyset_list_all" approach, you can use decending order by time, so it doesn't get stuck or mess up when things are added while it is trying to list everything.
Updated by Peter Amstutz over 3 years ago
func (conn *Conn) GroupContents(ctx context.Context, options arvados.GroupContentsOptions) (arvados.ObjectList, error) { return conn.chooseBackend(options.UUID).GroupContents(ctx, options) }
This (a) needs to special case federated users and default to local and (b) needs to accept "cluster_id" for the case where you actually do want to get the home contents of a user on a different cluster.
Updated by Peter Amstutz over 3 years ago
- Assigned To changed from Ward Vandewege to Peter Amstutz
Updated by Peter Amstutz over 3 years ago
17589-group-contents @ 19e9af391ebd6766200bc76c9d1ee90f167e43a7
Fix the group contents endpoint so that requests for user home directory go to local. Add test.
17589-fuse-use-keyset-paging @ fc0e9f360dd51c5c508426846f2caed9f1c2ec66 (includes the previous branch)
Switch fusedir.py to use keyset_list_all() which uses a more efficient and stable paging mechanism than list_all().
Updated by Peter Amstutz over 3 years ago
It seems that there's a problem in the underlying Ruby code for group contents. When we try to do a query that uses count=none and keyset paging (Go FUSE, and now arv-mount FUSE), the API server apparently OOMs and needs to be rebooted.
Updated by Peter Amstutz over 3 years ago
- Today I learned that group contents has an undocumented parameter called "last_object_class" which seems to provide a workaround to the paging problem where results are ordered within each object type (table) but not sorted across all results.
- even on arvbox, there is a dramatic request time difference between "group contents --count=exact" (150ms) and "group contents --count=none" (2500ms). This is very weird since "count=none" is intended to be an optimization.
Updated by Peter Amstutz over 3 years ago
- Target version changed from 2021-05-12 sprint to 2021-05-26 sprint
Updated by Peter Amstutz over 3 years ago
- Target version deleted (
2021-05-26 sprint)
# The call to object_list below will not populate :items_available in # its response, because count is disabled. Save @objects length (does # not require another db query) so that @offset (if set) is handled # correctly. countless_items_available = @objects.length
This comment is wrong. This does does require a db query. Calling @objects.length involves doing a query of the entire table, with all the joins, and no limit.
We don't want to do this, since it is functionally the same as the count=exact behavior. It is also completely unnecessary when offset=0.
On discussion with Tom, we decided that for count=none and nonzero offset, it can just fall back on the count=exact behavior.
17589-contents-count-none @ 9003633889b26bdd19c1376c69981aaaa21ee819
Updated by Peter Amstutz over 3 years ago
- Target version set to 2021-05-26 sprint
Updated by Ward Vandewege over 3 years ago
Peter Amstutz wrote:
[...]
This comment is wrong. This does does require a db query. Calling @objects.length involves doing a query of the entire table, with all the joins, and no limit.
Yeah, clearly I made a mistake, it's odd, I recall testing this and looking at the database queries.
We don't want to do this, since it is functionally the same as the count=exact behavior. It is also completely unnecessary when offset=0.
On discussion with Tom, we decided that for count=none and nonzero offset, it can just fall back on the count=exact behavior.
Yeah, that seems right.
17589-contents-count-none @ 9003633889b26bdd19c1376c69981aaaa21ee819
LGTM. This seems to include the 17589-group-contents and 17589-fuse-use-keyset-paging branches from https://dev.arvados.org/issues/17589#note-16 so those are good to merge too.
Updated by Peter Amstutz over 3 years ago
- Status changed from In Progress to Resolved
Updated by Peter Amstutz over 3 years ago
- Status changed from Resolved to Feedback
Did some testing after deployment to ce8i5.
I was able to list Ward's home project without crashing anything, but it took 58 minutes!
$ time ls | wc
426295 2094831 24104847
real 58m18.061s
user 0m5.189s
sys 0m0.163s
That means it had to do 427 API requests. Which is quite a lot, but something else is going on here. For some reason, even with count=none, it is very expensive to fetch each page:
{"PID":3703,"RequestID":"req-6zpplgutxcdq5cevhwj5","level":"info","msg":"response","remoteAddr":"127.0.0.1:49712","reqBytes":0,"reqForwardedFor":"67.189.170.63","reqHost":"ce8i5.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22%3E%3D%22%2C+%22ce8i5-4zz18-cma8av9gjz7orlg%22%5D%2C+%5B%22uuid%22%2C+%22%21%3D%22%2C+%22ce8i5-4zz18-cma8av9gjz7orlg%22%5D%2C+%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=1000\u0026count=none\u0026order=%5B%22uuid+asc%22%2C+%22uuid+asc%22%5D\u0026alt=json","respBytes":977268,"respStatus":"OK","respStatusCode":200,"time":"2021-05-13T17:25:49.144549510Z","timeToStatus":5.606698,"timeTotal":5.616120,"timeWriteBody":0.009422}
Need to get to the bottom of this.
Updated by Peter Amstutz over 3 years ago
$ time arv group contents --filters 'ce8i5-4zz18-cma8av9gjz7orlg%22%5D_%5B%22uuid%22_%22%3E=%22_%22ce8i5-4zz18-cma8av9gjz7orlg%22%5D_%5B%22uuid%22_%22is_a%22_%22arvados#collection">"uuid", "!=", "ce8i5-4zz18-cma8av9gjz7orlg"], ["uuid", ">=", "ce8i5-4zz18-cma8av9gjz7orlg"], ["uuid", "is_a", "arvados' --order "uuid asc, uuid asc" --count=none --limit 100
real 0m1.256s
user 0m0.240s
sys 0m0.017s
$ time arv group contents --filters 'ce8i5-4zz18-cma8av9gjz7orlg%22%5D_%5B%22uuid%22_%22%3E=%22_%22ce8i5-4zz18-cma8av9gjz7orlg%22%5D_%5B%22uuid%22_%22is_a%22_%22arvados#collection">"uuid", "!=", "ce8i5-4zz18-cma8av9gjz7orlg"], ["uuid", ">=", "ce8i5-4zz18-cma8av9gjz7orlg"], ["uuid", "is_a", "arvados' --order "uuid asc, uuid asc" --count=none --limit 1000
real 0m10.880s
user 0m0.300s
sys 0m0.069s
Using count=none is an improvement over count=exact:
$ time arv group contents --filters 'ce8i5-4zz18-cma8av9gjz7orlg%22%5D_%5B%22uuid%22_%22%3E=%22_%22ce8i5-4zz18-cma8av9gjz7orlg%22%5D_%5B%22uuid%22_%22is_a%22_%22arvados#collection">"uuid", "!=", "ce8i5-4zz18-cma8av9gjz7orlg"], ["uuid", ">=", "ce8i5-4zz18-cma8av9gjz7orlg"], ["uuid", "is_a", "arvados' --order "uuid asc, uuid asc" --count=exact --limit 100
real 0m2.862s
user 0m0.244s
sys 0m0.028s
Updated by Peter Amstutz over 3 years ago
I don't know what exactly it is doing, but retrieving 1000 rows takes 57ms of database time, but the whole request takes 2695ms because of whatever it is doing in Ruby:
{"method":"GET","path":"/arvados/v1/groups/contents","format":"html","controller":"Arvados::V1::GroupsController","action":"contents","status":200,"duration":2695.01,"v
iew":0.39,"db":56.99,"request_id":"req-da7oqupeflmrlipsjsp2","client_ipaddr":"127.0.0.1","client_auth":"ce8i5-gj3su-b8imt9ab31yacmz","params":{"cluster_id":"","count":"
none","filters":"ce8i5-4zz18-zschyoep2a1em49%5C%22%5D%5B%5C%22uuid%5C%22%5C%22!=%5C%22%5C%22ce8i5-4zz18-zschyoep2a1em49%5C%22%5D%5B%5C%22uuid%5C%22%5C%22is_a%5C%22%5C%22arvados#collection">\"uuid\",\"\\u003e=\",\"ce8i5-4zz18-zschyoep2a1em49\"],[\"uuid\",\"!=\",\"ce8i5-4zz18-zschyoep2a1em49\"],[\"uuid\",\"is_a\",\"arvados"
,"include":"","limit":"1000","offset":"0","order":"[\"uuid asc\",\"uuid asc\"]","uuid":"ce8i5-tpzed-xo2k4i24bjzwedw"},"@timestamp":"2021-05-13T17:49:38.262840464Z","@ve
rsion":"1","message":"[200] GET /arvados/v1/groups/contents (Arvados::V1::GroupsController#contents)"}
Updated by Peter Amstutz over 3 years ago
Peter Amstutz wrote:
I don't know what exactly it is doing, but retrieving 1000 rows takes 57ms of database time, but the whole request takes 2695ms because of whatever it is doing in Ruby:
{"method":"GET","path":"/arvados/v1/groups/contents","format":"html","controller":"Arvados::V1::GroupsController","action":"contents","status":200,"duration":2695.01,"v
iew":0.39,"db":56.99,"request_id":"req-da7oqupeflmrlipsjsp2","client_ipaddr":"127.0.0.1","client_auth":"ce8i5-gj3su-b8imt9ab31yacmz","params":{"cluster_id":"","count":"
none","filters":"ce8i5-4zz18-zschyoep2a1em49%5C%22%5D%5B%5C%22uuid%5C%22%5C%22!=%5C%22%5C%22ce8i5-4zz18-zschyoep2a1em49%5C%22%5D%5B%5C%22uuid%5C%22%5C%22is_a%5C%22%5C%22arvados#collection">\"uuid\",\"\\u003e=\",\"ce8i5-4zz18-zschyoep2a1em49\"],[\"uuid\",\"!=\",\"ce8i5-4zz18-zschyoep2a1em49\"],[\"uuid\",\"is_a\",\"arvados"
,"include":"","limit":"1000","offset":"0","order":"[\"uuid asc\",\"uuid asc\"]","uuid":"ce8i5-tpzed-xo2k4i24bjzwedw"},"@timestamp":"2021-05-13T17:49:38.262840464Z","@ve
rsion":"1","message":"[200] GET /arvados/v1/groups/contents (Arvados::V1::GroupsController#contents)"}
I just realized this is probably a silly test case because the uuid starts with "zs..." and probably is close to the last one in the database.
I have moved this discussion over to https://dev.arvados.org/issues/17674
Updated by Peter Amstutz over 3 years ago
- Status changed from Feedback to Resolved