Bug #17589

[arv-mount] hangs on ls keep/home/, slowly increasing memory

Added by Ward Vandewege 16 days ago. Updated about 6 hours ago.

Status:
In Progress
Priority:
Normal
Assigned To:
Category:
FUSE
Target version:
Start date:
05/11/2021
Due date:
% Done:

0%

Estimated time:
(Total: 0.00 h)
Story points:
-

Description

This is happening on Tordo (AWS test cluster), at version 2.2.0~dev20210426141415-1. Tordo is in a LoginCluster federation with ce8i5 (Azure test cluster) as the login cluster.

As a superuser, on the tordo shell node, I mounted keep with `arv-mount keep --debug --foreground`. I then ran `ls keep/home`. Workbench2 says that there are 52074 items in my home project.

The `ls` command hangs indefinitely, with `arv-mount` doing seemingly nothing. The output from arv-mount is just this:

unique: 10, opcode: LOOKUP (1), nodeid: 1, insize: 45, pid: 29513
2021-04-27 01:35:09 arvados.arvados_fuse[29440] DEBUG: arv-mount lookup: parent_inode 1 name 'home' inode 5
   unique: 10, success, outsize: 144
unique: 11, opcode: LOOKUP (1), nodeid: 1, insize: 45, pid: 29513
2021-04-27 01:35:09 arvados.arvados_fuse[29440] DEBUG: arv-mount lookup: parent_inode 1 name 'home' inode 5
   unique: 11, success, outsize: 144
unique: 12, opcode: OPENDIR (27), nodeid: 5, insize: 48, pid: 29513
2021-04-27 01:35:09 arvados.arvados_fuse[29440] DEBUG: arv-mount opendir: inode 5
2021-04-27 01:35:09 arvados.arvados_fuse[29440] DEBUG: InodeCache touched inode 5 (size 0) (uuid ce8i5-tpzed-xo2k4i24bjzwedw) total now 0 (1 entries)
2021-04-27 01:35:09 arvados.arvados_fuse[29440] DEBUG: InodeCache touched inode 8 (size 0) (uuid ce8i5-tpzed-xo2k4i24bjzwedw) total now 0 (2 entries)

(no further output)

arv-mount slowly consumes more and more memory until the OOM killer takes it out.

While in this state, with `ls` hanging, accessing specific collections via `keep/by_id` works just fine.

20210427-arv-mount-debug.log (3.5 MB) 20210427-arv-mount-debug.log Ward Vandewege, 04/27/2021 07:40 PM

Subtasks

Task #17659: Review 17589-group-contentsIn ProgressWard Vandewege

Task #17661: Review 17589-fuse-use-keyset-paging In ProgressWard Vandewege

Task #17672: Review 17589-contents-count-noneIn ProgressWard Vandewege


Related issues

Blocks Arvados - Story #17512: Release Arvados 2.2New05/03/2021

History

#1 Updated by Ward Vandewege 16 days ago

  • Description updated (diff)

#2 Updated by Ward Vandewege 16 days ago

  • Category set to CWL
  • Description updated (diff)

#3 Updated by Ward Vandewege 16 days ago

  • Category changed from CWL to FUSE

#4 Updated by Ward Vandewege 16 days ago

The controller logs show a long list of group/contents requests, with an ever-increasing offset.

The manual query indicates that there should be 199695 items available.

ward@shell:~$ arv group contents -u ce8i5-tpzed-xo2k4i24bjzwedw -f '[["uuid","is_a","arvados#collection"]]'
...
 "items_available":199695,
 "kind":"arvados#objectList",
 "limit":100,
 "offset":0
}

But somehow arv-mount just keeps going, and it keeps getting data back:

...
Apr 27 01:51:30 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-wur4oz2c6j9gtne2oyzy","level":"info","msg":"request","remoteAddr":"127.0.0.1:60938","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=248000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T01:51:30.771006945Z"}
Apr 27 01:51:34 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-wur4oz2c6j9gtne2oyzy","level":"info","msg":"response","remoteAddr":"127.0.0.1:60938","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=248000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":979327,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T01:51:34.967312364Z","timeToStatus":4.196000,"timeTotal":4.196293,"timeWriteBody":0.000293}
Apr 27 01:51:34 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-ndhd2y6sbt70gf8rzr96","level":"info","msg":"request","remoteAddr":"127.0.0.1:60946","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=249000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T01:51:34.986794610Z"}
Apr 27 01:51:39 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-ndhd2y6sbt70gf8rzr96","level":"info","msg":"response","remoteAddr":"127.0.0.1:60946","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=249000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":978985,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T01:51:39.069045322Z","timeToStatus":4.081946,"timeTotal":4.082238,"timeWriteBody":0.000292}
...

Trying to reproduce this with the arv cli leads to a different request in the logs:

Apr 27 01:54:06 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-8l8zg5vzbsoqq0hhft2v","level":"info","msg":"request","remoteAddr":"127.0.0.1:32906","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"_profile=true\u0026bypass_federation=false\u0026cluster_id=ce8i5-tpzed-xo2k4i24bjzwedw\u0026cluster_id_given=true\u0026count=exact\u0026distinct=false\u0026filters=%5B%5B%22uuid%22%2C%22is_a%22%2C%22arvados%23collection%22%5D%5D\u0026filters_given=true\u0026help=false\u0026include=\u0026include_old_versions=false\u0026include_trash=false\u0026limit=9223372036854775807\u0026limit_given=true\u0026offset=249000\u0026offset_given=true\u0026order=\u0026recursive=false\u0026uuid=\u0026where=","time":"2021-04-27T01:54:06.414215153Z"}
Apr 27 01:54:06 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-8l8zg5vzbsoqq0hhft2v","level":"info","msg":"response","remoteAddr":"127.0.0.1:32906","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"_profile=true\u0026bypass_federation=false\u0026cluster_id=ce8i5-tpzed-xo2k4i24bjzwedw\u0026cluster_id_given=true\u0026count=exact\u0026distinct=false\u0026filters=%5B%5B%22uuid%22%2C%22is_a%22%2C%22arvados%23collection%22%5D%5D\u0026filters_given=true\u0026help=false\u0026include=\u0026include_old_versions=false\u0026include_trash=false\u0026limit=9223372036854775807\u0026limit_given=true\u0026offset=249000\u0026offset_given=true\u0026order=\u0026recursive=false\u0026uuid=\u0026where=","respBytes":109,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T01:54:06.918316347Z","timeToStatus":0.504083,"timeTotal":0.504092,"timeWriteBody":0.000009}

and also different results:

$ arv group contents -u ce8i5-tpzed-xo2k4i24bjzwedw -f '[["uuid","is_a","arvados#collection"]]' -s 249000 -l 9223372036854775807
{
 "included":[],
 "items":[],
 "items_available":199695,
 "kind":"arvados#objectList",
 "limit":1000,
 "offset":249000
}

This result is as expected; there is something about the arv-mount request that makes it misbehave.

Hmm. I recreated the arv mount request with curl, and that showed that the difference in request parameters results in a difference in items available:

curl -G -sH "Authorization: Bearer OMITTED" \
        https://tordo.arvadosapi.com/arvados/v1/groups/contents \
        --data-urlencode "offset=249000" \
        --data-urlencode "uuid=ce8i5-tpzed-xo2k4i24bjzwedw" \
        --data-urlencode 'filters=[["uuid","is_a","arvados#collection"]]' \
        --data-urlencode "limit=9223372036854775807" \
        --data-urlencode "alt=json" 

...
      "trash_at": null,
      "uuid": "ce8i5-4zz18-hkn8zfxr23dkw2l",
      "version": 1
    }
  ],
  "items_available": 410020,
  "kind": "arvados#objectList",
  "limit": 1000,
  "offset": 249000
}

I went back through the logs and noticed that my test from yesterday did stop after asking for the 410000 batch, and the response was small which lines up with it receiving the last 20 records:

Apr 27 02:32:33 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-4bump559x19zfo46cv2i","level":"info","msg":"request","remoteAddr":"127.0.0.1:37416","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=410000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T02:32:33.370600217Z"}
Apr 27 02:32:35 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-4bump559x19zfo46cv2i","level":"info","msg":"response","remoteAddr":"127.0.0.1:37416","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=410000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":18864,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T02:32:35.591036450Z","timeToStatus":2.220103,"timeTotal":2.220430,"timeWriteBody":0.000327}

So... maybe the groups endpoint is working as expected. Two questions remain:

a) Why is arv-mount doing a request that returns 410,020 items available, while the roughly equivalent version with `arv` returns 199,695 items?

b) why is arv-mount so terribly slow?

Tackling a)

Let's find out why the arv-mount call returns so many more results. Replicating it with a json request to mimic the `arv` command:

curl -G -sH "Authorization: Bearer OMITTED" \
    https://tordo.arvadosapi.com/arvados/v1/groups/contents \
    --data-urlencode "offset=249000" \
    --data-urlencode "uuid=ce8i5-tpzed-xo2k4i24bjzwedw" \
    --data-urlencode 'filters=[["uuid","is_a","arvados#collection"]]' \
    --data-urlencode "limit=9223372036854775807" \
    --data-urlencode "include_trash=false" \
    --data-urlencode "cluster_id=ce8i5-tpzed-xo2k4i24bjzwedw" \
    --data-urlencode "distinct=false" \
    --data-urlencode "include_old_versions=false" \
    --data-urlencode "recursive=false" \
    --data-urlencode "alt=json" 
Apr 27 13:20:17 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-18suuu9bl5ndjmmd55v9","level":"info","msg":"request","remoteAddr":"127.0.0.1:38400","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=249000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C%22is_a%22%2C%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026include_trash=false\u0026cluster_id=ce8i5-tpzed-xo2k4i24bjzwedw\u0026distinct=false\u0026include_old_versions=false\u0026recursive=false\u0026alt=json","time":"2021-04-27T13:20:17.329271974Z"}
Apr 27 13:20:19 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-18suuu9bl5ndjmmd55v9","level":"info","msg":"response","remoteAddr":"127.0.0.1:38400","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=249000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C%22is_a%22%2C%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026include_trash=false\u0026cluster_id=ce8i5-tpzed-xo2k4i24bjzwedw\u0026distinct=false\u0026include_old_versions=false\u0026recursive=false\u0026alt=json","respBytes":978985,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T13:20:19.839597233Z","timeToStatus":2.509989,"timeTotal":2.510312,"timeWriteBody":0.000323}

But that still yields 410,020 items available:

...
      "trash_at": null,
      "uuid": "ce8i5-4zz18-hkn8zfxr23dkw2l",
      "version": 1
    }
  ],
  "items_available": 410020,
  "kind": "arvados#objectList",
  "limit": 1000,
  "offset": 249000
}

Ah. The difference is that `arv_mount` sends a `uuid` parameter with the uuid of home project, whereas `arv group contents` sends `cluster_id` with that uuid value, and sends an empty `uuid` field:

curl -G -sH "Authorization: Bearer OMITTED" \
    https://tordo.arvadosapi.com/arvados/v1/groups/contents \
    --data-urlencode "offset=249000" \
    --data-urlencode "cluster_id=ce8i5-tpzed-xo2k4i24bjzwedw" \
    --data-urlencode 'filters=[["uuid","is_a","arvados#collection"]]' \
    --data-urlencode "limit=9223372036854775807" \
    --data-urlencode "include_trash=false" \
    --data-urlencode "cluster_id=ce8i5-tpzed-xo2k4i24bjzwedw" \
    --data-urlencode "distinct=false" \
    --data-urlencode "include_old_versions=false" \
    --data-urlencode "recursive=false" \
    --data-urlencode "alt=json" 
...
{
  "included": [],
  "items": [],
  "items_available": 199695,
  "kind": "arvados#objectList",
  "limit": 1000,
  "offset": 249000
}

Hrm, it looks like our gems on tordo are out of date:

ward@shell:~$ gem list |grep arv
arvados (2.0.4, 1.3.3.20190320201707)
arvados-cli (2.0.4)
arvados-google-api-client (0.8.7.4)
arvados-login-sync (2.0.4)

I fixed that:

shell.tordo:~# gem list|grep arv
arvados (2.1.2, 1.3.3.20190320201707)
arvados-cli (2.1.2)
arvados-google-api-client (0.8.7.4)
arvados-login-sync (2.1.2)

(the arvados gem version 1.3.3.20190320201707 is installed because of an outdated dependency in the arvados-login-sync gem, I just pushed a fix for that in 24b0875964b3eff98c12d1c135d8797efcfabfb2).

The updated version of `arv` doesn't make a difference though:

ward@shell:~$ arv group contents -u ce8i5-tpzed-xo2k4i24bjzwedw -f '[["uuid","is_a","arvados#collection"]]' |tail
   "trash_at":null,
   "uuid":"tordo-4zz18-13ok9k86ne7rpoo",
   "version":1
  }
 ],
 "items_available":199695,
 "kind":"arvados#objectList",
 "limit":100,
 "offset":0
}

Ah, well, duh, that was totally PEBKAC. The cli arguments for `arv` are not very intuitive. `-u` is for cluster_id and `--uuid` is what I want:

ward@shell:~$ arv group contents --uuid ce8i5-tpzed-xo2k4i24bjzwedw -f '[["uuid","is_a","arvados#collection"]]' |tail
   "trash_at":null,
   "uuid":"ce8i5-4zz18-qfyy9hyriqbwobm",
   "version":1
  }
 ],
 "items_available":410020,
 "kind":"arvados#objectList",
 "limit":100,
 "offset":0
}

Conclusion for a) is that the "approximate" equivalent `arv group contents` command was not equivalent enough. `arv-mount` is doing the right thing.


Tackling b) "why is arv-mount so slow"

It's getting 410,020 items to show. It's requesting 1,000 at a time. It's doing that serially. The controller logs show that the slowness is actually in getting the api server response:

Apr 27 18:04:21 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-zwrn6a8vovar784akvmp","level":"info","msg":"request","remoteAddr":"127.0.0.1:45342","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=115000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:04:21.061118586Z"}
Apr 27 18:04:27 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-zwrn6a8vovar784akvmp","level":"info","msg":"response","remoteAddr":"127.0.0.1:45342","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=115000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":978468,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:04:27.404367871Z","timeToStatus":6.342963,"timeTotal":6.343237,"timeWriteBody":0.000274}
Apr 27 18:04:27 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-gi4vdr6jjqw2posxa5sc","level":"info","msg":"request","remoteAddr":"127.0.0.1:45656","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=116000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:04:27.425531548Z"}
Apr 27 18:04:33 tordo.arvadosapi.com arvados-controller[6632]: {"PID":6632,"RequestID":"req-gi4vdr6jjqw2posxa5sc","level":"info","msg":"response","remoteAddr":"127.0.0.1:45656","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=116000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":979259,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:04:33.496201723Z","timeToStatus":6.070386,"timeTotal":6.070660,"timeWriteBody":0.000274}

The delay between request and response for the request with offset 115,000 is 6 seconds. The request for offset 116,000 follows in less than a second.

Hmm. It seems to get the list more than once?

Apr 27 18:22:35 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-ro0y1t0a4iaxgpqsaoba","level":"info","msg":"request","remoteAddr":"127.0.0.1:56958","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=409000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:22:35.305997716Z"}
Apr 27 18:22:39 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-ro0y1t0a4iaxgpqsaoba","level":"info","msg":"response","remoteAddr":"127.0.0.1:56958","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=409000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":951213,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:22:39.410794483Z","timeToStatus":4.101220,"timeTotal":4.104786,"timeWriteBody":0.003566}
Apr 27 18:22:39 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-1oug9ifewazn0k684q0g","level":"info","msg":"request","remoteAddr":"127.0.0.1:56990","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=410000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:22:39.425739879Z"}
Apr 27 18:22:41 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-1oug9ifewazn0k684q0g","level":"info","msg":"response","remoteAddr":"127.0.0.1:56990","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=410000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":293739,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:22:41.269223334Z","timeToStatus":1.843310,"timeTotal":1.843468,"timeWriteBody":0.000158}
Apr 27 18:23:29 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-8k0i9slizvrbrmw653uv","level":"info","msg":"request","remoteAddr":"127.0.0.1:57458","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/users/ce8i5-tpzed-xo2k4i24bjzwedw","reqQuery":"alt=json","time":"2021-04-27T18:23:29.707199973Z"}
Apr 27 18:23:29 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-8k0i9slizvrbrmw653uv","level":"info","msg":"response","remoteAddr":"127.0.0.1:57458","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/users/ce8i5-tpzed-xo2k4i24bjzwedw","reqQuery":"alt=json","respBytes":617,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:23:29.772947870Z","timeToStatus":0.065729,"timeTotal":0.065739,"timeWriteBody":0.000010}
Apr 27 18:23:29 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-w2f9j09wxy5tikzsqzeh","level":"info","msg":"request","remoteAddr":"127.0.0.1:57460","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=0\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23group%22%5D%2C+%5B%22groups.group_class%22%2C+%22in%22%2C+%5B%22project%22%2C+%22filter%22%5D%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:23:29.776418908Z"}
Apr 27 18:23:29 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-w2f9j09wxy5tikzsqzeh","level":"info","msg":"response","remoteAddr":"127.0.0.1:57460","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=0\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23group%22%5D%2C+%5B%22groups.group_class%22%2C+%22in%22%2C+%5B%22project%22%2C+%22filter%22%5D%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":2619,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:23:29.852613390Z","timeToStatus":0.076160,"timeTotal":0.076186,"timeWriteBody":0.000026}
Apr 27 18:23:29 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-n20y8tcbswoteyp5vl31","level":"info","msg":"request","remoteAddr":"127.0.0.1:57462","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=0\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:23:29.855664034Z"}
Apr 27 18:23:34 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-n20y8tcbswoteyp5vl31","level":"info","msg":"response","remoteAddr":"127.0.0.1:57462","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=0\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":1035365,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:23:34.883242160Z","timeToStatus":5.027091,"timeTotal":5.027561,"timeWriteBody":0.000470}
Apr 27 18:23:34 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-z5v3as9fh354p9e0h44m","level":"info","msg":"request","remoteAddr":"127.0.0.1:57540","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=1000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:23:34.901086368Z"}
Apr 27 18:23:38 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-z5v3as9fh354p9e0h44m","level":"info","msg":"response","remoteAddr":"127.0.0.1:57540","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=1000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":1034702,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:23:38.490090750Z","timeToStatus":3.588468,"timeTotal":3.588967,"timeWriteBody":0.000499}
Apr 27 18:23:38 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-9v2sx9irye3kt73lawp2","level":"info","msg":"request","remoteAddr":"127.0.0.1:57548","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=2000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:23:38.506781205Z"}
Apr 27 18:23:42 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-9v2sx9irye3kt73lawp2","level":"info","msg":"response","remoteAddr":"127.0.0.1:57548","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=2000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":1034106,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:23:42.873203307Z","timeToStatus":4.363532,"timeTotal":4.366400,"timeWriteBody":0.002869}

Wait, and again?

Apr 27 18:56:28 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-5ghnybu6c2338w40vp76","level":"info","msg":"request","remoteAddr":"127.0.0.1:40960","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=410000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:56:28.591156111Z"}
Apr 27 18:56:32 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-5ghnybu6c2338w40vp76","level":"info","msg":"response","remoteAddr":"127.0.0.1:40960","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=410000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":508538,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:56:32.610668168Z","timeToStatus":4.019278,"timeTotal":4.019497,"timeWriteBody":0.000219}
Apr 27 18:56:32 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-ls908p4sveqsmhbkvzsi","level":"info","msg":"request","remoteAddr":"127.0.0.1:41004","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=410537\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:56:32.624584396Z"}
Apr 27 18:56:35 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-ls908p4sveqsmhbkvzsi","level":"info","msg":"response","remoteAddr":"127.0.0.1:41004","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=410537\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":1039,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:56:35.356290294Z","timeToStatus":2.731689,"timeTotal":2.731699,"timeWriteBody":0.000010}
Apr 27 18:56:44 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-cb3u4lfpxn1szrj16adt","level":"info","msg":"request","remoteAddr":"127.0.0.1:41156","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/users/ce8i5-tpzed-xo2k4i24bjzwedw","reqQuery":"alt=json","time":"2021-04-27T18:56:44.479388280Z"}
Apr 27 18:56:44 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-cb3u4lfpxn1szrj16adt","level":"info","msg":"response","remoteAddr":"127.0.0.1:41156","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/users/ce8i5-tpzed-xo2k4i24bjzwedw","reqQuery":"alt=json","respBytes":617,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:56:44.534287931Z","timeToStatus":0.054884,"timeTotal":0.054893,"timeWriteBody":0.000009}
Apr 27 18:56:44 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-97hz7hlguiighxkuo707","level":"info","msg":"request","remoteAddr":"127.0.0.1:41158","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=0\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23group%22%5D%2C+%5B%22groups.group_class%22%2C+%22in%22%2C+%5B%22project%22%2C+%22filter%22%5D%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:56:44.537669320Z"}
Apr 27 18:56:44 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-97hz7hlguiighxkuo707","level":"info","msg":"response","remoteAddr":"127.0.0.1:41158","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=0\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23group%22%5D%2C+%5B%22groups.group_class%22%2C+%22in%22%2C+%5B%22project%22%2C+%22filter%22%5D%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":2619,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:56:44.595933349Z","timeToStatus":0.058234,"timeTotal":0.058256,"timeWriteBody":0.000023}
Apr 27 18:56:44 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-3b5s0uyy13qh8898c5kd","level":"info","msg":"request","remoteAddr":"127.0.0.1:41160","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=0\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:56:44.598972874Z"}
Apr 27 18:56:48 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-3b5s0uyy13qh8898c5kd","level":"info","msg":"response","remoteAddr":"127.0.0.1:41160","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=0\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":1034857,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:56:48.424530662Z","timeToStatus":3.825031,"timeTotal":3.825537,"timeWriteBody":0.000506}
Apr 27 18:56:48 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-z8b95n1s1hk0vvkzeyg5","level":"info","msg":"request","remoteAddr":"127.0.0.1:41162","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=1000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T18:56:48.438946395Z"}
Apr 27 18:56:52 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-z8b95n1s1hk0vvkzeyg5","level":"info","msg":"response","remoteAddr":"127.0.0.1:41162","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=1000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":1034602,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T18:56:52.048478363Z","timeToStatus":3.609040,"timeTotal":3.609521,"timeWriteBody":0.000481}

So there is really a bug here.

Again:

Apr 27 19:27:47 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-xli9lxyliy5spevpeyl8","level":"info","msg":"request","remoteAddr":"127.0.0.1:54620","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=409000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T19:27:47.240876489Z"}
Apr 27 19:27:52 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-xli9lxyliy5spevpeyl8","level":"info","msg":"response","remoteAddr":"127.0.0.1:54620","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=409000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":952550,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T19:27:52.033331703Z","timeToStatus":4.792080,"timeTotal":4.792440,"timeWriteBody":0.000360}
Apr 27 19:27:52 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-6hiwi1qyvfh34wlvmymp","level":"info","msg":"request","remoteAddr":"127.0.0.1:54640","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=410000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T19:27:52.063428278Z"}
Apr 27 19:27:55 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-6hiwi1qyvfh34wlvmymp","level":"info","msg":"response","remoteAddr":"127.0.0.1:54640","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=410000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":641514,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T19:27:55.974670620Z","timeToStatus":3.910995,"timeTotal":3.911229,"timeWriteBody":0.000234}
Apr 27 19:28:02 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-1j7nbd1ewfqhlgv2iovi","level":"info","msg":"request","remoteAddr":"127.0.0.1:54694","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/virtual_machines/tordo-2x53u-80c95rykdjur7t2/logins","reqQuery":"","time":"2021-04-27T19:28:02.995816068Z"}
Apr 27 19:28:03 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-1j7nbd1ewfqhlgv2iovi","level":"info","msg":"response","remoteAddr":"127.0.0.1:54694","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/virtual_machines/tordo-2x53u-80c95rykdjur7t2/logins","reqQuery":"","respBytes":3607,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T19:28:03.018414151Z","timeToStatus":0.022430,"timeTotal":0.022587,"timeWriteBody":0.000157}
Apr 27 19:28:06 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-dc19hj8glmvuftplt4mw","level":"info","msg":"request","remoteAddr":"127.0.0.1:54734","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/users/ce8i5-tpzed-xo2k4i24bjzwedw","reqQuery":"alt=json","time":"2021-04-27T19:28:06.884991519Z"}
Apr 27 19:28:06 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-dc19hj8glmvuftplt4mw","level":"info","msg":"response","remoteAddr":"127.0.0.1:54734","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/users/ce8i5-tpzed-xo2k4i24bjzwedw","reqQuery":"alt=json","respBytes":617,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T19:28:06.949804361Z","timeToStatus":0.064797,"timeTotal":0.064806,"timeWriteBody":0.000010}
Apr 27 19:28:06 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-azo21kkotxok4g3vcz5y","level":"info","msg":"request","remoteAddr":"127.0.0.1:54736","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=0\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23group%22%5D%2C+%5B%22groups.group_class%22%2C+%22in%22%2C+%5B%22project%22%2C+%22filter%22%5D%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T19:28:06.953170049Z"}
Apr 27 19:28:07 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-azo21kkotxok4g3vcz5y","level":"info","msg":"response","remoteAddr":"127.0.0.1:54736","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=0\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23group%22%5D%2C+%5B%22groups.group_class%22%2C+%22in%22%2C+%5B%22project%22%2C+%22filter%22%5D%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":2619,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T19:28:07.011272128Z","timeToStatus":0.058071,"timeTotal":0.058094,"timeWriteBody":0.000023}
Apr 27 19:28:07 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-mdcvwa12r64r1zsuvg07","level":"info","msg":"request","remoteAddr":"127.0.0.1:54738","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=0\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T19:28:07.014198930Z"}
Apr 27 19:28:10 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-mdcvwa12r64r1zsuvg07","level":"info","msg":"response","remoteAddr":"127.0.0.1:54738","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=0\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":1034420,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T19:28:10.760701354Z","timeToStatus":3.746157,"timeTotal":3.746486,"timeWriteBody":0.000329}
Apr 27 19:28:10 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-duri7xdwm0w19ta0uune","level":"info","msg":"request","remoteAddr":"127.0.0.1:54766","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=1000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","time":"2021-04-27T19:28:10.784075846Z"}
Apr 27 19:28:14 tordo.arvadosapi.com arvados-controller[32292]: {"PID":32292,"RequestID":"req-duri7xdwm0w19ta0uune","level":"info","msg":"response","remoteAddr":"127.0.0.1:54766","reqBytes":0,"reqForwardedFor":"10.253.0.197","reqHost":"tordo.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/contents","reqQuery":"offset=1000\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw\u0026filters=%5B%5B%22uuid%22%2C+%22is_a%22%2C+%22arvados%23collection%22%5D%5D\u0026limit=9223372036854775807\u0026alt=json","respBytes":1034240,"respStatus":"OK","respStatusCode":200,"time":"2021-04-27T19:28:14.562016992Z","timeToStatus":3.777498,"timeTotal":3.777929,"timeWriteBody":0.000431}

The corresponding arv-mount debug log for that period is attached as 20210427-arv-mount-debug.log

#5 Updated by Ward Vandewege 16 days ago

  • Assigned To set to Ward Vandewege
  • Status changed from New to In Progress

#7 Updated by Peter Amstutz 15 days ago

  • Target version changed from 2021-04-28 bughunt sprint to 2021-05-12 sprint

#8 Updated by Ward Vandewege 9 days ago

Trying to track this down. On ce8i5 (where the bulk of the 400k+ collections are), I tried to use the experimental go fuse driver to list `home`.

This caused a api server rails process to misbehave in the group/contents endpoint: at ~100% of a core, it would consume more and more ram until the machine went OOM (or until I killed the process).

The rails log entry was

{"method":"GET","path":"/arvados/v1/groups/contents","format":"html","controller":"Arvados::V1::GroupsController","action":"contents","status":0,"duration":426683.8,"view":0.0,"db":13347.96,"request_id":"req-1ec6zkrd8szj47hjr6k2","client_ipaddr":"127.0.0.1","client_auth":"ce8i5-gj3su-a8lfaeci7rnrfxi","params":{"count":"none","filters":"[[\"uuid\",\"is_a\",\"arvados#collection\"]]","include":"","offset":"0","order":"[\"uuid\"]","uuid":"ce8i5-tpzed-xo2k4i24bjzwedw"},"@timestamp":"2021-05-04T13:11:51.441170640Z","@version":"1","message":"[0] GET /arvados/v1/groups/contents (Arvados::V1::GroupsController#contents)"}

and the corresponding controller entry:

May 04 13:09:44 api.ce8i5.arvadosapi.com arvados-controller[48567]: {"PID":48567,"RequestID":"req-1ec6zkrd8szj47hjr6k2","level":"info","msg":"response","remoteAddr":"127.0.0.1:47330","reqBytes":0,"reqForwardedFor":"52.177.187.26","reqHost":"ce8i5.arvadosapi.com","reqMethod":"GET","reqPath":"arvados/v1/groups/ce8i5-tpzed-xo2k4i24bjzwedw/contents","reqQuery":"count=none\u0026filters=%5B%5B%22uuid%22%2C%22is_a%22%2C%22arvados%23collection%22%5D%5D\u0026order=uuid","respBody":"{\"errors\":[\"Get //railsapi.internalarvados/v1/groups/contents?count=none\\u0026filters=%5B%5B%22uuid%22%2C%22is_a%22%2C%22arvados%23collection%22%5D%5D\\u0026include=\\u0026offset=0\\u0026order=%5B%22uuid%22%5D\\u0026uuid=ce8i5-tpzed-xo2k4i24bjzwedw: context deadline exceeded\"]}\n","respBytes":275,"respStatus":"Internal Server Error","respStatusCode":500,"time":"2021-05-04T13:09:44.747807296Z","timeToStatus":300.000900,"timeTotal":300.000940,"timeWriteBody":0.000040}

On the client, this caused:

ward@shell:~$ ls ackeep/home/
ls: reading directory 'ackeep/home/': Input/output error
ward@shell:~$

In other words: ls didn't hang, and the `arvados-client mount` command didn't hang. Nice. The fuse driver process was still working fine, after this it responded to other requests normally. Very good.

#9 Updated by Peter Amstutz 9 days ago

#10 Updated by Peter Amstutz 2 days ago

This seems wrong:

arv group contents -u ce8i5-tpzed-xo2k4i24bjzwedw -f '[["uuid","is_a","arvados#collection"]]'
 "items_available":212278,
 "kind":"arvados#objectList",
 "limit":100,
 "offset":0
arv collection list -f '[["owner_uuid", "=", "ce8i5-tpzed-xo2k4i24bjzwedw"]]'
 "items_available":210564,
 "kind":"arvados#collectionList",
 "limit":100,
 "offset":0

#11 Updated by Peter Amstutz 2 days ago

Default sorting is different. Hm.

$ arv --format=uuid collection list -f '[["owner_uuid", "=", "ce8i5-tpzed-xo2k4i24bjzwedw"]]' -l 10
tordo-4zz18-imj9n6fr54r0lmn
tordo-4zz18-7ww56ff272db5ic
tordo-4zz18-2yg8wum4runsoi7
tordo-4zz18-76xymxrcvegoz87
tordo-4zz18-0uankid7xv4wwem
tordo-4zz18-sjfzx7pwqfj8jfq
tordo-4zz18-6yr13gnvrpapo2s
tordo-4zz18-sw0a55lm3qc009k
tordo-4zz18-fk1k7siayjg9h5a
tordo-4zz18-gzo062wzhhy8igf

$ arv --format=uuid group contents -u ce8i5-tpzed-xo2k4i24bjzwedw -f '[["uuid","is_a","arvados#collection"]]' -l 10
tordo-4zz18-dale5t56h78135h
tordo-4zz18-swrtkbdkctrk4af
tordo-4zz18-ubfy3xu734t7ilc
tordo-4zz18-imj9n6fr54r0lmn
tordo-4zz18-7ww56ff272db5ic
tordo-4zz18-2yg8wum4runsoi7
tordo-4zz18-76xymxrcvegoz87
tordo-4zz18-0uankid7xv4wwem
tordo-4zz18-sjfzx7pwqfj8jfq
tordo-4zz18-6yr13gnvrpapo2s

#12 Updated by Peter Amstutz 2 days ago

Nope I made the exact same mistake as you, it is --uuid not -u

But still getting an unexpected result:

$ arv group contents --uuid ce8i5-tpzed-xo2k4i24bjzwedw -f '[["uuid","is_a","arvados#collection"]]'
 "items_available":421922,

$ arv collection list -f '[["owner_uuid", "=", "ce8i5-tpzed-xo2k4i24bjzwedw"]]'
 "items_available":210776,

#13 Updated by Peter Amstutz 2 days ago

For me, using Workbench 2 to get the home project contents also fails.

It took me much longer to figure this out than it should, but what seems to be happening is that we're getting "Oops! Federation!"

$ echo $ARVADOS_API_HOST
tordo.arvadosapi.com

$ arv --format=uuid group contents --uuid ce8i5-tpzed-xo2k4i24bjzwedw -f '[["uuid","is_a","arvados#collection"]]'
ce8i5-4zz18-z7cuqcr29f2nwv0
ce8i5-4zz18-d1fn4x60badlxgs
ce8i5-4zz18-f1z6oh1aztmqqia
ce8i5-4zz18-mhae76913cys0kr
...

I don't know exactly what controller is doing but presumably it is looking at the ce8i5 user uuid prefix and redirecting the request to ce8i5.

The original problem, where the listing never finishes, could possibly be due to items being added between requests, so the "items_available" target keeps rising.

Possibly useful, I added "arvados.util.keyset_list_all" which uses a different paging strategy than "list_all", which is much more gentle on the database, so large listings may complete faster.

With the "keyset_list_all" approach, you can use decending order by time, so it doesn't get stuck or mess up when things are added while it is trying to list everything.

#14 Updated by Peter Amstutz 1 day ago

func (conn *Conn) GroupContents(ctx context.Context, options arvados.GroupContentsOptions) (arvados.ObjectList, error) {
    return conn.chooseBackend(options.UUID).GroupContents(ctx, options)
}

This (a) needs to special case federated users and default to local and (b) needs to accept "cluster_id" for the case where you actually do want to get the home contents of a user on a different cluster.

#15 Updated by Peter Amstutz 1 day ago

  • Assigned To changed from Ward Vandewege to Peter Amstutz

#16 Updated by Peter Amstutz 1 day ago

17589-group-contents @ 19e9af391ebd6766200bc76c9d1ee90f167e43a7

Fix the group contents endpoint so that requests for user home directory go to local. Add test.

17589-fuse-use-keyset-paging @ fc0e9f360dd51c5c508426846f2caed9f1c2ec66 (includes the previous branch)

Switch fusedir.py to use keyset_list_all() which uses a more efficient and stable paging mechanism than list_all().

https://ci.arvados.org/view/Developer/job/developer-run-tests/2470/

#17 Updated by Peter Amstutz about 17 hours ago

It seems that there's a problem in the underlying Ruby code for group contents. When we try to do a query that uses count=none and keyset paging (Go FUSE, and now arv-mount FUSE), the API server apparently OOMs and needs to be rebooted.

#18 Updated by Peter Amstutz about 17 hours ago

  1. Today I learned that group contents has an undocumented parameter called "last_object_class" which seems to provide a workaround to the paging problem where results are ordered within each object type (table) but not sorted across all results.
  2. even on arvbox, there is a dramatic request time difference between "group contents --count=exact" (150ms) and "group contents --count=none" (2500ms). This is very weird since "count=none" is intended to be an optimization.

#19 Updated by Peter Amstutz about 15 hours ago

  • Target version changed from 2021-05-12 sprint to 2021-05-26 sprint

#20 Updated by Peter Amstutz about 8 hours ago

  • Target version deleted (2021-05-26 sprint)
        # The call to object_list below will not populate :items_available in
        # its response, because count is disabled.  Save @objects length (does
        # not require another db query) so that @offset (if set) is handled
        # correctly.
        countless_items_available = @objects.length

This comment is wrong. This does does require a db query. Calling @objects.length involves doing a query of the entire table, with all the joins, and no limit.

We don't want to do this, since it is functionally the same as the count=exact behavior. It is also completely unnecessary when offset=0.

On discussion with Tom, we decided that for count=none and nonzero offset, it can just fall back on the count=exact behavior.

17589-contents-count-none @ commit:9003633889b26bdd19c1376c69981aaaa21ee819

https://ci.arvados.org/view/Developer/job/developer-run-tests/2471/

#21 Updated by Peter Amstutz about 8 hours ago

  • Target version set to 2021-05-26 sprint

#22 Updated by Ward Vandewege about 6 hours ago

Peter Amstutz wrote:

[...]

This comment is wrong. This does does require a db query. Calling @objects.length involves doing a query of the entire table, with all the joins, and no limit.

Yeah, clearly I made a mistake, it's odd, I recall testing this and looking at the database queries.

We don't want to do this, since it is functionally the same as the count=exact behavior. It is also completely unnecessary when offset=0.

On discussion with Tom, we decided that for count=none and nonzero offset, it can just fall back on the count=exact behavior.

Yeah, that seems right.

17589-contents-count-none @ commit:9003633889b26bdd19c1376c69981aaaa21ee819

LGTM. This seems to include the 17589-group-contents and 17589-fuse-use-keyset-paging branches from https://dev.arvados.org/issues/17589#note-16 so those are good to merge too.

Also available in: Atom PDF