Bug #5662

[FUSE] cd/ls sometimes takes too long on su92l

Added by Abram Connelly over 4 years ago. Updated over 4 years ago.

Status: New
Priority: Normal
Assigned To: -
Category: FUSE
Target version:
Start date: 04/03/2015
Due date:
% Done: 0%
Estimated time:
Story points: -

Description

abram@abe.su92l:~/keepid$ time cd c1f9f6dd5f0721447abb805f940bbcb7+604162+K@ant

real    35m1.697s
user    0m0.016s
sys     0m0.000s

Related issues

Related to Arvados - Bug #5713: [FUSE] File access sometimes takes too long on su92l (New, 04/13/2015)

Blocked by Arvados - Bug #5748: [Keep] sometimes, Keep sucks up 100% cpu and becomes really slow after a while (Resolved, 05/06/2015)

History

#1 Updated by Brett Smith over 4 years ago

Abram,

After our deployment last week, I'm able to run the same command from a fresh mount in ~5 seconds. Can you please try again and let me know if the issue persists for you? Thanks in advance.

#2 Updated by Brett Smith over 4 years ago

  • Subject changed from changing directory into keep mount sometimes takes a very long time to [FUSE] changing directory into keep mount sometimes takes a very long time
  • Category set to FUSE

For future readers, the manifest in question is not noticeably evil or anything. The most remarkable thing about it is that every locator in it has the hint +K@ant. Any time KeepClient tries to get a locator with that hint, it will waste time making a doomed request first, because keep.ant.arvadosapi.com doesn't exist. But since arv-mount only has to get the manifest text to handle the cd, I wouldn't expect that to cause this sort of pathological timing.
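
To make the hint behavior concrete, here is a rough sketch of how a +K@ant hint could put a doomed request ahead of the useful ones. This is not KeepClient's actual code: parse_locator, hinted_hosts, probe_order, and the local host name keep.example.invalid are hypothetical names made up for illustration. The only facts taken from this ticket are the locator itself and the keep.ant.arvadosapi.com host name.

```python
import re

# Hypothetical pattern: 32 hex digits, "+size", then zero or more "+hint" parts.
LOCATOR_RE = re.compile(r'^([0-9a-f]{32})\+(\d+)((?:\+[^+]+)*)$')


def parse_locator(locator):
    """Split a locator string into (md5 hash, size, list of hints)."""
    m = LOCATOR_RE.match(locator)
    if m is None:
        raise ValueError("not a Keep locator: %r" % locator)
    md5, size, rest = m.groups()
    # rest looks like "+K@ant+..." or is empty; drop the leading empty piece.
    hints = rest.split('+')[1:] if rest else []
    return md5, int(size), hints


def hinted_hosts(hints):
    """Map each K@<name> hint to the host it would point at (assumed naming)."""
    return ["keep.%s.arvadosapi.com" % h[2:] for h in hints if h.startswith("K@")]


def probe_order(locator, local_hosts):
    """Hosts to try, in order: hinted hosts first, then the local services."""
    _, _, hints = parse_locator(locator)
    return hinted_hosts(hints) + list(local_hosts)


if __name__ == "__main__":
    loc = "c1f9f6dd5f0721447abb805f940bbcb7+604162+K@ant"
    # keep.example.invalid stands in for whatever local Keep service exists.
    for host in probe_order(loc, ["keep.example.invalid"]):
        print(host)
```

With an ordering like this, every block fetch pays for one failed lookup of the hinted host before reaching a service that can answer. That adds up over a whole collection, but it only applies to block reads, not to fetching the manifest text for a cd.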

#3 Updated by Brett Smith over 4 years ago

  • Subject changed from [FUSE] changing directory into keep mount sometimes takes a very long time to [FUSE] cd/ls sometimes takes too long on su92l

#4 Updated by Tom Clegg over 4 years ago

This is still occurring today, and isn't explained by "fuse gets too big and starts swapping" (it happens when FUSE isn't swapping).

#5 Updated by Brett Smith over 4 years ago

  • Target version changed from Bug Triage to Arvados Future Sprints

Our current thinking is that #5748 is the main culprit here. We're very interested to hear how things look after the next deploy.
