Bug #3644

[SDK] Fix arv-mount --tags and --groups modes to handle "home" project and #3036 collections changes.

Added by Peter Amstutz over 6 years ago. Updated over 6 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
-
Target version:
Start date:
08/28/2014
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
1.0

Description

See notes from Brett.

Possible hierarchy:

ls mnt/AllCollections/  all readable collections (listed by uuid)
ls mnt/Home/            collections and projects in current user's Home project
ls mnt/Shared/          all readable projects whose owner is not readable by the current user (just like the tree shown in workbench)

Subtasks

Task #3804: Review 3644-arv-mount-projectsResolvedPeter Amstutz

Task #3767: Get feedback from Abram and Sasha on proposed directory hierarchyResolvedPeter Amstutz

Task #3728: Handle "home" and name fields correctly.ResolvedPeter Amstutz


Related issues

Related to Arvados - Bug #3098: arv-mount refers to projects as "groups", doesn't mount the root group, doesn't display names of items correctly.Rejected07/18/2014

Associated revisions

Revision 50e48820
Added by Peter Amstutz over 6 years ago

Merge branch '3644-arv-mount-projects' closes #3644

History

#1 Updated by Tom Clegg over 6 years ago

  • Target version set to Arvados Future Sprints

#2 Updated by Tom Clegg over 6 years ago

  • Story points set to 1.0

#3 Updated by Ward Vandewege over 6 years ago

  • Subject changed from Fix arv-mount --tags and --groups modes to handle "home" project and #3036 collections changes. to [SDK] Fix arv-mount --tags and --groups modes to handle "home" project and #3036 collections changes.

#4 Updated by Ward Vandewege over 6 years ago

  • Target version changed from Arvados Future Sprints to 2014-09-17 sprint

#5 Updated by Peter Amstutz over 6 years ago

  • Assigned To set to Peter Amstutz

#6 Updated by Peter Amstutz over 6 years ago

  • Status changed from New to In Progress

#7 Updated by Tom Clegg over 6 years ago

  • Description updated (diff)

#8 Updated by Tim Pierce over 6 years ago

Reviewing at 9479599:

I want to clarify that I understand the thread safety concerns here. Because llfuse ensures that request handlers do not run concurrently, it's not necessary to explicitly protect the code against thread contention -- however, because calls to the API server might cause all llfuse clients to block unnecessarily, each arvados_fuse client explicitly releases the lock when it's waiting on a response from the API server (which is also why each thread gets its own API client).

Is that an accurate summary? The locking protocol was very puzzling until I read the arvados_fuse.Operations doc comment, which seemed to indicate that something like the above is what's happening.

services/fuse/arvados_fuse/__init__.py:
  • The Unix ctime field of an inode isn't the file's creation time, it's the time the metadata (i.e. the inode) was last changed. I think this should probably just be equal to the collection modification time.
  • CollectionDirectory
    • It looks as though a CollectionDirectory will never refresh itself once it's been instantiated: the update() method checks self.collection_object is not None and I don't see any place where this field gets reset to none. Is that intentional?
  • TagsDirectory, SharedDirectory
    • There's code here to invalidate the directory upon any event involving a Link, commented out. Is this unfinished business or code that needs to be removed?
  • FreshBase
    • I don't think it's necessary for FreshBase to provide ctime and mtime methods if they aren't used in the base class. In a way it would be better if they're not provided at all, so we have clear feedback if a derived class fails to implement the method and tries to call it anyway.

#9 Updated by Peter Amstutz over 6 years ago

Tim Pierce wrote:

Reviewing at 9479599:

I want to clarify that I understand the thread safety concerns here. Because llfuse ensures that request handlers do not run concurrently, it's not necessary to explicitly protect the code against thread contention -- however, because calls to the API server might cause all llfuse clients to block unnecessarily, each arvados_fuse client explicitly releases the lock when it's waiting on a response from the API server (which is also why each thread gets its own API client).

Is that an accurate summary? The locking protocol was very puzzling until I read the arvados_fuse.Operations doc comment, which seemed to indicate that something like the above is what's happening.

That's exactly right.

services/fuse/arvados_fuse/__init__.py:
  • The Unix ctime field of an inode isn't the file's creation time, it's the time the metadata (i.e. the inode) was last changed. I think this should probably just be equal to the collection modification time.

Man! That's annoying. Fixed so ctime == mtime. Also added atime().

  • CollectionDirectory
    • It looks as though a CollectionDirectory will never refresh itself once it's been instantiated: the update() method checks self.collection_object is not None and I don't see any place where this field gets reset to none. Is that intentional?

You're missing the second half of that conditional:

if self.collection_object is not None and portable_data_hash_pattern.match(self.collection_locator):
    return True

If the collection_locator is a portable data hash, it's guaranteed to never change its contents. If the collection locator is a uuid, it is possible that the manifest associated with the collection was changed by the user, so then in proceeds to checks to see if the portable_data_hash associated with the collection record has changed.

  • TagsDirectory, SharedDirectory
    • There's code here to invalidate the directory upon any event involving a Link, commented out. Is this unfinished business or code that needs to be removed?

I took this out. This was using websockets, but I was running into problems where it would just hang on connecting. I want to have a better change detection and invalidation scheme than "refresh if the object hasn't been refreshed in the last 60 seconds" but I think that's out of scope for this story.

  • FreshBase
    • I don't think it's necessary for FreshBase to provide ctime and mtime methods if they aren't used in the base class. In a way it would be better if they're not provided at all, so we have clear feedback if a derived class fails to implement the method and tries to call it anyway.

Removed mtime() and ctime() and added atime() (which is used by the base class.)

#10 Updated by Tim Pierce over 6 years ago

Peter Amstutz wrote:

Tim Pierce wrote:

Reviewing at 9479599:

I want to clarify that I understand the thread safety concerns here. Because llfuse ensures that request handlers do not run concurrently, it's not necessary to explicitly protect the code against thread contention -- however, because calls to the API server might cause all llfuse clients to block unnecessarily, each arvados_fuse client explicitly releases the lock when it's waiting on a response from the API server (which is also why each thread gets its own API client).

Is that an accurate summary? The locking protocol was very puzzling until I read the arvados_fuse.Operations doc comment, which seemed to indicate that something like the above is what's happening.

That's exactly right.

OK. It's clever. I don't see any red flags in the thread locking, but I don't think I understand why it's necessary for RecursiveInvalidateDirectory.invalidate to explicitly acquire and release the llfuse lock?

services/fuse/arvados_fuse/__init__.py:
  • The Unix ctime field of an inode isn't the file's creation time, it's the time the metadata (i.e. the inode) was last changed. I think this should probably just be equal to the collection modification time.

Man! That's annoying. Fixed so ctime == mtime. Also added atime().

Nice, thanks.

  • CollectionDirectory
    • It looks as though a CollectionDirectory will never refresh itself once it's been instantiated: the update() method checks self.collection_object is not None and I don't see any place where this field gets reset to none. Is that intentional?

You're missing the second half of that conditional:

[...]

If the collection_locator is a portable data hash, it's guaranteed to never change its contents. If the collection locator is a uuid, it is possible that the manifest associated with the collection was changed by the user, so then in proceeds to checks to see if the portable_data_hash associated with the collection record has changed.

Ah, of course! Thanks for clarifying.

  • TagsDirectory, SharedDirectory
    • There's code here to invalidate the directory upon any event involving a Link, commented out. Is this unfinished business or code that needs to be removed?

I took this out. This was using websockets, but I was running into problems where it would just hang on connecting. I want to have a better change detection and invalidation scheme than "refresh if the object hasn't been refreshed in the last 60 seconds" but I think that's out of scope for this story.

Makes sense. Do we have a story for improving arv-mount/arvados_fuse invalidation logic?

#11 Updated by Peter Amstutz over 6 years ago

Tim Pierce wrote:

OK. It's clever. I don't see any red flags in the thread locking, but I don't think I understand why it's necessary for RecursiveInvalidateDirectory.invalidate to explicitly acquire and release the llfuse lock?

Because that is intended to be called from a thread separate from llfuse which is listening on websockets.

Makes sense. Do we have a story for improving arv-mount/arvados_fuse invalidation logic?

#3833

Fuse tests should pass now.

#12 Updated by Anonymous over 6 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 67 to 100

Applied in changeset arvados|commit:50e48820312868513b6efcafa84c6d825ac51cd4.

Also available in: Atom PDF