Project

General

Profile

Actions

Bug #3644

closed

[SDK] Fix arv-mount --tags and --groups modes to handle "home" project and #3036 collections changes.

Added by Peter Amstutz over 9 years ago. Updated over 9 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
-
Target version:
Story points:
1.0

Description

See notes from Brett.

Possible hierarchy:

ls mnt/AllCollections/  all readable collections (listed by uuid)
ls mnt/Home/            collections and projects in current user's Home project
ls mnt/Shared/          all readable projects whose owner is not readable by the current user (just like the tree shown in workbench)

Subtasks 3 (0 open3 closed)

Task #3804: Review 3644-arv-mount-projectsResolvedPeter Amstutz08/28/2014Actions
Task #3767: Get feedback from Abram and Sasha on proposed directory hierarchyResolvedPeter Amstutz08/28/2014Actions
Task #3728: Handle "home" and name fields correctly.ResolvedPeter Amstutz08/28/2014Actions

Related issues

Related to Arvados - Bug #3098: arv-mount refers to projects as "groups", doesn't mount the root group, doesn't display names of items correctly.Rejected07/18/2014Actions
Actions #1

Updated by Tom Clegg over 9 years ago

  • Target version set to Arvados Future Sprints
Actions #2

Updated by Tom Clegg over 9 years ago

  • Story points set to 1.0
Actions #3

Updated by Ward Vandewege over 9 years ago

  • Subject changed from Fix arv-mount --tags and --groups modes to handle "home" project and #3036 collections changes. to [SDK] Fix arv-mount --tags and --groups modes to handle "home" project and #3036 collections changes.
Actions #4

Updated by Ward Vandewege over 9 years ago

  • Target version changed from Arvados Future Sprints to 2014-09-17 sprint
Actions #5

Updated by Peter Amstutz over 9 years ago

  • Assigned To set to Peter Amstutz
Actions #6

Updated by Peter Amstutz over 9 years ago

  • Status changed from New to In Progress
Actions #7

Updated by Tom Clegg over 9 years ago

  • Description updated (diff)
Actions #8

Updated by Tim Pierce over 9 years ago

Reviewing at 9479599:

I want to clarify that I understand the thread safety concerns here. Because llfuse ensures that request handlers do not run concurrently, it's not necessary to explicitly protect the code against thread contention -- however, because calls to the API server might cause all llfuse clients to block unnecessarily, each arvados_fuse client explicitly releases the lock when it's waiting on a response from the API server (which is also why each thread gets its own API client).

Is that an accurate summary? The locking protocol was very puzzling until I read the arvados_fuse.Operations doc comment, which seemed to indicate that something like the above is what's happening.

services/fuse/arvados_fuse/__init__.py:
  • The Unix ctime field of an inode isn't the file's creation time, it's the time the metadata (i.e. the inode) was last changed. I think this should probably just be equal to the collection modification time.
  • CollectionDirectory
    • It looks as though a CollectionDirectory will never refresh itself once it's been instantiated: the update() method checks self.collection_object is not None and I don't see any place where this field gets reset to none. Is that intentional?
  • TagsDirectory, SharedDirectory
    • There's code here to invalidate the directory upon any event involving a Link, commented out. Is this unfinished business or code that needs to be removed?
  • FreshBase
    • I don't think it's necessary for FreshBase to provide ctime and mtime methods if they aren't used in the base class. In a way it would be better if they're not provided at all, so we have clear feedback if a derived class fails to implement the method and tries to call it anyway.
Actions #9

Updated by Peter Amstutz over 9 years ago

Tim Pierce wrote:

Reviewing at 9479599:

I want to clarify that I understand the thread safety concerns here. Because llfuse ensures that request handlers do not run concurrently, it's not necessary to explicitly protect the code against thread contention -- however, because calls to the API server might cause all llfuse clients to block unnecessarily, each arvados_fuse client explicitly releases the lock when it's waiting on a response from the API server (which is also why each thread gets its own API client).

Is that an accurate summary? The locking protocol was very puzzling until I read the arvados_fuse.Operations doc comment, which seemed to indicate that something like the above is what's happening.

That's exactly right.

services/fuse/arvados_fuse/__init__.py:
  • The Unix ctime field of an inode isn't the file's creation time, it's the time the metadata (i.e. the inode) was last changed. I think this should probably just be equal to the collection modification time.

Man! That's annoying. Fixed so ctime == mtime. Also added atime().

  • CollectionDirectory
    • It looks as though a CollectionDirectory will never refresh itself once it's been instantiated: the update() method checks self.collection_object is not None and I don't see any place where this field gets reset to none. Is that intentional?

You're missing the second half of that conditional:

if self.collection_object is not None and portable_data_hash_pattern.match(self.collection_locator):
    return True

If the collection_locator is a portable data hash, it's guaranteed to never change its contents. If the collection locator is a uuid, it is possible that the manifest associated with the collection was changed by the user, so then in proceeds to checks to see if the portable_data_hash associated with the collection record has changed.

  • TagsDirectory, SharedDirectory
    • There's code here to invalidate the directory upon any event involving a Link, commented out. Is this unfinished business or code that needs to be removed?

I took this out. This was using websockets, but I was running into problems where it would just hang on connecting. I want to have a better change detection and invalidation scheme than "refresh if the object hasn't been refreshed in the last 60 seconds" but I think that's out of scope for this story.

  • FreshBase
    • I don't think it's necessary for FreshBase to provide ctime and mtime methods if they aren't used in the base class. In a way it would be better if they're not provided at all, so we have clear feedback if a derived class fails to implement the method and tries to call it anyway.

Removed mtime() and ctime() and added atime() (which is used by the base class.)

Actions #10

Updated by Tim Pierce over 9 years ago

Peter Amstutz wrote:

Tim Pierce wrote:

Reviewing at 9479599:

I want to clarify that I understand the thread safety concerns here. Because llfuse ensures that request handlers do not run concurrently, it's not necessary to explicitly protect the code against thread contention -- however, because calls to the API server might cause all llfuse clients to block unnecessarily, each arvados_fuse client explicitly releases the lock when it's waiting on a response from the API server (which is also why each thread gets its own API client).

Is that an accurate summary? The locking protocol was very puzzling until I read the arvados_fuse.Operations doc comment, which seemed to indicate that something like the above is what's happening.

That's exactly right.

OK. It's clever. I don't see any red flags in the thread locking, but I don't think I understand why it's necessary for RecursiveInvalidateDirectory.invalidate to explicitly acquire and release the llfuse lock?

services/fuse/arvados_fuse/__init__.py:
  • The Unix ctime field of an inode isn't the file's creation time, it's the time the metadata (i.e. the inode) was last changed. I think this should probably just be equal to the collection modification time.

Man! That's annoying. Fixed so ctime == mtime. Also added atime().

Nice, thanks.

  • CollectionDirectory
    • It looks as though a CollectionDirectory will never refresh itself once it's been instantiated: the update() method checks self.collection_object is not None and I don't see any place where this field gets reset to none. Is that intentional?

You're missing the second half of that conditional:

[...]

If the collection_locator is a portable data hash, it's guaranteed to never change its contents. If the collection locator is a uuid, it is possible that the manifest associated with the collection was changed by the user, so then in proceeds to checks to see if the portable_data_hash associated with the collection record has changed.

Ah, of course! Thanks for clarifying.

  • TagsDirectory, SharedDirectory
    • There's code here to invalidate the directory upon any event involving a Link, commented out. Is this unfinished business or code that needs to be removed?

I took this out. This was using websockets, but I was running into problems where it would just hang on connecting. I want to have a better change detection and invalidation scheme than "refresh if the object hasn't been refreshed in the last 60 seconds" but I think that's out of scope for this story.

Makes sense. Do we have a story for improving arv-mount/arvados_fuse invalidation logic?

Actions #11

Updated by Peter Amstutz over 9 years ago

Tim Pierce wrote:

OK. It's clever. I don't see any red flags in the thread locking, but I don't think I understand why it's necessary for RecursiveInvalidateDirectory.invalidate to explicitly acquire and release the llfuse lock?

Because that is intended to be called from a thread separate from llfuse which is listening on websockets.

Makes sense. Do we have a story for improving arv-mount/arvados_fuse invalidation logic?

#3833

Fuse tests should pass now.

Actions #12

Updated by Anonymous over 9 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 67 to 100

Applied in changeset arvados|commit:50e48820312868513b6efcafa84c6d825ac51cd4.

Actions

Also available in: Atom PDF