Story #1904

User can get a no-auth-required link to an Arvados object, i.e., turn on "anyone with the link can view" permission

Added by Tom Clegg almost 8 years ago. Updated over 7 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Brett Smith
Category:
-
Start date:
04/18/2014
Due date:
% Done:

100%

Estimated time:
(Total: 48.00 h)
Story points:
3.0
Release relationship:
Auto

Description

Proposed approach
  • Owner creates a token scoped to the object being shared. (Semantics of token scopes might need to be clarified: How do you say "read only" here?)
  • Use something like the existing ?api_token= behavior to embed the token into the "link to share", but at least use a different name. Using a "share" link shouldn't interfere with a user's real login session.
  • Propagate the token given in the URL to the "download" links on the collections#show page, so those links can be copied to a wget command line.

Subtasks

Task #2664: Workbench can manage no-auth URLs to access a specific objectClosedBrett Smith

Task #2702: API server accepts a supplemental list of API tokens for additional read-only permissionsResolvedBrett Smith

Task #2663: Workbench supports a "temporary API token" query parameter everywhereResolvedBrett Smith

Task #2687: Review 1904-workbench-temp-token-wipResolvedBrett Smith

Task #2743: Review 1904-wget-collections-wipResolvedPeter Amstutz

Task #2682: Review 1904-object-scopes-wipResolvedPeter Amstutz

Task #2666: Workbench has an interface to share collections using reader tokensResolvedPeter Amstutz

Task #2662: Workbench can get an API token, good for reading only one objectResolvedBrett Smith

Task #2707: Review 1904-api-reader-tokens-wipResolvedTom Clegg

Task #2736: Review 1904-workbench-reader-tokens-wipResolvedTom Clegg

Associated revisions

Revision d2de8071 (diff)
Added by Brett Smith over 7 years ago

api: Simplify API token scope checks.

The previous code was organized with a slightly different idea of how
scopes would be codified. Now that we know they're a check against
the request string, we can simplify a bit to reduce clutter.

Refs #1904.

Revision ab5c781a (diff)
Added by Brett Smith over 7 years ago

api: Improve API token scope tests.

Most of these changes clarify the purpose of the tests for readers.
There's one minor new assertion.

Refs #1904.

Revision 9f7a232b (diff)
Added by Brett Smith over 7 years ago

api: Support filters in API client auths index.

Per comments on Refs #1904. filters is generally the preferred way to
do searching now. I maintained existing limits on what can be
searched with this method.

Revision c00bae55 (diff)
Added by Brett Smith over 7 years ago

api: Style improvements in auth controller.

Refs #1904.

Revision 0eb59e3a (diff)
Added by Brett Smith over 7 years ago

api: Introduce path-based API token scopes.

Refs #1904, #2662 for background discussion.

Revision 5c2758ca (diff)
Added by Brett Smith over 7 years ago

api: Support filters in API client auths index.

Per comments on Refs #1904. filters is generally the preferred way to
do searching now. I maintained existing limits on what can be
searched with this method.

Revision 8086f73a (diff)
Added by Brett Smith over 7 years ago

api: Introduce path-based API token scopes.

Refs #1904, #2662 for background discussion.

Revision 7e8f9955 (diff)
Added by Brett Smith over 7 years ago

api: Support filters in API client auths index.

Per comments on Refs #1904. filters is generally the preferred way to
do searching now. I maintained existing limits on what can be
searched with this method.

Revision 86e039fa (diff)
Added by Brett Smith over 7 years ago

api: Only load reader tokens in scope.

A security improvement suggested by review. Refs #1904.

Revision 8dbf8bd4
Added by Brett Smith over 7 years ago

Merge branch '1904-workbench-reader-tokens'

Refs #1904. Closes #2663, #2736.

Revision 5c8db6d4 (diff)
Added by Brett Smith over 7 years ago

workbench: Make Keep file EPERM test stricter.

Refs #1904. Tom said that we want the security benefits of a 404
result, so make sure that's exactly what we get.

Revision 928bb1ab (diff)
Added by Brett Smith over 7 years ago

1904: Make Keep file EPERM test stricter.

Refs #1904. Tom said that we want the security benefits of a 404
result, so make sure that's exactly what we get.

Revision f876a322 (diff)
Added by Brett Smith over 7 years ago

1904: Make Keep file EPERM test stricter.

Refs #1904. Tom said that we want the security benefits of a 404
result, so make sure that's exactly what we get.

History

#1 Updated by Tom Clegg over 7 years ago

  • Target version set to 2014-05-07 Storing and Organizing Data

#2 Updated by Tom Clegg over 7 years ago

  • Story points set to 2.0

#3 Updated by Tom Clegg over 7 years ago

  • Release set to 8
  • Target version deleted (2014-05-07 Storing and Organizing Data)

#4 Updated by Tom Clegg over 7 years ago

  • Target version set to 2014-05-07 Storing and Organizing Data

#5 Updated by Tom Clegg over 7 years ago

  • Description updated (diff)

#6 Updated by Tom Clegg over 7 years ago

  • Story points changed from 2.0 to 3.0

#7 Updated by Brett Smith over 7 years ago

  • Assigned To set to Brett Smith

#8 Updated by Tom Clegg over 7 years ago

  • Subject changed from User can get a public permalink to an Arvados object, i.e., turn on "anyone with the link can view" permission to User can get a no-auth-required link to an Arvados object, i.e., turn on "anyone with the link can view" permission
  • Description updated (diff)

#9 Updated by Peter Amstutz over 7 years ago

Comments on 2fc6dde 1904-object-scopes-wip

  1. current_api_client_auth_has_scope is a little hard to follow. Suggestions:
    • Does ok_scopes need to be an array? As I understand it, this is the path the scopes are being tested against.
    • Rename 'ok_scopes' to a more descriptive name such as 'request_path'
    • Rename block parameter 'scope' to a more descriptive name such as 'token_scope'
    • If ok_scopes/request_path is not an array, the logic inside block is a little bit simpler.
  2. What's the effect of "https!" in #request_with_auth? Add comment
  3. test "user list token can only list users" -> What happens with a request path of '/arvados/v1/users/'?
  4. test "multiple scopes" -> This test seems to be testing more than just the "multiple scopes" title would suggest. Might be better to break it out into a test more narrowly focused on testing multiple scopes, and a second test related to manipulating scopes.
  5. In the API server, having access to the index basically means having access to all the objects in that table that are otherwise readable. This seems to make a mockery of the /arvados/v1/foo vs. /arvados/v1/foo/ distinction we were trying to draw the other day.

#10 Updated by Brett Smith over 7 years ago

Peter Amstutz wrote:

Comments on 2fc6dde 1904-object-scopes-wip

  1. current_api_client_auth_has_scope is a little hard to follow.

Refactored mostly as you suggested. Also cleaned up the now-obsolete distinction between require_auth_scope and require_auth_scope_all.

  1. What's the effect of "https!" in #request_with_auth? Add comment

HTTPS requests were needed when these were controller tests (not 100% sure why) but seem to be unnecessary for integration tests. Removed.

  1. test "user list token can only list users" -> What happens with a request path of '/arvados/v1/users/'?

Added a test for this. It passes.

  1. test "multiple scopes" -> This test seems to be testing more than just the "multiple scopes" title would suggest. Might be better to break it out into a test more narrowly focused on testing multiple scopes, and a second test related to manipulating scopes.

I think I see what you're getting at. My feeling is that the extra code is just checking that the POST request actually did go through—not that we care about the particulars of how the new token was created, etc.—and that any test of a POST scope would probably look similar. I made some changes to try to help clarify the goals here. Do they alleviate your concerns?

  1. In the API server, having access to the index basically means having access to all the objects in that table that are otherwise readable. This seems to make a mockery of the /arvados/v1/foo vs. /arvados/v1/foo/ distinction we were trying to draw the other day.

Good point. I think this needs further discussion in the team.

#11 Updated by Tom Clegg over 7 years ago

Comments on 1904-workbench-temp-token-wip @ 4425dbc002ec66aa18a6769d9c1aba46c8d30586

This behavior seems slightly surprising: when I'm using a "ticket", I don't get to use the privileges or context cues that I would normally have by virtue of being logged in. E.g., the "Logged in as " item in the "account stuff" drop-down will not show my email address, and any clever "should I offer to do this?" code in Workbench would conclude I'm unable to add a bookmark/tag to this shared item.

Is this really what we want, or would it be more desirable to restrict the use of the per-request "ticket" to the "get the requested collection" transaction, for example, and use the usual session token for all the other auxiliary things that happen in Workbench? (Setting aside for a moment the question of how this would be implemented...)

For the purpose of this story, it isn't necessary to support the "anonymous" case, i.e., it's OK if you need to go through the login process and accept a user agreement before seeing the shared item. (We can wait for the more general "anonymous browsing" story to make the login procedure optional.)

#12 Updated by Brett Smith over 7 years ago

Brett Smith wrote:

  1. In the API server, having access to the index basically means having access to all the objects in that table that are otherwise readable. This seems to make a mockery of the /arvados/v1/foo vs. /arvados/v1/foo/ distinction we were trying to draw the other day.

Good point. I think this needs further discussion in the team.

This was discussed and the general feeling seems to be that we should go ahead with merging this version. So Peter, 1904-object-scopes-wip is back to you for another round of review.

#13 Updated by Brett Smith over 7 years ago

Tom Clegg wrote:

Comments on 1904-workbench-temp-token-wip @ 4425dbc002ec66aa18a6769d9c1aba46c8d30586

This behavior seems slightly surprising: when I'm using a "ticket", I don't get to use the privileges or context cues that I would normally have by virtue of being logged in. E.g., the "Logged in as " item in the "account stuff" drop-down will not show my email address, and any clever "should I offer to do this?" code in Workbench would conclude I'm unable to add a bookmark/tag to this shared item.

Is this really what we want, or would it be more desirable to restrict the use of the per-request "ticket" to the "get the requested collection" transaction, for example, and use the usual session token for all the other auxiliary things that happen in Workbench? (Setting aside for a moment the question of how this would be implemented...)

I think that's fair, and I don't think it would be difficult to implement. I can go ahead and work toward this.

For the purpose of this story, it isn't necessary to support the "anonymous" case, i.e., it's OK if you need to go through the login process and accept a user agreement before seeing the shared item. (We can wait for the more general "anonymous browsing" story to make the login procedure optional.)

Wouldn't that break the "give someone a Collection link that they can feed to wget -r" case?

#14 Updated by Tom Clegg over 7 years ago

Brett Smith wrote:

For the purpose of this story, it isn't necessary to support the "anonymous" case, i.e., it's OK if you need to go through the login process and accept a user agreement before seeing the shared item. (We can wait for the more general "anonymous browsing" story to make the login procedure optional.)

Wouldn't that break the "give someone a Collection link that they can feed to wget -r" case?

Good point.

Perhaps "use api_ticket for everything if no api_token provided" at least for requests like "download a file" that don't do a bunch of other context-sensitive API transactions anyway.

It occurs to me that "wget -r .../collections/uuid" (full blown workbench page) would not be a great way to download. I suppose it would be nice to provide a very plain directory listing at /collections/uuid/. (And that would be another action in the special class: "if api_ticket is given, don't require api_token because the only API transactions we're going to do are the ones that use the ticket anyway".)

#15 Updated by Brett Smith over 7 years ago

Discussed temporary tokens in IRC with Tom some more. We want to explore the possibility of modifying the API server to let clients send along zero or more tokens. Then Workbench can accept those in a parameter and pass them along. Currently investigating what that will take.

#16 Updated by Peter Amstutz over 7 years ago

Reviewing deb9df3

Stupid style pedantry:

  1. api_client_authorizations_controller.rb:
    unless @where['scopes'].nil?
    

    Double-negative that made me double-take, should be
    if @where['scopes']
    
  2. In test "admin search filters where scopes exactly match" we're trying to deprecate 'where' in favor of 'filters' for searching.

Looks good otherwise.

#17 Updated by Brett Smith over 7 years ago

Peter Amstutz wrote:

  1. In test "admin search filters where scopes exactly match" we're trying to deprecate 'where' in favor of 'filters' for searching.

I implemented support for filters in ApiClientAuthorizationsController#find_objects_for_index, maintaining the existing limits on what's searchable. This also obsoleted the style point you brought up. Ready for another review. Thanks.

#18 Updated by Peter Amstutz over 7 years ago

Revision 9f7a232
A couple more style points. Being clear is better than clever (especially since Ruby affords infinite potential for writing unreadable clever code.)

1. .andand is not part of the core language and makes me have to think harder about what the code is doing, especially since you're calling a mutator function with #select!

    if @filters
      wanted_scopes.concat(@filters.map { |attr, operator, operand|
        ((attr == 'scopes') and (operator == '=')) ? operand : nil
      })
    end
    @where.andand.select! { |attr, val| attr == 'uuid' }
    @filters.andand.select! { |attr, operator, operand|
      (attr == 'uuid') and (operator == '=')
    }

Would be more clearer with regular old boring 'if'

    if @filters
      wanted_scopes.concat(@filters.map { |attr, operator, operand|
        ((attr == 'scopes') and (operator == '=')) ? operand : nil
      })
      @filters.select! { |attr, operator, operand|
        (attr == 'uuid') and (operator == '=')
      }
    end
    @where.select! { |attr, val| attr == 'uuid' } if @where

2. Why (auth.scopes & scope_list) (auth.scopes | scope_list)? This seems like a convoluted way to say (auth.scopes scope_list) but I'm sure you have a good reason (does & and | implicitly sort the list?). Add a comment.

#19 Updated by Tom Clegg over 7 years ago

Notes, looking at 5902a68d3f8672c38949d4a57f12435f40923a23

You don't need to "include ArvadosTestSupport" again in IntegrationTest -- IntegrationTest is a subclass of TestCase.

I'd like an integration test that params[:reader_tokens] can be supplied as JSON. (You might need an #accept_param_as_json along the lines of #accept_attribute_as_json?)

I'd like to see a test for "ticket created by user A scoped to single object X + token created by user B with scope=all + request object Y which is readable by A but not by B = 404.". From my reading it looks like user B could get all of A's objects that way.

(I feel like I should stop here in case that throws a wrench into the whole works...)

#20 Updated by Brett Smith over 7 years ago

Tom Clegg wrote:

Notes, looking at 5902a68d3f8672c38949d4a57f12435f40923a23

All good suggestions. Pushed all these changes.

I'd like to see a test for "ticket created by user A scoped to single object X + token created by user B with scope=all + request object Y which is readable by A but not by B = 404.". From my reading it looks like user B could get all of A's objects that way.

(I feel like I should stop here in case that throws a wrench into the whole works...)

It did require a little retooling, but not as much as you might've feared. And I think the changes actually make for cleaner code in the end.

#21 Updated by Tom Clegg over 7 years ago

Looks great.

The only nitpick I found is that admins can't use their is_admin status to create tokens to arbitrary objects/scopes. That's not necessarily a bad thing, as long as the "create link to share" UI is aware of that limitation. The shortcoming I actually ran into is that "index" doesn't include individually scoped objects. But I don't think that's a problem.

So I think you should merge.

#22 Updated by Tom Clegg over 7 years ago

At f7dc278

Looks great.

(As for the "no permission -> 404" comment in the commit message, I think this really is the best result. Saying 403 instead can be a real information leak: "Yes, now that you mention it, somebody at this site has indeed been working with that dataset, but nobody has suggested you should ever know anything about that.")

#23 Updated by Brett Smith over 7 years ago

  • Status changed from New to Resolved

This story should've been multiple stories. Candidates include:

  • Specify and implement scoped API tokens.
  • API server accepts additional tokens for read-only operations.
  • Workbench accepts additional tokens for reading collections.
  • Workbench can manage a read-only token for a collection, for public sharing.
  • A user with a read-only token can wget -r a collection. (This is not insignificant because you have to worry about robots.txt, making the output useful to wget, and maybe other things).

The first two bullets were done this sprint. We made progress on all the others, too (refer to the tasks), but only Workbench reader tokens landed in master, and further testing has shown it's buggy. We will revisit the remaining stories next sprint, individually.

Also available in: Atom PDF