Story #3781

[DRAFT] [Workbench] Web-based file upload

Added by Peter Amstutz about 6 years ago. Updated almost 6 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
Workbench
Target version:
Start date:
11/24/2014
Due date:
% Done:

100%

Estimated time:
(Total: 9.00 h)
Story points:
3.0

Description

Provide a web-based way to upload files to a Keep collection.

Users who are nontechnical, unable to install arv-put for technical reasons, or simply want to upload a small number of files without fussing around with the command line would benefit from being able to do so through the browser.

Use some combination of http://www.dropzonejs.com/ and https://github.com/curoverse/longupload to upload data either through Workbench or (better) directly to a Keep proxy using the HTML5 storage API to keep a resume cache.


Subtasks

Task #4657: Queue file uploads with "choose files" buttonResolvedTom Clegg

Task #4655: Enable CORS ACAO in keepproxyResolvedTom Clegg

Task #4664: Enable CORS ACAO in apiserverResolvedTom Clegg

Task #4658: Work upload queueResolvedTom Clegg

Task #4660: Generate quick file checksums, store progress in localStorageClosedTom Clegg

Task #4656: Add collections.append APIClosedTom Clegg

Task #4659: Show status and errors in queueResolvedTom Clegg

Task #4679: Review 3781-browser-friendly-serversResolvedPeter Amstutz

Task #4680: Review 3781-browser-uploadResolvedTom Clegg

Task #4734: Write testsResolvedTom Clegg


Related issues

Related to Arvados - Bug #4809: [Workbench] Code pops up for a second using web uploaderNew12/12/2014

Associated revisions

Revision cedbb7fc
Added by Tom Clegg almost 6 years ago

Merge branch '3781-browser-friendly-servers' refs #3781

Revision 8453812f
Added by Tom Clegg almost 6 years ago

Merge branch '3781-browser-upload' closes #3781

History

#1 Updated by Peter Amstutz about 6 years ago

  • Description updated (diff)
  • Category set to Workbench

#2 Updated by Ward Vandewege about 6 years ago

  • Priority changed from Normal to Low

#3 Updated by Tom Clegg about 6 years ago

  • Target version set to Arvados Future Sprints

#4 Updated by Tom Clegg about 6 years ago

  • Subject changed from [Workbench] Web-based file upload to [DRAFT] [Workbench] Web-based file upload

#5 Updated by Tom Clegg about 6 years ago

  • Story points set to 3.0

#6 Updated by Tom Clegg about 6 years ago

  • Description updated (diff)
  • Priority changed from Low to Normal

#7 Updated by Tom Clegg about 6 years ago

  • Target version changed from Arvados Future Sprints to 2014-12-10 sprint

#8 Updated by Tom Clegg about 6 years ago

  • Status changed from New to In Progress

#9 Updated by Tom Clegg about 6 years ago

Todo @ d95550d:
  • More tests for new keepproxy routes?
  • Review CORS implications
  • No CORS headers via OPTIONS under user_sessions
  • Add tests for no CORS at user_sessions
  • Upload form should not disappear after flipping tabs
  • Clean up class definitions
  • Fix Pause button
  • Start automatically after selecting files?
  • Check error reporting after fail, pause
  • Add tests for upload
    • uploads keep running while flipping tabs
    • uploads resume after pause
    • binary data not mangled
    • handle duplicate filenames
    • trash btn cancels ajax upload
    • proxy error gets reported
    • retry slice
    • zero byte file (check manifest format, progressbar)

#10 Updated by Brett Smith about 6 years ago

  • Assigned To set to Tom Clegg

#11 Updated by Tom Clegg about 6 years ago

3781-browser-friendly-servers @ f12f617 21d528d

This branch adds server-side features to keepproxy and apiserver to support browser-based upload:
  • CORS headers via GET and preflight (OPTIONS-before-PUT/POST) on resources that don't use cookie-based authentication. Without this, browsers will refuse to give JS applications any GET responses, or make any PUT/POST requests at all, unless workbench+keepproxy+apiserver are all at the same domain name.
  • keepproxy supports "POST /" -- behaves exactly like "PUT /{hash}" but, of course, can't detect data corruption between the browser and keepproxy. Without this, we'd have to compute md5 from file slice data in javascript, which is possible but (last time I checked, using a native js library) unusably slow.

#12 Updated by Tom Clegg about 6 years ago

3781-browser-upload @ 12b9d77 29b0dcc

This branch adds browser-based upload. Before I put too many finishing touches (and tests) on this I'd like some feedback on the gist of the approach, UI, implementation, and feature priorities.

How to preview

You must have a keepproxy running. In principle this could be improved to work directly with 'disk' services, but currently it uses the first keep proxy offered by apiserver (it doesn't even use the "available" API: it just filters by type because it knows it can't use disks directly).
  • Browser policy dictates that if your workbench app is https, your keepproxy must be https. I dealt with this by running a plain-http Workbench. keepproxy doesn't do ssl natively so the other way is to put an https proxy in front of your http keepproxy.
  • Your keepproxy and apiserver must have the stuff from 3781-browser-friendly-servers. (That branch is merged into 3781-browser-upload.)
  • For best coverage, point your keep servers to a small tmpfs. Find out what "full" looks like in the browser...

Every writable collection gets an Upload tab.

The "show project" page's "Add data" button is now a dropdown with two items: "copy from elsewhere" (as before) and "upload" (which makes a new collection and puts you in its Upload tab).

The "upload" tab itself should be self-explanatory. (If not, that should probably be fixed.)

Done/not-done

Done and expected to work:
  • Select multiple files
  • Upload large files
  • Pause and resume
  • Show upload progress (including mid-64MiB-block)
  • Recover from occasional errors transparently (don't give up until 3 attempts on the same block)
  • Stop/error and resume (e.g., turn off your keepproxy, then turn it back on and click resume)
  • Choose more files while the upload is in progress
  • Switch tabs while the upload is in progress
  • Don't break existing tests
Not done:
  • Test cases. Will require adding keepproxy to run-tests/run-test-servers.
  • Minimum delay between attempts on the same block. This would make the retry feature easier to test, and maybe it's a good idea in real life too?
Not done and are (IMO) not necessary to merge:
  • Drag & drop area. Probably pretty easy.
  • Start automatically after adding files. Early versions had this, hard to say whether it's a bug or a feature. Currently you have to hit "go" after selecting files.
  • Javascript unit tests with jasmine or something. This seems valuable, but it might make more sense to merge and rely on browser-driven tests for the time being.

#13 Updated by Peter Amstutz almost 6 years ago

3781-browser-friendly-servers 21d528d

  • CORS allows sites to specify what other sites they will accept cross-origin requests from. This seems to be primarily targeted at situations where the attacker is serving a page which tries to use the user's browser to access to a non-cooperating 3rd party (e.g. the user's bank account).
  • In our case, we are concerned with the security of communication between the 1st party (browser) and a cooperating 2nd party (API server or keepproxy).
  • I think instead of CORS we have a different problem: Cross-site request forgery (CSRF). We want to avoid leaking the API token to a 3rd party.
    • Right now, after log in, the token gets stored in the session.
    • A CSRF or XSS attack could enable an attacker to send the session and api token to a 3rd party server under the attacker's control which will happily provide the proper CORS headers to accept the user's credentials.
    • For the session, this probably isn't a problem because the session cookie is encrypted using workbench's "secret_token" (note: the session cookie is only encrypted by default in rails 4, in rails 3 the cookie is signed but not encrypted!).
    • However, for websockets, and presumably for API and keepproxy access, we are currently providing the API token in the clear:
      <meta name="arv-websocket-url" content="wss://ws.qr1hi.arvadosapi.com/websocket?api_token=xxxxx">
    • If we want to send the encrypted session to API server, we can probably provide a custom session Rack layer that handles Rails 4 sessions: https://stackoverflow.com/questions/1941499/given-the-session-key-and-secret-how-can-we-decrypt-rails-cookies
    • To send to keepproxy, we can decrypt it in Go: http://big-elephants.com/2014-01/handling-rails-4-sessions-with-go/ I'm not sure about
    • This requires the Rails secret_token be shared between Workbench, API and keepproxy
  • Alternately we don't have to share the session, but use some other header instead that has the encrypted API token. This still requires a shared secret between services. We also need to generate and verify the encrypted token in such a way that it can't be trivially re-used in a replay attack.

Hmmm.

#14 Updated by Tom Clegg almost 6 years ago

Background (perhaps a bit late):

With the existing code, if evil.example.com serves JavaScript application which calls $.ajax('https://zzzzz.arvadosapi.com/arvados/v1/users/current'), the browser makes a GET request but does not reveal the response to the malicious JavaScript program. If the same application calls $.post(), the browser makes an OPTIONS request at API server to check for CORS permission, but the OPTIONS request fails and the browser therefore declines to do the POST request.

With the proposed change, evil.example.com's JavaScript app will be able to make these GET and POST requests.
  • In the general case this is unsafe because AJAX requests might include cookies and zzzzz.arvadosapi.com might uses those cookies to permit actions and reveal information. Therefore, even though evil.example.com does not know the cookie credentials, it can effectively use them to perform actions and reveal information (and relay that information back to itself using another AJAX call).
  • This is purportedly safe in certain cases where those GET and POST requests can't do anything evil.example.com's web server couldn't have done just as well by itself. In such cases, zzzzz.arvadosapi.com can use the CORS pre-flight/header mechanism to instruct the browser to permit such cross-origin requests.
  • Specifically, keepproxy and most routes at API server use Authorization headers rather than cookies. In order for evil.example.com to send a token this way, it would need to know the user's token -- and in that case, it would be able to make API requests directly anyway, without the help of CORS headers.

#15 Updated by Peter Amstutz almost 6 years ago

3781-browser-friendly-servers LGTM

#16 Updated by Peter Amstutz almost 6 years ago

Also the CSRF issue is now #4690

#17 Updated by Tom Clegg almost 6 years ago

For posterity, bugfix 11bfce5 misfiled as 4533

#18 Updated by Brett Smith almost 6 years ago

Reviewing 29b0dcc

Interface

  • Two points on the "Add data" button:
    • The pulldown goes off the right edge of my browser, prompting a horizontal scrollbar. Obviously this is dependent on a lot of my setup, but it seems like we'll almost always want it to veer left, like the account pulldown. Is it possible to hint that?
    • This is a very soft comment, I admit, but the button just looks out of place, being a white pulldown next to two blue buttons. It's a little easy to miss (camouflaged with the background!) and no other page has a pulldown there. Is there another presentation that might make more sense?
  • Reporting sizes as "4 KiB" rather than "4K" would be more consistent with the rest of Workbench. Plus "0K" can be mistaken for the word "OK."
  • After I successfully upload files, the Start button is enabled, even though there's nothing to start.

Code substance

  • I'm concerned that uniqueNameForManifest won't scale well with large manifests, primarily because it splits the string into a (potentially large) array for each filename search. I don't think it would be unreasonable to mimic the same behavior by searching incrementally for newlines and slicing the string as you go. If that feels like too much for you, I think we should at least split the string once and then keep searching that array until we have a unique name.
  • Per the Rails docs, link_to :method argument should be a symbol in projects/show.html.erb, and should not be specified when you want to GET.
  • doQueueWithProxy should declare keepProxy with var.
  • The "1001 nights" comment sent me off to puzzle out the units. Specifying that it's milliseconds might save the next reader several thousand. ;) I see that this already exists elsewhere our code… but, still.
  • In console.log("onUploaderResolve but locator=" + locator + " and " + _currentSlice.size + " != " + dataSize); - the log message implies that the conditional uses "and," but it's actually "or." I realize just changing the word doesn't quite get the point across either, so I'm not sure what the best wording is.

Code style

I realize that we have the fewest conventions around JavaScript style, and there's nothing saying you to have to follow my suggestions here. I guess in part that I'm arguing for these rules generally. So, uh, here are my highly subjective opinions; take them as you will.

  • I feel like it would be good for us to use === and !== as much as possible, to avoid surprising type coercions.
  • Personally I like to use braces after every conditional. I don't mind if they get omitted in the very simple single-statement block case, but I feel semi-strongly about having them when the "single statement" is itself compound, as in UploadToCollection.addFilesToQueue and QueueUploader.doQueueWork (could this one be flattened by initializing the loop with i=1?).
  • I suggest $.each for simple array iteration, particularly the two loops in QueueUploader.appendToCollection (twice).

Thanks.

#19 Updated by Tom Clegg almost 6 years ago

Brett Smith wrote:

  • The pulldown goes off the right edge of my browser, prompting a horizontal scrollbar. Obviously this is dependent on a lot of my setup, but it seems like we'll almost always want it to veer left, like the account pulldown. Is it possible to hint that?

Indeed, fixed. (class="pull-right", easy!)

  • This is a very soft comment, I admit, but the button just looks out of place, being a white pulldown next to two blue buttons. It's a little easy to miss (camouflaged with the background!) and no other page has a pulldown there. Is there another presentation that might make more sense?

Fixed (I think) by making it look just like the other buttons.

  • Reporting sizes as "4 KiB" rather than "4K" would be more consistent with the rest of Workbench. Plus "0K" can be mistaken for the word "OK."

Fixed, db09f10 (size, progress, and speed all say KiB now)

  • After I successfully upload files, the Start button is enabled, even though there's nothing to start.

Fixed.

  • I'm concerned that uniqueNameForManifest won't scale well with large manifests, primarily because it splits the string into a (potentially large) array for each filename search. I don't think it would be unreasonable to mimic the same behavior by searching incrementally for newlines and slicing the string as you go. If that feels like too much for you, I think we should at least split the string once and then keep searching that array until we have a unique name.

Fixed in 90ecd74 + 254bd07 by using regexps instead of split(). It still keeps a copy of the entire current line of the manifest in memory, but at least now it's just one string rather than the previous array of all lines plus array of all tokens in the current line.

It would be more efficient to just track the start/end indices of the current line, and search for tokens within those bounds, but I think JS's limited regexp facilities could turn that into a bigger job (and a bigger, uglier, chunk of code) than it deserves.

I considered iterating over all tokens in the manifest without regard to line breaks, and relying on the fact that stream names always start with "." and no other kind of token ever does. Somehow this feels a bit sketchy, although it doesn't seem like it will break on any well-formed manifests, or do worse than the existing code on an ill-formed manifest. It would get rid of the remaining large string copy, which would be nice. WDYT?

  • Per the Rails docs, link_to :method argument should be a symbol in projects/show.html.erb, and should not be specified when you want to GET.

Fixed during merge f59d6d7. (Hey, no fair, you fixed these in master and merged them while this branch was waiting for review!)

  • doQueueWithProxy should declare keepProxy with var.

That keepProxy isn't local to the function, it's an UploadToCollection instance var, declared way up before SliceReader.

I agree it's a bit hard to follow, though. Perhaps it should be an attribute of $scope instead?

  • The "1001 nights" comment sent me off to puzzle out the units. Specifying that it's milliseconds might save the next reader several thousand. ;) I see that this already exists elsewhere our code… but, still.

Noted milliseconds.

  • In console.log("onUploaderResolve but locator=" + locator + " and " + _currentSlice.size + " != " + dataSize); - the log message implies that the conditional uses "and," but it's actually "or." I realize just changing the word doesn't quite get the point across either, so I'm not sure what the best wording is.

Hmph. It turns out the second part of that was just testing whether the SliceUploader successfully copied an int from one variable to another anyway. I've replaced it in 1c8feea with a test based on the size hint in the returned locator, and the error message now looks like this:

onUploaderResolve, but locator '7c32a62d7e9ddead2ae62ad8b70ec792+94727+Ac1e4c253a159f9d7c8f1d5eecfc44e556e0acd1f@5493240b' with size hint '94727' does not look right for dataSize=94728
  • I feel like it would be good for us to use === and !== as much as possible, to avoid surprising type coercions.

Agreed, fixed several. (This is one reason coffeescript seems like a good idea...)

  • Personally I like to use braces after every conditional. I don't mind if they get omitted in the very simple single-statement block case, but I feel semi-strongly about having them when the "single statement" is itself compound, as in UploadToCollection.addFilesToQueue and QueueUploader.doQueueWork (could this one be flattened by initializing the loop with i=1?).

Both cases were actually an empty loop followed by (not containing) a loop/condition, so your point about readability is well made. I've rearranged both to have something in the loop body, added comments, and fixed the variable names to be more helpful too.

...then I noticed a bug where the last item doesn't get pushed to the bottom of the queue like everyone else, and fixed it by deleting the "push any Done things to bottom" stuff and rewriting it as "push the thing(s) I just did to the bottom" during appendToCollection.

  • I suggest $.each for simple array iteration, particularly the two loops in QueueUploader.appendToCollection (twice).

Fixed. (Except the second one, which is no longer simple enough because it reorders the array.)

Summary:
  • Now at 254bd07
  • View diffs since note-18, not including "merge master": git diff f59d6d7...254bd07

#20 Updated by Brett Smith almost 6 years ago

Tom Clegg wrote:

Brett Smith wrote:

  • I'm concerned that uniqueNameForManifest won't scale well with large manifests, primarily because it splits the string into a (potentially large) array for each filename search. I don't think it would be unreasonable to mimic the same behavior by searching incrementally for newlines and slicing the string as you go. If that feels like too much for you, I think we should at least split the string once and then keep searching that array until we have a unique name.

[snip]

I considered iterating over all tokens in the manifest without regard to line breaks, and relying on the fact that stream names always start with "." and no other kind of token ever does. Somehow this feels a bit sketchy, although it doesn't seem like it will break on any well-formed manifests, or do worse than the existing code on an ill-formed manifest. It would get rid of the remaining large string copy, which would be nice. WDYT?

I prefer more robust parsing to that much efficiency. One potentially large string might be an issue, but it's still just one object that we create once and hold in memory exactly as long as we need it, so you're being pretty responsible with it. I'd prefer to wait and see if this actually becomes an issue before compromising the parser.

  • Per the Rails docs, link_to :method argument should be a symbol in projects/show.html.erb, and should not be specified when you want to GET.

Fixed during merge f59d6d7. (Hey, no fair, you fixed these in master and merged them while this branch was waiting for review!)

Urgh, I'm sorry, I know how much that sucks. :( The one thing I'll say in my defense is that I think it would've played out differently if #4705 hadn't interrupted my priorities.

  • doQueueWithProxy should declare keepProxy with var.

That keepProxy isn't local to the function, it's an UploadToCollection instance var, declared way up before SliceReader.

I agree it's a bit hard to follow, though. Perhaps it should be an attribute of $scope instead?

Hrrm. I don't have a strong enough sense of what $scope is for have an educated opinion on that. Sorry for my misreading; thanks for explaining.

  • I feel like it would be good for us to use === and !== as much as possible, to avoid surprising type coercions.

Agreed, fixed several. (This is one reason coffeescript seems like a good idea...)

I noticed that there are still some two-character operators in the Angular expressions. Does Angular require this, or expect it as established style?

Comments on 254bd07:

Scalability

In a fit of pique, I opened the file browser, selected 1000 files, and tried to queue them all at once. Two funny results.

First, my browser hung for about a minute with the file dialog opened and "Open" clicked before it did anything. I was kind of expecting the Firefox "slow script" dialog to appear, but it didn't; it just kept chugging away to completion. (It did complete, so, huzzah!) I realize that there's a lot out of your control here, and that handling this case well is always going to be difficult… but I just know in my bones that, with this feature, someone is going to go into a large directory, Ctrl+A all the files, and try to upload them at once. If there's some low-hanging fruit you can implement to make this perform better, I think it'd be worthwhile; but if not, I wanted to at least note the issue.

Second, when I started the upload (yup, that worked too, huzzah!), there was a noticeable pause in between file uploads, compared to the case of uploading a smaller number of files. I think this happens because the JavaScript spends a lot more time rearranging the large queue array. It got me wondering: would it make sense to have a separate array for done items, and just move things from the queue to there when they finish? The interface display code would have to know to look at both the queue and the done list, but I think it might ultimately simplify the code and provide a noticeable performance gain. If you want to punt on this, though, I understand.

Timing bugs

I think there might be a race condition if the user pauses in close proximity to a file upload finishing. Steps to reproduce:

  • Choose a small but not miniscule file to upload—tens of kibibytes.
  • Start an upload.
  • Click the Pause button. Quick, while you can!
  • Assuming your network behaved, the uploader reports that the upload is done. (You can go to the Files tab and you should be able to confirm that it's there.)
  • In the uploader, choose a different file to upload, and upload it. You will probably notice the progress bar of the first file flicker, even though it's marked finished.
  • Go to the Files tab. The first file that you uploaded appears in the collection twice.

It's a little touchy and might take two or three tries to reproduce, but 4xphq-4zz18-fqe7azfx1l1owa3 is an example of what can happen if you're playing with it and really hammering it. I did not consciously decide to re-upload most of these files (the one exception is agpl-3.0 (2).txt), but there are many duplicates.

There's also a minor bug where it looks like the upload tab might retain some state, and flicker elements from the previous view when you open it between page loads. Steps to reproduce:

  • Go to a Collection upload tab, and make some interesting state. Ideally, upload a few files. At least queue some up.
  • Leave the page.
  • Visit a different Collection page.
  • Go to the upload tab.

I usually see some UI elements flicker—a success/error message, and progress bars—before the upload tab renders its initial state. It's reminiscent of seeing browsers flicker unstyled HTML before they loaded CSS. I think it's fair to punt on this if the fix is relatively involved.

Presentation

For the "Copy data from a different project" menu item, I'd suggest "another project" as more pleasant wording (plus it would be consistent with the item's title attribute).

These are low severity and can be punted if you want to make separate issues for them:

  • When I pause, the "stopped: interrupted" message is presented like an error, even though it's expected behavior. It would be nice if the presentation were a little less dramatic.
  • If I click "Add data" to pull down the menu, then click "Add data" again, the pulldown is dismissed (as expected) but the button sticks in the shaded state. Even when I move my mouse out of it (so :hover no longer applies), it's shaded until I click something else.

Thanks.

#21 Updated by Tom Clegg almost 6 years ago

Brett Smith wrote:

I prefer more robust parsing to that much efficiency. One potentially large string might be an issue, but it's still just one object that we create once and hold in memory exactly as long as we need it, so you're being pretty responsible with it. I'd prefer to wait and see if this actually becomes an issue before compromising the parser.

Agreed, I'll leave it where it is for now.

Urgh, I'm sorry, I know how much that sucks. :( The one thing I'll say in my defense is that I think it would've played out differently if #4705 hadn't interrupted my priorities.

No worries, I was just practicing whining.

That keepProxy isn't local to the function, it's an UploadToCollection instance var, declared way up before SliceReader.

I agree it's a bit hard to follow, though. Perhaps it should be an attribute of $scope instead?

Hrrm. I don't have a strong enough sense of what $scope is for have an educated opinion on that. Sorry for my misreading; thanks for explaining.

NP. On second thought, putting it in $scope would make it available to others (like the view), which isn't the intent. I suppose a naming scheme for private instance variables could make this less surprising, but for now I'm OK leaving it alone if you are.

I noticed that there are still some two-character operators in the Angular expressions. Does Angular require this, or expect it as established style?

Whoops, no, that's just me forgetting to check the other two files when I checked upload_to_collection.js. I've fixed them in the view and in arvados_client.js too in f08c718.

First, my browser hung for about a minute with the file dialog opened and "Open" clicked before it did anything. I was kind of expecting the Firefox "slow script" dialog to appear, but it didn't; it just kept chugging away to completion. (It did complete, so, huzzah!) I realize that there's a lot out of your control here, and that handling this case well is always going to be difficult… but I just know in my bones that, with this feature, someone is going to go into a large directory, Ctrl+A all the files, and try to upload them at once. If there's some low-hanging fruit you can implement to make this perform better, I think it'd be worthwhile; but if not, I wanted to at least note the issue.

Hm, Chrome took about 3 seconds to queue 1000 files, and 1 second to queue the same set again. I'm inclined to put this on a wish list for the next round.

Second, when I started the upload (yup, that worked too, huzzah!), there was a noticeable pause in between file uploads, compared to the case of uploading a smaller number of files. I think this happens because the JavaScript spends a lot more time rearranging the large queue array. It got me wondering: would it make sense to have a separate array for done items, and just move things from the queue to there when they finish? The interface display code would have to know to look at both the queue and the done list, but I think it might ultimately simplify the code and provide a noticeable performance gain. If you want to punt on this, though, I understand.

Oooh, this part is slow in Chrome too, and locks up the browser for long periods, making it really hard to use the Pause button. I'm not certain where this is on the scale of "angular isn't cut out for this" to "one too many O(M×N) loops" but given the above, I'm mostly inclined to either
  • punt for now, or
  • put a message above the "choose" button: "You can select many files at once, but stick to ~100 at a time to keep your browser peppy." (Better word for "peppy"?)

I was thinking the same thing about having two lists (queue and done) but I don't think that would reduce the array shuffling all that much (if at all). Anyway, I don't want to get too caught up in performance while the alternative is no web upload at all. ;)

I think there might be a race condition if the user pauses in close proximity to a file upload finishing. Steps to reproduce:

Indeed, "pause during appendToCollection" was being handled incorrectly. I've added a state called "Uploaded" (all bytes are stored, but appendToCollection hasn't succeeded) and -- as far as I can tell with my handy collection of 2-byte files -- fixed up the promise handling and the stop() methods. This all makes state=='Done' a much more reliable indicator of whether the file has been added to the collection.

I usually see some UI elements flicker—a success/error message, and progress bars—before the upload tab renders its initial state. It's reminiscent of seeing browsers flicker unstyled HTML before they loaded CSS. I think it's fair to punt on this if the fix is relatively involved.

There's an "ng-cloak" attribute that's supposed to handle this sort of thing. I've added some CSS that supposedly helps, which might make it a bit better (hard to say). Still doesn't seem to work perfectly. Bleh. I'm OK with leaving further perfection on the wish list...

For the "Copy data from a different project" menu item, I'd suggest "another project" as more pleasant wording (plus it would be consistent with the item's title attribute).

SGTM, changed.

Oh, and I noticed the title attributes on both items are totally redundant now that the link text itself is a dropdown-menu-item-length phrase rather than a short button label, so I removed them.

  • When I pause, the "stopped: interrupted" message is presented like an error, even though it's expected behavior. It would be nice if the presentation were a little less dramatic.

Indeed. Changed this to "alert-info" style, with the message "Paused. Click the Start button to resume uploading."

  • If I click "Add data" to pull down the menu, then click "Add data" again, the pulldown is dismissed (as expected) but the button sticks in the shaded state. Even when I move my mouse out of it (so :hover no longer applies), it's shaded until I click something else.

Although I agree it looks odd, the bootstrap examples do the same thing, so I'm inclined to leave it alone. (Perhaps it's deliberate, to make keyboard navigation better?)

Now at 8475c10 15277a3 with master merged.

#22 Updated by Brett Smith almost 6 years ago

Issues noted as of 15277a34:

  • When I upload empty files, the uploader generates the manifest line . 0:0:filename\n. According to our manifest format documentation this isn't legal (every line must list at least one data block), although the Ruby SDK parser copes. If the docs need to be updated, doing so could be a valid fix. :)
  • I managed to pause near the end of the upload such that all files were in the collection, but the uploader displayed the "Paused" message, and both the "Pause" and "Start" buttons were disabled. I have a screenshot if it's helpful.
  • _show_upload.html.erb tries to inspect the .committed boolean, but it no longer exists. (Maybe this is related to the previous issue?)

I'm fine deferring issues as you suggested. Other notes:

Tom Clegg wrote:

I was thinking the same thing about having two lists (queue and done) but I don't think that would reduce the array shuffling all that much (if at all).

I agree it should be the same number of removes and inserts, but I thought it would at least save you from repeatedly searching for the "not done/done" partition point in the array. Plus you could append new items to the end the respective array, rather than sometimes splicing in the middle, which I'm guessing ought to be faster. But I'm fine with dealing with this later.

There's an "ng-cloak" attribute that's supposed to handle this sort of thing. I've added some CSS that supposedly helps, which might make it a bit better (hard to say). Still doesn't seem to work perfectly.

Yeah, it didn't completely solve the issue for me either, sorry. For what it's worth, I noticed that this apparently isn't quite as dependent on previous state as I thought. I definitely saw a flicker that include both a success and error message, so it looks like it's rendering all of the elements before evaluating their ng-show expressions. Also fine with deferring this.

#23 Updated by Tom Clegg almost 6 years ago

Brett Smith wrote:

Issues noted as of 15277a34:

  • When I upload empty files, the uploader generates the manifest line . 0:0:filename\n. According to our manifest format documentation this isn't legal (every line must list at least one data block), although the Ruby SDK parser copes. If the docs need to be updated, doing so could be a valid fix. :)

Ah, good catch. Fixed. (Also fixed progress bar: should be 100%, not 0÷0, for a zero-byte file.)

  • I managed to pause near the end of the upload such that all files were in the collection, but the uploader displayed the "Paused" message, and both the "Pause" and "Start" buttons were disabled. I have a screenshot if it's helpful.
Indeed, I found that clicking "Pause" after appendToCollection starts (but before the last file is Done) would reliably produce this effect. Now (2d5c3de), that situation is handled like this:
  1. Click Pause
    • Queue state changes to Stopped. Info box says "paused". "Start" button is clickable.
    • (Hard to see this state because it doesn't last very long)
  2. Last file ends up getting Done (despite Paused state) because the async collection-update sequence had already been started and the server ends up acknowledging the update.
    • Queue state changes to Done. All files say 100%/done.
    • (This part is unchanged) "Start" and "Pause" buttons are both disabled because there is nothing to do.
  • _show_upload.html.erb tries to inspect the .committed boolean, but it no longer exists. (Maybe this is related to the previous issue?)

Oops, fixed that too, 09cfb00.

#24 Updated by Brett Smith almost 6 years ago

44365f7 is good to merge, thanks.

#25 Updated by Anonymous almost 6 years ago

  • Status changed from In Progress to Resolved

Applied in changeset arvados|commit:8453812fac25bae327b8fa52c5a920b1d921e8be.

Also available in: Atom PDF