[workbench] Enable in-browser preview for text files in collections
The magnifying class should be enabled --
Just displaying the first 100kb is fine.
I suspect a subset will do initially, like
py, R, sh, xsl,
txt, json, xml, tsv, csv
fa, fasta, sam, (not bam), vcf
Most specifically, I want support for *.fa and *.fasta
#6 Updated by Radhika Chippada over 6 years ago
Notes about implementation:
- The gem mime_types supports 1868 mime types. The data file is within the gem. It is currently not supporting just 5 of the types listed by Nancy. These are: fa, fasta, go, r, sam
- Added these unsupported types to mime_types using a rails initializer. These are added as "application" types.
- Added a helper utility that gets the mime type using MIME.
- If the file is text or image per the MIME type, true is returned.
- On the other hand, if it is an application type, true is returned if it is white listed using the config parameter "filename_suffixes_with_view_icon"
- Updated collections/_show_files.html to use the helper utility
- Added tests to verify that each of the type listed in the story description is viewable.
#8 Updated by Brett Smith over 6 years ago
There should be a one-to-one mapping between file formats, and MIME types. Several different file extensions may map to the same MIME type because they refer to the same format (e.g.,
.tar.gz)—and that's good, because it means that we can more easily identify files without sweating the details of the extension. This feature is why it would be helpful to rely on MIME type information exclusively to identify files, and get away from any extension parsing and recognition code.
Along those lines:
- According to Wikipedia, the extensions
.fsacan all refer to FASTA files. All of these should be registered as the same MIME type—probably
application/fasta, as in your first registration.
- It would be good to replace the configuration setting
filename_suffixes_with_view_iconwith something like
application_mimetypes_with_view_icon. This would be a list of MIME subtypes under
application/that are viewable in most browsers. This way, we can just list FASTA files once with a name like
fastarather than listing
fasta(and the other extensions) separately. It also makes it easier for us to handle file names that have more than one extension.
- With run-tests.sh, I got the following test failures:
2) Failure: ActionsControllerTest#test_combine_files_into_new_collection [/home/brett/repos/arvados/apps/workbench/test/controllers/actions_controller_test.rb:39]: Not found 0:0:file1 0:0:file2 0:0:file3 in new collection manifest text 3) Failure: ActionsControllerTest#test_combine_files_with_repeated_names_into_new_collection [/home/brett/repos/arvados/apps/workbench/test/controllers/actions_controller_test.rb:69]: Not found 0:0:file1 0:0:file2 0:0:file3 in new collection manifest text
If I run just that test suite, I get a couple more:
1) Failure: ActionsControllerTest#test_combine_collections_with_repeated_filenames_in_almost_similar_directories_and_expect_files_with_proper_suffixes [/home/brett/repos/arvados/apps/workbench/test/controllers/actions_controller_test.rb:100]: Not found: alice(1) in dir1 in manifest text ./dir1 92b53930db60fe94be2a73fc771ba921+34+A9b99704178af2818f0581590e6997dd99f7ac4e4@551d559a 0:12:alice 12:12:alice.txt 24:10:bob.txt ./dir2 56ac2557b1ded11ccab7293dc47d1e88+44+Af6a5e1e6eee64efa20cddbc7920bf2955db78792@551d559a 0:27:alice.txt ./dir1 92b53930db60fe94be2a73fc771ba921+34+A9b99704178af2818f0581590e6997dd99f7ac4e4@551d559a 0:12:alice(1) 12:12:alice(1).txt 24:10:carol.txt ./dir2 56ac2557b1ded11ccab7293dc47d1e88+44+Af6a5e1e6eee64efa20cddbc7920bf2955db78792@551d559a 0:27:alice(1).txt . acbd18db4cc2f85cedef654fccc4a4d8+3+A53f7e7c7f8d8b902474fa78ceb2b9def18b19487@551d559a 0:3:foo . acbd18db4cc2f85cedef654fccc4a4d8+3+A53f7e7c7f8d8b902474fa78ceb2b9def18b19487@551d559a 0:3:foo(1) 4) Failure: ActionsControllerTest#test_combine_foo_files_from_two_different_collection_streams_and_expect_proper_filename_suffixes [/home/brett/repos/arvados/apps/workbench/test/controllers/actions_controller_test.rb:162]: Incorrect number of streams in . acbd18db4cc2f85cedef654fccc4a4d8+3+A41aa3b6158584ce0ceb94554d6f1faed7e0d6ff9@551d55a4 0:3:foo . a84b928ebdbae3f658518c711beaec02+28+Af25e2bad91a06afe446a297d255da0ce6ce4d19b@551d55a4 12:16:foo(1) . Expected: 1 Actual: 2
- Please make a method to register a new MIME type if it doesn't exist. I know the block is simple to copy and paste now, but I won't be surprised if we end up extending some of the conditional logic in the future. Making a method now will help with that.
- Why does the initializer register the types with Mime::Type? According to the comment, that's for Rails' use in responding to specific formats, and I don't believe we want to recognize any of these formats in responses.
- Why does the initializer
- Please make sure that the portable data hashes for the new Collection fixtures correspond to the manifest text. It helps make the fixtures more useful for other tests in the future if they already have good data. And it helps avoid questions about whether or not the portable data hash is "intentionally" wrong.
#9 Updated by Radhika Chippada over 6 years ago
Brett, thanks for the review comments. I addressed all the comments. (I was being lazy about the new fixture pdf; now that I updated it, there is now no conflict with the other action_controller tests.)
As for "include MIME", without it workbench does not find MIME and I cannot access any pages. I think it is got to do with the order of loading gems during initialization.
#10 Updated by Brett Smith over 6 years ago
At 4a8dc95, I just have one organizational thought. Following Rails convention, it would be ideal to test CollectionsHelper in a dedicated test case under
tests/helpers/, rather than mixing the tests with the collections unit tests. (I apologize for not noting this last time.) This would make it easier to find and extend the tests when doing future work on the helper.
Everything else, including the tests themselves, looks great. Thanks.
#14 Updated by Nancy Ouyang over 6 years ago
Looks great! I guess I wasn't very clear, but I'd find it very helpful for Workbench to display a few lines from the file inside the browser, regardless of how each of my browsers I have is is set up to handle the MIME type. I definitely don't have an in-browser applications setup to preview fasta files, so right now the "magnifying glass" button does the same thing as the "download button" for me -- it pops up a "download" dialog in my browser. I'd rather have a little preview of the text in the file, sort of how log files are displayed, or even just a raw monospaced text dump of the first 20 kb in a new tab is fine and super helpful.
#15 Updated by Brett Smith over 6 years ago
- Subject changed from [workbench] enable in-browser preview for text files in collections to [workbench] Enable in-browser preview for text files in collections
- Status changed from Resolved to New
- Assigned To deleted (
- Target version changed from 2015-04-01 sprint to Arvados Future Sprints
Reopening to address the above. See also #5511, where a user requested the ability to preview VCF files.