Feature #5375
closed[workbench] Enable in-browser preview for text files in collections
Description
The magnifying class should be enabled --
Just displaying the first 100kb is fine.
https://github.com/github/linguist/blob/master/lib/linguist/languages.yml
I suspect a subset will do initially, like
py, R, sh, xsl,
md, yml
txt, json, xml, tsv, csv
fa, fasta, sam, (not bam), vcf
Most specifically, I want support for *.fa and *.fasta
https://workbench.qr1hi.arvadosapi.com/collections/qr1hi-4zz18-7zk4muy5grnaqpv
Related issues
Updated by Radhika Chippada over 9 years ago
- Category set to Workbench
- Target version set to Bug Triage
Updated by Tom Clegg over 9 years ago
- Tracker changed from Bug to Feature
- Story points set to 0.5
Updated by Tom Clegg over 9 years ago
- Target version changed from Bug Triage to 2015-04-01 sprint
Updated by Radhika Chippada over 9 years ago
- Assigned To set to Radhika Chippada
Updated by Tom Clegg over 9 years ago
- its content type is "text/*" or "image/*"
- its content type is "application/X" where X is something we whitelist, like pdf
Updated by Radhika Chippada over 9 years ago
Notes about implementation:
- The gem mime_types supports 1868 mime types. The data file is within the gem. It is currently not supporting just 5 of the types listed by Nancy. These are: fa, fasta, go, r, sam
- Added these unsupported types to mime_types using a rails initializer. These are added as "application" types.
- Added a helper utility that gets the mime type using MIME.
- If the file is text or image per the MIME type, true is returned.
- On the other hand, if it is an application type, true is returned if it is white listed using the config parameter "filename_suffixes_with_view_icon"
- Updated collections/_show_files.html to use the helper utility
- Added tests to verify that each of the type listed in the story description is viewable.
Updated by Radhika Chippada over 9 years ago
- Status changed from New to In Progress
Updated by Brett Smith over 9 years ago
Reviewing a9c8f2d
There should be a one-to-one mapping between file formats, and MIME types. Several different file extensions may map to the same MIME type because they refer to the same format (e.g., .tgz
and .tar.gz
)—and that's good, because it means that we can more easily identify files without sweating the details of the extension. This feature is why it would be helpful to rely on MIME type information exclusively to identify files, and get away from any extension parsing and recognition code.
Along those lines:
- According to Wikipedia, the extensions
.fasta
,.fas
,.fa
,.seq
, and.fsa
can all refer to FASTA files. All of these should be registered as the same MIME type—probablyapplication/fasta
, as in your first registration. - It would be good to replace the configuration setting
filename_suffixes_with_view_icon
with something likeapplication_mimetypes_with_view_icon
. This would be a list of MIME subtypes underapplication/
that are viewable in most browsers. This way, we can just list FASTA files once with a name likefasta
rather than listingfa
andfasta
(and the other extensions) separately. It also makes it easier for us to handle file names that have more than one extension.
Other comments:
- With run-tests.sh, I got the following test failures:
2) Failure: ActionsControllerTest#test_combine_files_into_new_collection [/home/brett/repos/arvados/apps/workbench/test/controllers/actions_controller_test.rb:39]: Not found 0:0:file1 0:0:file2 0:0:file3 in new collection manifest text 3) Failure: ActionsControllerTest#test_combine_files_with_repeated_names_into_new_collection [/home/brett/repos/arvados/apps/workbench/test/controllers/actions_controller_test.rb:69]: Not found 0:0:file1 0:0:file2 0:0:file3 in new collection manifest text
If I run just that test suite, I get a couple more:
1) Failure: ActionsControllerTest#test_combine_collections_with_repeated_filenames_in_almost_similar_directories_and_expect_files_with_proper_suffixes [/home/brett/repos/arvados/apps/workbench/test/controllers/actions_controller_test.rb:100]: Not found: alice(1) in dir1 in manifest text ./dir1 92b53930db60fe94be2a73fc771ba921+34+A9b99704178af2818f0581590e6997dd99f7ac4e4@551d559a 0:12:alice 12:12:alice.txt 24:10:bob.txt ./dir2 56ac2557b1ded11ccab7293dc47d1e88+44+Af6a5e1e6eee64efa20cddbc7920bf2955db78792@551d559a 0:27:alice.txt ./dir1 92b53930db60fe94be2a73fc771ba921+34+A9b99704178af2818f0581590e6997dd99f7ac4e4@551d559a 0:12:alice(1) 12:12:alice(1).txt 24:10:carol.txt ./dir2 56ac2557b1ded11ccab7293dc47d1e88+44+Af6a5e1e6eee64efa20cddbc7920bf2955db78792@551d559a 0:27:alice(1).txt . acbd18db4cc2f85cedef654fccc4a4d8+3+A53f7e7c7f8d8b902474fa78ceb2b9def18b19487@551d559a 0:3:foo . acbd18db4cc2f85cedef654fccc4a4d8+3+A53f7e7c7f8d8b902474fa78ceb2b9def18b19487@551d559a 0:3:foo(1) 4) Failure: ActionsControllerTest#test_combine_foo_files_from_two_different_collection_streams_and_expect_proper_filename_suffixes [/home/brett/repos/arvados/apps/workbench/test/controllers/actions_controller_test.rb:162]: Incorrect number of streams in . acbd18db4cc2f85cedef654fccc4a4d8+3+A41aa3b6158584ce0ceb94554d6f1faed7e0d6ff9@551d55a4 0:3:foo . a84b928ebdbae3f658518c711beaec02+28+Af25e2bad91a06afe446a297d255da0ce6ce4d19b@551d55a4 12:16:foo(1) . Expected: 1 Actual: 2
- Please make a method to register a new MIME type if it doesn't exist. I know the block is simple to copy and paste now, but I won't be surprised if we end up extending some of the conditional logic in the future. Making a method now will help with that.
- Why does the initializer register the types with Mime::Type? According to the comment, that's for Rails' use in responding to specific formats, and I don't believe we want to recognize any of these formats in responses.
- Why does the initializer
include MIME
? - Please make sure that the portable data hashes for the new Collection fixtures correspond to the manifest text. It helps make the fixtures more useful for other tests in the future if they already have good data. And it helps avoid questions about whether or not the portable data hash is "intentionally" wrong.
Thanks.
Updated by Radhika Chippada over 9 years ago
Brett, thanks for the review comments. I addressed all the comments. (I was being lazy about the new fixture pdf; now that I updated it, there is now no conflict with the other action_controller tests.)
As for "include MIME", without it workbench does not find MIME and I cannot access any pages. I think it is got to do with the order of loading gems during initialization.
Thanks.
Updated by Brett Smith over 9 years ago
At 4a8dc95, I just have one organizational thought. Following Rails convention, it would be ideal to test CollectionsHelper in a dedicated test case under tests/helpers/
, rather than mixing the tests with the collections unit tests. (I apologize for not noting this last time.) This would make it easier to find and extend the tests when doing future work on the helper.
Everything else, including the tests themselves, looks great. Thanks.
Updated by Radhika Chippada over 9 years ago
- Status changed from In Progress to Resolved
- % Done changed from 0 to 100
Applied in changeset arvados|commit:10d3f91d4b5619172dd5d8fbf3e8aa863b673e2f.
Updated by Radhika Chippada over 9 years ago
- Status changed from Resolved to In Progress
Also added javascript and rtf application types to the list of white-listed applications.
Updated by Radhika Chippada over 9 years ago
- Status changed from In Progress to Resolved
Updated by Nancy Ouyang over 9 years ago
Looks great! I guess I wasn't very clear, but I'd find it very helpful for Workbench to display a few lines from the file inside the browser, regardless of how each of my browsers I have is is set up to handle the MIME type. I definitely don't have an in-browser applications setup to preview fasta files, so right now the "magnifying glass" button does the same thing as the "download button" for me -- it pops up a "download" dialog in my browser. I'd rather have a little preview of the text in the file, sort of how log files are displayed, or even just a raw monospaced text dump of the first 20 kb in a new tab is fine and super helpful.
Updated by Brett Smith over 9 years ago
- Subject changed from [workbench] enable in-browser preview for text files in collections to [workbench] Enable in-browser preview for text files in collections
- Status changed from Resolved to New
- Assigned To deleted (
Radhika Chippada) - Target version changed from 2015-04-01 sprint to Arvados Future Sprints
Reopening to address the above. See also #5511, where a user requested the ability to preview VCF files.
Updated by Peter Amstutz over 3 years ago
- Target version deleted (
Arvados Future Sprints)