Storing and Organizing Data » History » Version 18
Tom Clegg, 04/09/2014 01:52 PM
h1. Storing and Organizing Data

Rough demo outline

# Automatic ingest from a POSIX directory to Keep
#* Access to an existing staging area (e.g., a remote NFS share) is arranged ahead of time as an admin/setup task
#* Optional(?): users can manage staging areas hosted inside Arvados
#* Someone (a "3rd party") uploads some files to the staging area via SFTP or similar
#* The 3rd party makes an API call to {something: an ingestor app? directly to an Arvados API endpoint?}. In the API call, the uploader provides a tag (e.g., a sample ID) and a list of files, checksums, etc.
#* The ingestor daemon reads the data from the staging area and writes it into Keep, creating one collection per API call made by the uploader
#* In Workbench, the imported datasets appear as collections in the designated project
#* After the data has been copied into Keep, the ingestor deletes the files from the staging area (this had better be configurable!)
...
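
The check-then-clean-up part of the ingest flow above could look roughly like the following sketch. Everything here is an assumption made for illustration: the request payload shape, the use of MD5 as the checksum, and the names @check_ingest_request@, @cleanup_staging@ and @delete_after_ingest@. Writing to Keep and creating the collection are deliberately left out.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_ingest_request(staging_dir, request):
    """Compare the uploader's file list against the staging area.

    `request` is a hypothetical payload, e.g.
        {"tag": "sample-123", "files": [{"path": "a.txt", "md5": "..."}]}
    Returns (ok_paths, problems); problems is a list of (path, reason).
    """
    staging_dir = Path(staging_dir).resolve()
    ok, problems = [], []
    for entry in request["files"]:
        target = (staging_dir / entry["path"]).resolve()
        # Refuse paths that escape the staging area (e.g. "../x").
        if staging_dir not in target.parents:
            problems.append((entry["path"], "outside staging area"))
        elif not target.is_file():
            problems.append((entry["path"], "missing"))
        elif file_md5(target) != entry["md5"]:
            problems.append((entry["path"], "checksum mismatch"))
        else:
            ok.append(entry["path"])
    return ok, problems


def cleanup_staging(staging_dir, paths, delete_after_ingest=False):
    """Delete ingested files only when the (configurable) flag is set."""
    removed = []
    if delete_after_ingest:
        for rel in paths:
            (Path(staging_dir) / rel).unlink()
            removed.append(rel)
    return removed
```

Defaulting @delete_after_ingest@ to off keeps the destructive step opt-in, which seems the safer reading of "this had better be configurable".
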
# My data gets into the right project as specified by the uploader (API call)
#* How is the staging-area ↔ project mapping specified, and how/where is it encoded/stored?
...
# Subscribe to notifications (by email and/or Workbench dashboard): when files start/finish uploading; when files are shared with customer; when files are downloaded by third party
#* For now, use existing Logs table + automatic logging of create/update/delete operations
...
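
As a rough illustration of driving notifications from logged create/update/delete operations, the sketch below filters log rows against a subscription. The row and subscription shapes, including the field names @event_type@, @object_uuid@, @owner_uuid@ and @project_uuid@, are assumptions made for this example and not a statement about the actual Logs table schema.

```python
def matching_events(log_rows, subscription):
    """Return the log rows a subscription should be notified about.

    Both structures are hypothetical:
      log row:      {"event_type": "create", "object_uuid": "...",
                     "owner_uuid": "..."}
      subscription: {"project_uuid": "...", "event_types": {"create"}}
    """
    return [
        row for row in log_rows
        if row["owner_uuid"] == subscription["project_uuid"]
        and row["event_type"] in subscription["event_types"]
    ]


# Example data: two events in project "p1", one in project "p2".
rows = [
    {"event_type": "create", "object_uuid": "c1", "owner_uuid": "p1"},
    {"event_type": "update", "object_uuid": "c1", "owner_uuid": "p1"},
    {"event_type": "create", "object_uuid": "c2", "owner_uuid": "p2"},
]
sub = {"project_uuid": "p1", "event_types": {"create", "delete"}}
to_notify = matching_events(rows, sub)
```

The same filter could feed either an email sender or a Workbench dashboard panel; how delivery would work is not covered here.
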
# Move/copy collections between projects (Project RX1234, or Customer X’s files), tag them in the destination project with the appropriate string (e.g., sample ID), defaulting to the existing tag used in the source project (e.g., provided at time of upload).
#* UI for presenting Groups as Projects/Folders: create, view, rename, share, delete
#* UI for copying/moving objects between folders
#* How to avoid confusion about "is this one object in two places, or are there two objects?" Note GDocs has a bit of both: "My Drive" / "Shared with me" vs. regular folders
...
# “Anyone with this secret link can view/download” mode. Enable, disable, change magic link. Use cases: browser + “wget -r”.
#* Perhaps the secret in the secret link is an ApiClientAuthorization token, belonging to the person creating the link, scoped to a single project/collection
#* How do we implement "Anonymous user, not logged in"?
...
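
The enable / disable / change-link lifecycle can be modelled in a few lines. This is a minimal in-memory sketch and does not describe how ApiClientAuthorization tokens actually work: the class @SecretLinks@ and its methods are hypothetical, and "scoped to a single project/collection" is represented simply by keying each secret on one target UUID.

```python
import hmac
import secrets


class SecretLinks:
    """In-memory model of per-target secret links (illustration only)."""

    def __init__(self):
        self._token_by_target = {}  # target uuid -> current secret token

    def enable(self, target_uuid):
        """Create (or replace) the secret for a target and return it."""
        token = secrets.token_urlsafe(32)
        self._token_by_target[target_uuid] = token
        return token

    # Changing the magic link is the same as issuing a fresh secret;
    # the previous secret stops working immediately.
    rotate = enable

    def disable(self, target_uuid):
        """Turn off link sharing for a target (no-op if not enabled)."""
        self._token_by_target.pop(target_uuid, None)

    def allows(self, token, target_uuid):
        """True only if token is the current secret for exactly this target."""
        current = self._token_by_target.get(target_uuid)
        if current is None:
            return False
        # Constant-time comparison to avoid leaking the secret via timing.
        return hmac.compare_digest(current.encode(), token.encode())
```

For example, after @t = links.enable("c1")@, @links.allows(t, "c1")@ is true while @links.allows(t, "c2")@ is false, and calling @links.rotate("c1")@ or @links.disable("c1")@ invalidates @t@. The "anonymous user, not logged in" question stays open in this sketch: a request carrying a valid secret is allowed without any user identity attached.
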
# See log/overview of who has accessed your shared data (incl. “anonymous user” if using secret-link-to-share); when shared/unshared; when each upload started/finished. This should be available for a single project and across all projects.