Storing and Organizing Data » History » Revision 6

Revision 5 (Tom Clegg, 02/27/2014 01:49 PM) → Revision 6/33 (Tom Clegg, 04/04/2014 01:54 PM)

h1. +2014-05-07 Organizing Stored Data+ -2014-03-26 Workflow sharing and dev tools-

 +"Organizing data" demo outline+ -Sprint theme: *When I do something useful, I want other people to understand and use it.*-

 Demo: (still rough!) 
 # +Datasets get into my workspace (sftp upload e.g. from sequencing provider to a special staging directory + API call to {something? ingestor app? arvados?}).+ -Run existing pipeline (choose from bundled pipelines: GATK, LobSTR, etc)-
 # +My data gets into the right project as specified by the uploader (API call)+ -Edit the pipeline (change version/arguments, add a component)-
 # +Subscribe to notifications (by email and/or Workbench dashboard): when files start/finish uploading; when files are shared with customer; when files are downloaded by third party+ -Optional: edit a crunch script, push to git repo-
 # +Move/copy collections between projects (Project RX1234, or Customer X’s files), tag them in destination project with the appropriate string (e.g., sample ID) -- defaulting to existing tag used in source project (e.g., provided at time of upload).+ -Run my new pipeline-
 # -Share my new pipeline and my worked example (including data, tags, source code, etc) with all users in the “my lab” group-
 # +“Anyone with this secret link can view/download” mode. Enable, disable, change magic link. Use cases: browser + “wget -r”.+ -As another lab member: log in, find the shared workflow, inspect details, run it myself with my own input data.-
 # +See log/overview of who has accessed your shared data (incl. “anonymous user” if using secret-link-to-share); when shared/unshared; when each upload started/finished -- for a single project, and across all projects+ -As initial user: explain who can see the various things on this page, and why (because they're in some group? etc).-
 # As initial user: whoops, un-share.
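
 The secret-link and access-log items above could be prototyped roughly as follows. This is only a sketch to make the demo flow concrete: @SharedProject@, @enable_link@, @disable_link@ and @access@ are hypothetical names invented here, not Arvados or Workbench APIs, and the in-memory log stands in for whatever the real system would record.

```python
import secrets
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional, Tuple


@dataclass
class SharedProject:
    """Hypothetical in-memory stand-in for a project with secret-link sharing."""
    name: str
    token: Optional[str] = None  # None means link sharing is disabled
    log: List[Tuple[datetime, str, str]] = field(default_factory=list)

    def _record(self, who: str, event: str) -> None:
        # Each entry: (UTC timestamp, who did it, what happened).
        self.log.append((datetime.now(timezone.utc), who, event))

    def enable_link(self, owner: str) -> str:
        """Enable sharing, or change the magic link if one already exists."""
        event = "link changed" if self.token else "shared"
        self.token = secrets.token_urlsafe(32)  # unguessable URL-safe token
        self._record(owner, event)
        return self.token

    def disable_link(self, owner: str) -> None:
        """Un-share: any previously issued link stops working."""
        self.token = None
        self._record(owner, "unshared")

    def access(self, presented_token: str) -> bool:
        """Check a presented token; log successful access as 'anonymous user'."""
        if self.token is None:
            return False
        # Constant-time comparison of the byte encodings.
        if not secrets.compare_digest(self.token.encode(), presented_token.encode()):
            return False
        self._record("anonymous user", "viewed/downloaded")
        return True


if __name__ == "__main__":
    project = SharedProject("Project RX1234")
    link = project.enable_link("owner")
    print(project.access(link))   # link holder gets in
    project.disable_link("owner")
    print(project.access(link))   # whoops, un-shared: link no longer works
    for when, who, event in project.log:
        print(when.isoformat(), who, event)
```

 Changing the link simply issues a new token, so the old URL stops working without a separate revocation list. Whether the real feature should behave that way is an open design question for the demo.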