UseCases » History » Revision 16

Revision 15 (Sarah Zaranek, 09/20/2022 03:22 PM) → Revision 16/27 (Sarah Zaranek, 10/05/2022 01:57 PM)

h1. Use Cases for WB and Collections/Projects (Keep)

 Note: these are not full use cases; they are simplified to give more of the what and not much of the why or who. We will be boiling these up into higher-level use cases. This is more a list of items that should be included in all those higher-level use cases. We want to ensure not only that these use cases can be done in WB2, but also that the experience in WB2 is equivalent to, or hopefully better than, WB1.

 *Creating New Collections* 
 Summary: Users want to create a new collection in WB.
 Details: User wants to make a collection from any of the following data sources:
 From data already in an existing collection
 From a subset of files in an existing collection
 By combining files from different subsets of different collections
 By uploading files from their desktop
 From the results of an already-run workflow
 By downloading data from another source into a collection 

 *Annotating Collections* 
 Summary: User wants to annotate their collections with such details as sample 
 name, sample type, sequencing method, species, etc.  
 Details: 
 User wants to add annotations/metadata to existing collection 
 User wants to verify the metadata added to the collection

 Examining Collection Files  
 Summary: User is viewing a collection and wants to examine the files and extract the data or metadata.
 Details:
 User wants to find certain files within the collection (visually, or using basic search or advanced search with regexp, etc.)
 User wants to view or download certain files within the collection using a UI
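
The basic and regexp search modes mentioned above could be sketched as follows. This is only an illustration over a plain list of file paths; the function name `find_files` and the sample paths are hypothetical, and nothing here reflects how Workbench implements search.

```python
import re


def find_files(file_paths, pattern, use_regex=False):
    """Return the paths that match a search term.

    Basic mode is a case-insensitive substring match; advanced mode
    treats the pattern as a regular expression.
    """
    if use_regex:
        matcher = re.compile(pattern)
        return [p for p in file_paths if matcher.search(p)]
    needle = pattern.lower()
    return [p for p in file_paths if needle in p.lower()]


# Hypothetical collection listing, for illustration only.
files = [
    "run1/sample_R1.fastq.gz",
    "run1/sample_R2.fastq.gz",
    "notes/README.txt",
]
basic_hits = find_files(files, "readme")
regex_hits = find_files(files, r"_R[12]\.fastq\.gz$", use_regex=True)
```

Basic mode is deliberately forgiving about case, while regexp mode leaves case handling to the pattern itself.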

 Sharing Collection Files  
 Summary: User wants to share the files in a collection
 Details:
 User wants to make the collection available for downloading via ftp, s3 or other 3rd party application
 User wants to share the collection with others (by name) in their organization, others via a defined group, to everyone in their organization, or publicly.     Users want those they share the collection with to have  
 Read permission.  
 Write permissions 
 Manage permissions 

 Finding Existing Collections 
 Summary: User wants to find an existing collection
 Details: User wants to find the collection in one of the following ways:
 By browsing to the project that contains the collection
 By searching the UUID of the project - and then looking through the items in the collection 
 By searching the UUID of the collection 
 By searching the collection name 
 By searching the PDH 
 By searching the collection metadata 
 By searching for the workflow that produced it
 By a shareable URL that points to the project on WB

 Identifying Collections 
 Summary: User needs to ID the collection for use as inputs to a workflow and/or inputs to a command using the API or CLI 
 Details:    User wants to find these specific file IDs 
 User wants to get the collection UUID 
 User wants to get the collection PDH 
 User wants to get the collection metadata
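
Since a user may hold either a UUID or a PDH, a small helper can tell which kind of identifier was pasted. The sketch below assumes the usual Arvados shapes, which are not stated in the text above: a UUID of the form cluster-type-suffix (5, 5 and 15 lowercase alphanumeric characters) and a PDH made of a 32-character hex digest, a plus sign and a size. The type infixes in `TYPE_INFIXES` and the function name are likewise assumptions for illustration.

```python
import re

# Assumed identifier shapes (not taken from the text above):
#   UUID: <5-char cluster id>-<5-char object type>-<15-char suffix>
#   PDH:  <32 hex characters>+<size in bytes>
UUID_RE = re.compile(r"^[a-z0-9]{5}-([a-z0-9]{5})-[a-z0-9]{15}$")
PDH_RE = re.compile(r"^[0-9a-f]{32}\+\d+$")

# Object-type infixes assumed for illustration.
TYPE_INFIXES = {"4zz18": "collection", "j7d0g": "project"}


def classify_identifier(text):
    """Label a string as a PDH, a typed UUID, or unknown."""
    text = text.strip()
    if PDH_RE.match(text):
        return "pdh"
    match = UUID_RE.match(text)
    if match:
        return TYPE_INFIXES.get(match.group(1), "other") + "_uuid"
    return "unknown"


kind = classify_identifier("zzzzz-4zz18-0123456789abcde")
```

A check like this only validates the shape of an identifier; it does not confirm that the object exists or that the user can read it.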

 Editing an Existing Collection  
 Summary: User wants to modify a collection for use in their new project.
 Details: User will want to
 Copy over an existing collection to a new project
 Remove files in their collection to suit their needs for their new bioinformatics task.
 Add files from either their desktop or from another collection into this new collection

 Editing the File Structure of An Existing Collection  
 Summary:    User wants to adjust the file structure of an existing collection to work better as an input to their analysis tools 
 Details: User will want to
 Remove files from folders to have them all in a single flat directory
 Separate different subsets of files and place them in folders 
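
One wrinkle in flattening a folder tree is that two folders may contain files with the same name. The sketch below shows one possible renaming rule, operating only on path strings; the function name `flatten_paths`, the sample paths and the rule itself (prefix a colliding name with its parent folders) are assumptions, not a description of Workbench behavior.

```python
import posixpath


def flatten_paths(paths):
    """Map nested file paths to unique names in a single flat directory."""
    mapping = {}
    used = set()
    for path in sorted(paths):
        base = posixpath.basename(path)
        parent = posixpath.dirname(path).strip("/").replace("/", "_")
        candidate = base
        # On a name collision, prefix the file with its parent folders.
        if candidate in used and parent:
            candidate = f"{parent}_{base}"
        # Last resort: append a counter until the name is unique.
        counter = 1
        while candidate in used:
            stem, ext = posixpath.splitext(base)
            candidate = f"{stem}_{counter}{ext}"
            counter += 1
        used.add(candidate)
        mapping[path] = candidate
    return mapping


flat = flatten_paths([
    "sampleA/reads.fastq",
    "sampleB/reads.fastq",
    "ref/genome.fa",
])
```

Sorting the paths first keeps the renaming deterministic, so the same input always yields the same flat layout.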

 Verifying Correct Collection 
 Summary: User wants to download data from a collection and wants to make sure the collection they found is the correct one
 Details:    User may want to check the following: 
 UUID and/or PDH 
 metadata 
 file contents 
 collection “version” 
 lineage of the collection (i.e. if the collection was generated as the result of a workflow - which workflow created this collection) 

 Creating Projects 
 Summary: User is working in the Arvados Workbench and needs a new Arvados project for a new analysis project they are working on.  
 Details: The user wants to do the following 
 Create new project 
 Create a subproject within the new project (e.g. For Testing vs Final Runs) 
 Name the project/subproject 
 Add description of the project 
 Add metadata to the project   
 Extract UUID of Project for use in workflow inputs or various command line/SDK/API functionality.  
 Copy a URL pointing to the project that can be shared with others
 Mark the project as a “favorite” to be able to find it more easily

 Moving or Creating Items within Project 
 Summary: A user wants to set up their project to do their work. They want to move all relevant existing data and workflows into it, as well as upload new data and create new registered workflows.
 Details: Users may want to:
 Copy existing collections and registered workflows into this project
 Create new collections within the project
 Create new registered workflows in that project
 Run workflows in this project, with the output, logs and other created artifacts contained within this project

 Archiving or Sharing a Project 
 Summary: The work is now finished and the user has the results they need. They want to get the project ready for sharing with their organization, with those outside their organization, or possibly even publicly in a publication.
 Details: Users may want to 
 Clean up the project by removing old collections, logs, processes and subdirectories that are not necessary to keep.  
 Edit the name or metadata for the project 
 Freeze the project 
 Share the project with others in their organization, with everyone in their organization, or publicly. These are possible configurations:
 Others have read permission   
 Others have write permissions 
 Others have manage permissions 

 Identifying an Existing Project 
 Summary: User needs to identify the correct project in which to run their workflow. They think they found the correct project by searching on the project name.
 Details: The user then may want to: 
 Find the UUID of the project 
 Examine aspects of the project to double check it is the project they want to use.    They might want to: 
 Check the contents of the project  
 Check the project description  
 User might want to see the history of the project (*not currently available) 
 User might want to see the metadata for the project 
 Check to see if the project is frozen 

 Finding a Project or Subproject 
 Summary: A user logs back into Arvados after a break and would like to find the project they were working on previously.  
 Details: The user may want to find this project by
 Searching for UUID, metadata or the project name 
 Navigating through the project/subproject hierarchy to find the project 
 Skimming through projects marked as their “Favorites” 
 Looking for projects owned or created by a specific user (* Not currently available)

h1. Use Cases for WB and Workflows (Crunch)

 Submitting Workflows to Arvados Without Command Line 
 Summary: User wants to submit a workflow to run on Arvados without having to use the command line. The workflow CWL file lives in a git repository, on the user’s local machine, or in a collection on Arvados. (* Currently this is not available on WB)
 Details: The user will want to indicate to Arvados which workflow they want to run and have Arvados  
 Parse the CWL file and generate an interactive form to fill out with input values and then submit the workflow via Workbench 
 Upload a YML file of input values, or direct Arvados to an existing YML file in an Arvados collection, and then submit the workflow via Workbench.

 Monitoring Submitted Workflows  
 Summary: User submits a workflow to run on Arvados using the command line. They want to monitor the workflow that they submitted using the Arvados Workbench.  
 Details: The User will want to 
 Find the running workflow 
 Through search 
 Via Project Navigation 
 Check the workflow’s current status 
 If run successfully, find output collection  
 If run successfully, look at how long it took to run 
 If run successfully, estimate costs (* currently only available for CommandLine) 

 Debugging Workflows - Part I
 Summary: User submits a workflow to run on Arvados using the command line. They find out that the workflow did not run successfully. They want to figure out why it failed.
 Details: The user may want to:
 Examine logs 
 Examine inputs 
 Examine command 
 Examine CWL    (* CWL not yet available) 
 Check to see which docker container was used 
 Run crunch-run stats interactively on WB (* Currently only available via the command line) or have those plots available on WB.

 Debugging Workflows - Part II 
 Summary: A user is updating a workflow, and the workflow now stops running after a series of changes. The user needs to compare the new workflow to the old workflow to see why it failed.
 Details:    The User may need to  
 Examine old workflow runs and backtrack to see when was the workflow last working 
 Determine the differences between the main process of the last working workflow and that of the broken workflow. This could include comparing: Inputs, Command, Resource Allocation, Node Type, and Docker Image metadata.
 Determine which steps (if any) were re-used from the working workflow 
 Determine which steps failed in the new workflows 
 Determine if there are any big differences between the steps of the different workflows
 Look at information passed to the steps, including Inputs, Command, Resource Allocation
 Find and compare logs from the relevant workflow steps
 Compare Docker image metadata (docker image ID, name, version, dockerfile)
 Compare crunch-run stats between jobs (* Currently only available via command line)
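
The comparison between a working run and a broken run can be pictured as a field-by-field diff of two run records. The following is a minimal sketch under stated assumptions: each run is represented as a plain dictionary, and the field names (`command`, `container_image`, `runtime_constraints`, `inputs`) and sample values are chosen for illustration rather than taken from the text above.

```python
# Hypothetical field list; adjust to whatever a run record actually holds.
DEFAULT_FIELDS = ("command", "container_image", "runtime_constraints", "inputs")


def diff_runs(good, bad, fields=DEFAULT_FIELDS):
    """Return {field: (good_value, bad_value)} for every field that differs."""
    differences = {}
    for field in fields:
        old, new = good.get(field), bad.get(field)
        if old != new:
            differences[field] = (old, new)
    return differences


# Illustrative records for a last-working run and a failing run.
last_working = {
    "command": ["bwa", "mem", "ref.fa", "reads.fastq"],
    "container_image": "example/bwa:1.0",
    "runtime_constraints": {"vcpus": 4, "ram": 8 * 1024**3},
    "inputs": {"reads": "reads.fastq"},
}
failing = {
    "command": ["bwa", "mem", "ref.fa", "reads.fastq"],
    "container_image": "example/bwa:1.1",
    "runtime_constraints": {"vcpus": 4, "ram": 4 * 1024**3},
    "inputs": {"reads": "reads.fastq"},
}
changes = diff_runs(last_working, failing)
```

Reporting only the fields that changed keeps the user's attention on likely causes, here a new image version and a smaller RAM request.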

 Integrating Git Commit Information with Submitted Workflow
 Summary: User has been working on changes to an existing workflow. They have been submitting workflows managed in a git repository. The workflow stops working or starts returning different outputs.
 Details: The user would like to
 Find the relevant git commit information for the version of the workflow run on Arvados that worked
 Find the relevant git commit information for the version of the workflow that is returning different results or not working

 Calculating Workflow Costs 
 Summary: A user ran a big job and is now worried about how much it cost to run.
 Details: The user wants to
 Find the container request UUID of the workflow
 Run the cost analyzer to see how much it cost to run the entire workflow  
 (* Currently not available in WB) 
 Run the cost analyzer to see which step is costing the most money 
 (* Currently not available in WB) 
 See if it would be possible to run workflow on less expensive instances (using information now only possible from the command line) 
 Estimate how much would it cost to run another similar workflow 
 (* Currently not available in Arvados)
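
The arithmetic behind a per-step cost breakdown is simple: runtime in hours multiplied by the hourly price of the instance the step ran on. The sketch below is not the cost analyzer mentioned above; the function name, record fields, instance names and prices are all hypothetical.

```python
def estimate_cost(steps, hourly_prices):
    """Return (total_cost, cost_per_step, most_expensive_step_name).

    Each step is a dict with 'name', 'instance_type' and 'runtime_seconds'.
    Step names are assumed unique; hourly_prices maps instance type to
    price per hour.
    """
    per_step = {}
    for step in steps:
        hours = step["runtime_seconds"] / 3600
        per_step[step["name"]] = hours * hourly_prices[step["instance_type"]]
    total = sum(per_step.values())
    most_expensive = max(per_step, key=per_step.get) if per_step else None
    return total, per_step, most_expensive


# Made-up numbers, for illustration only.
steps = [
    {"name": "align", "instance_type": "big", "runtime_seconds": 7200},
    {"name": "qc", "instance_type": "small", "runtime_seconds": 1800},
]
prices = {"big": 0.50, "small": 0.10}
total, per_step, costliest = estimate_cost(steps, prices)
```

Swapping a cheaper instance type into `prices` and rerunning the calculation gives a rough answer to whether a less expensive node would have been worthwhile, provided the runtime stays similar.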

 Creating a Registered Workflow Using Workbench
 Summary: User wants to create a registered workflow interactively using the WB (* Only currently available using command line).
 Details: The user may want to:
 Share the registered workflow with others in their organization
 Set default values for the workflow
 Define the name, metadata and descriptive text for the registered workflow

 Finding a Registered Workflow
 Summary: User wants to run a registered workflow created by their colleague. User needs to find this workflow.
 Details: The user may want to find the workflow by
 Name, metadata or unique identifier
 A URL shared by the creator of the registered workflow
 Looking inside a “project” that contains all the available shared workflows

 Specifying Inputs to a Registered Workflow
 Summary: User wants to use the WB to submit a registered workflow made by their colleague to run on Arvados.
 Details: The user wants to:
 Specify the inputs for the registered workflow to use
 Use the default inputs (if provided)
 Identify the project the workflow should “run” in (i.e. where the outputs, logs and other collections should be stored)
 See details about which registered workflow they are running to help guide them to provide the proper input (e.g. registered workflow name, description, etc.)

 Examining A Running Workflow
 Summary: User notices that their workflow has been running a long time and wants to check up on it. They would like to figure out if the workflow is hung or is progressing along. If it is progressing, the user may want to try to figure out why the workflow is running slower than expected.
 Details: The user may want to examine the following on WB
 Real time logs 
 Resources used 
 Steps run so far and their outputs/inputs 
 Real-time values of RAM, CPU usage, etc. (* Not currently available in WB)

  Rerunning Old Workflows   
 Summary: A User has successfully run a workflow on Arvados using the command line and would like to rerun that old workflow.  
 (*Not Currently Available in WB) 
 Details: User may want to specify:
 New Inputs 
 New Resource Requirements 
 New Docker Container 

  Canceling a Running or Queued Workflow 
 Summary: User has submitted a workflow to Arvados via the command line and realizes that they accidentally used the wrong inputs or an outdated function that the workflow leverages. They want to cancel the workflow before it wastes time and resources.
 Details: User would like to
 Find the running workflow on WB
 Have an easy (single-button) way to cancel the workflow on WB
 Have a way to verify that the workflow was canceled

 Checking Workflow Inputs or Requirements
 Summary: User is examining output results of a workflow and wants to remind themselves which input parameter they used for the model they ran in the workflow.  
 Details:    User would like to 
 Trace the output collection to the process that created it (i.e. container)
 Find and examine the inputs to the workflow or workflow step run in that container