Arvados Summit Fall 2013 Breakout 1 » History » Version 1
Jonathan Sheffi, 10/25/2013 02:43 PM
h1. Arvados Summit Fall 2013 Breakout 1

h2. User stories (Jonathan & Ward facilitating)

* As an admin, if I change my DB structure, I want Arvados to help me update the config.
* As an admin, I want to see the mapping of another dataset to my own.
* When I run a job, I want to be able to designate the results as Draft or Final/Real.
* As a consumer of genomic data, I want to visualize my data.
* As a commercial leader of a clinical lab, I want to be able to trace quote to cash for diagnostic tests.
* I want to be able to know where any file is.
* As a patient or participant, I want to be able to export my data to another study.
* As someone who works with data, I want the genotypic and phenotypic data I use to conform to a standard ontology.
* As a clinician, I want to quantify the uncertainty of the data & analysis underlying my report, so that the patient and I understand the clinical decision more fully.
* As a clinician, I want to run the same experiment on multiple data sets.
* As a lab director and oncologist, I want the path from raw exome reads to called variants to take 15 minutes.
* As a data miner, I want to be able to query *all* public data without downloading it.
* As a researcher, I want to be able to set up a standard pipeline for a particular type of data set.
* As an informatician, I want all my data to conform to a standard format so that I can analyze across multiple data sets.
* As a clinician, I want to collect & track inbound case data, such as referral letters, ICD-9 diagnosis codes, case summaries, consents, medical reports, and insurance pre-verifications.
* As an informatician, I want to be able to track & manage ICD-9/10 data.
* As a lab director or clinician, I want to share a report with a clinician at another institution.
* As a clinician, if I discover a mutation, I want to share that with an analytical tool or aggregator of data (e.g. GeneInsight).
* As a user, I want to associate ‘keepalive’ metadata with my intermediate data.
* As Arvados, I record profiling information on which expiration of intermediate data can be based.
* As an informatician, I can easily manipulate VCF files in parallel (as easily as with GNU parallel).
* As a compliance officer, I have structured insight into the consents for my data.
* As a researcher, I want to be able to collaborate on big datasets without having to copy them.
* As an informatician, I want to associate metadata with (a section of) my pipelines.
* As a new user, I can browse pipelines for metadata and see how ‘popular’ datasets and pipelines are [‘social features’].

h2. Technical discussion (Tom facilitating)

* Test for functionality
* Documentation
** What can Keep do?
** High-level functional description
** How would one replace an existing storage system with Keep?
** How to migrate?
** How to MapReduce?
** Examples
* Databases as input to job
* Permissions
* Audit trail
* Prioritizing jobs - squeaky wheel
* Monitoring - activity & status
* Checkpointing
* Self-starter kit