Jonathan Sheffi, 10/25/2013 02:43 PM

h1. Arvados Summit Fall 2013 Breakout 1

h2. User stories (Jonathan & Ward facilitating)

* As an admin, if I change my DB structure, I want Arvados to help me update the config
* As an admin, I want to see the mapping of another dataset to my own
* When I run a job, I want to be able to treat the results as Draft or Final/Real
* As a consumer of genomic data, I want to visualize my data
* As a commercial leader of a clinical lab, I want to be able to trace quote to cash for diagnostic tests
* I want to be able to know where any file is.
* As a patient or participant, I want to be able to export my data to another study.
* As someone who works with data, I want the genotypic and phenotypic data I use to conform to a standard ontology.
* As a clinician, I want to quantify the uncertainty of the data & analysis underlying my report, so that I and the patient understand the clinical decision more fully.
* As a clinician, I want to run the same experiment on multiple data sets.
* As a lab director and oncologist, I want to go from exome raw reads to called variants in 15 minutes.
* As a data miner, I want to be able to query *all* public data without downloading it.
* As a researcher, I want to be able to set up a standard pipeline for a particular type of data set.
* As an informatician, I want all my data to conform to a standard format so that I can analyze across multiple data sets.
* As a clinician, I want to collect & track inbound case data, such as referral letters, ICD-9 diagnosis codes, case summaries, consents, medical reports, and insurance pre-verifications.
* As an informatician, I want to be able to track & manage ICD-9/10 data.
* As a lab director or clinician, I want to share a report with another clinician at another institution.
* As a clinician, if I discover a mutation, I want to share that with an analytical tool or aggregator of data (e.g. GeneInsight).
* As a user, I want to associate ‘keepalive’ metadata to my intermediate data
* As Arvados, I record profiling information on which data expiration for intermediate data can be based
* As an informatician, I can easily manipulate VCF files in parallel (as easily as with GNU parallel)
* As a compliance officer, I have structured insight into the consents for my data
* As a researcher, I want to be able to collaborate on big datasets without having to copy them.
* As an informatician, I want to associate metadata with (a section of) my pipelines.
* As a new user, I can browse pipelines for metadata, see how ‘popular’ datasets and pipelines are [‘social features’]

h2. Technical discussion (Tom facilitating)

* Test for functionality
* Documentation
** What can Keep do?
** High-level functional description
** How would one replace an existing storage system with Keep?
** How to migrate?
** How to MapReduce?
** Examples
* Databases as input to job
* Permissions
* Audit trail
* Prioritizing jobs - squeaky wheel
* Monitoring - activity & status
* Checkpointing
* Self-starter kit