Public Pipelines and Datasets » History » Version 2

Nancy Ouyang, 03/28/2015 05:26 AM
first draft

{{>toc}}
WARNING: WORK-IN-PROGRESS

h1. Public Pipelines

Bcbio-nextgen -- Brad Chapman

Bcbio-nextgen is a Python toolkit providing best-practice pipelines for fully automated high-throughput sequencing analysis.

Source Documentation: https://bcbio-nextgen.readthedocs.org/en/latest/
Arvados Documentation: https://arvados.org/projects/arvados/wiki/Bcbio-nextgen_tutorial
Link to Arvados Project: https://workbench.qr1hi.arvadosapi.com/projects/qr1hi-j7d0g-a45rn6qhjszct1d

GATK 2 Unified Genotyper

The Genome Analysis Toolkit (GATK) is a software package developed at the Broad Institute to analyze high-throughput sequencing data. In this pipeline, raw reads are processed and the GATK UnifiedGenotyper then calls SNPs and indels.

Source Documentation: http://gatkforums.broadinstitute.org/discussion/17/gatk-2-0-announcement
Arvados Documentation: https://arvados.org/projects/arvados/wiki/GATK2_tutorial
Link to Arvados Project: https://workbench.qr1hi.arvadosapi.com/projects/qr1hi-j7d0g-iodpr5gc65cp4t0

GATK 3 Haplotype Caller

The Genome Analysis Toolkit (GATK) is a software package developed at the Broad Institute to analyze high-throughput sequencing data. In this pipeline, raw reads are processed, then the GATK HaplotypeCaller calls SNPs and indels simultaneously via local re-assembly of haplotypes in an active region.

Source Documentation: https://www.broadinstitute.org/gatk/blog?tag=gatk3
Arvados Documentation: https://arvados.org/projects/arvados/wiki/GATK3_tutorial
Link to Arvados Project: https://workbench.qr1hi.arvadosapi.com/projects/qr1hi-j7d0g-l1emsbyyx6wyjdh

lobSTR 3

lobSTR is a tool for profiling Short Tandem Repeats (STRs) from high-throughput sequencing data.

Source Documentation: http://melissagymrek.com/lobstr-code/
Arvados Documentation: https://arvados.org/projects/arvados/wiki/LobSTR_tutorial
Link to Arvados Project: https://workbench.qr1hi.arvadosapi.com/projects/qr1hi-j7d0g-up6qgpqz5ie2vfq

Platypus

Platypus is a tool designed for efficient and accurate variant detection in high-throughput sequencing data. It can detect SNPs, MNPs, short indels, replacements and (using the assembly option) deletions up to several kb.

Source Documentation: http://www.well.ox.ac.uk/platypus
Arvados Documentation: https://arvados.org/projects/arvados/wiki/Platypus_tutorial
Link to Arvados Project: https://workbench.qr1hi.arvadosapi.com/projects/qr1hi-j7d0g-xcntt7rtz251isk

PathoMap -- Ancestry Mapper -- Chris Mason

PathoMap is a research project by Weill Cornell Medical College to study the microbiome and metagenome of the built environment of NYC. The Ancestry Mapper pipeline, one of the pipelines in the PathoMap suite, takes human high-throughput sequencing data and creates an ancestry map.

Source Documentation: http://www.pathomap.org/about/
Arvados Documentation: https://arvados.org/projects/arvados/wiki/pathomap_tutorial/
Link to Arvados Project: https://workbench.qr1hi.arvadosapi.com/projects/qr1hi-j7d0g-662ij1pcw6bj8uj

Tuxedo Suite (BTC)

The Tuxedo Suite (Bowtie, TopHat, Cufflinks) takes high-throughput RNA sequencing data, assembles transcripts, and estimates their relative abundances without using a reference annotation.

Source Documentation:
http://bowtie-bio.sourceforge.net/bowtie2/index.shtml
http://ccb.jhu.edu/software/tophat/index.shtml
http://cole-trapnell-lab.github.io/cufflinks/
Arvados Documentation: https://arvados.org/projects/arvados/wiki/Tuxedo_tutorial
Link to Arvados Project: https://workbench.qr1hi.arvadosapi.com/projects/qr1hi-j7d0g-ziejtm5kaik7ndl

VCF-compare

Compares positions in two VCF files and outputs concordance and discordance rates.

Source Documentation: http://vcftools.sourceforge.net/perl_module.html#vcf-compare
Arvados Documentation: https://arvados.org/projects/arvados/wiki/Vcf-compare_tutorial
Link to Arvados Project: https://workbench.qr1hi.arvadosapi.com/projects/qr1hi-j7d0g-qhf708xb2s7cow2
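
To make position-level concordance concrete, here is a minimal Python sketch. It is not how vcf-compare itself is implemented. The helper names (@read_positions@, @position_concordance@), the two tiny VCF snippets, and the choice to report rates as a fraction of all positions seen in either file are assumptions made for illustration.

```python
import io


def read_positions(vcf_lines):
    """Collect (CHROM, POS) pairs from VCF text, skipping header and blank lines."""
    positions = set()
    for line in vcf_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        chrom, pos = line.split("\t")[:2]
        positions.add((chrom, int(pos)))
    return positions


def position_concordance(lines_a, lines_b):
    """Count shared and file-specific positions and derive simple rates.

    Assumption: rates are fractions of the union of positions in both files.
    """
    a = read_positions(lines_a)
    b = read_positions(lines_b)
    shared = a & b
    total = len(a | b)
    return {
        "shared": len(shared),
        "only_a": len(a - b),
        "only_b": len(b - a),
        "concordance": len(shared) / total if total else 0.0,
        "discordance": (total - len(shared)) / total if total else 0.0,
    }


# Two made-up VCF bodies (hypothetical data, for illustration only).
VCF_A = (
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\n"
    "1\t100\t.\tA\tG\n"
    "1\t200\t.\tC\tT\n"
    "2\t50\t.\tG\tA\n"
)
VCF_B = (
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\n"
    "1\t100\t.\tA\tG\n"
    "2\t50\t.\tG\tC\n"
    "2\t75\t.\tT\tC\n"
)

result = position_concordance(io.StringIO(VCF_A), io.StringIO(VCF_B))
print(result)
```

The sketch compares positions only, so the two records at chromosome 2, position 50 count as shared even though their ALT alleles differ. Real comparisons on large files would usually go through the pipeline linked above rather than a script like this.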

h2. All the tutorials

Each of the tutorials listed below introduces the following Arvados features:

* How to run the pipeline using Arvados
* How to access your pipeline results
* How to browse and select your input data, then re-run the pipeline

https://arvados.org/projects/arvados/wiki/Platypus_tutorial
https://arvados.org/projects/arvados/wiki/gatk2_tutorial
https://arvados.org/projects/arvados/wiki/gatk3_tutorial
https://arvados.org/projects/arvados/wiki/bcbio-nextgen_tutorial
https://arvados.org/projects/arvados/wiki/Tuxedo_tutorial
https://arvados.org/projects/arvados/wiki/Pathomap_tutorial

h1. Public Datasets