Public Pipelines and Datasets » History » Revision 3

Revision 2 (Nancy Ouyang, 03/28/2015 05:26 AM) → Revision 3/6 (Radhika Chippada, 06/09/2015 03:30 PM)

 {{>toc}} 
 WARNING: WORK-IN-PROGRESS 

 h1. Public Pipelines 

 h2. Bcbio-nextgen 

 By: Brad Chapman 

 Bcbio-nextgen is a Python toolkit providing best-practice pipelines for fully automated high-throughput sequencing analysis. 

 p(. Source Documentation: https://bcbio-nextgen.readthedocs.org/en/latest/ 

 p(. 
 Arvados Documentation: https://arvados.org/projects/arvados/wiki/Bcbio-nextgen_tutorial 

 p(. 
 Link to Arvados Project: https://workbench.qr1hi.arvadosapi.com/projects/qr1hi-j7d0g-a45rn6qhjszct1d 

 h2. GATK 2 Unified Genotyper 

 The Genome Analysis Toolkit (GATK) is a software package developed at the Broad Institute to analyze high-throughput sequencing data. Raw reads are processed, and then the GATK UnifiedGenotyper calls SNPs and indels. 

 p(. Source Documentation: http://gatkforums.broadinstitute.org/discussion/17/gatk-2-0-announcement 

 p(. 
 Arvados Documentation: https://arvados.org/projects/arvados/wiki/GATK2_tutorial 

 p(. 
 Link to Arvados Project: https://workbench.qr1hi.arvadosapi.com/projects/qr1hi-j7d0g-iodpr5gc65cp4t0 

 h2. GATK 3 Haplotype Caller 

 The Genome Analysis Toolkit (GATK) is a software package developed at the Broad Institute to analyze high-throughput sequencing data. Raw reads are processed, and then the GATK HaplotypeCaller calls SNPs and indels simultaneously via local re-assembly of haplotypes in an active region. 

 p(. Source Documentation: https://www.broadinstitute.org/gatk/blog?tag=gatk3 

 p(. 
 Arvados Documentation: https://arvados.org/projects/arvados/wiki/GATK3_tutorial 

 p(. 
 Link to Arvados Project: https://workbench.qr1hi.arvadosapi.com/projects/qr1hi-j7d0g-l1emsbyyx6wyjdh 

 h2. lobSTR 3 

 lobSTR is a tool for profiling Short Tandem Repeats (STRs) from high-throughput sequencing data. 

 p(. Source Documentation: http://melissagymrek.com/lobstr-code/ 

 p(. 
 Arvados Documentation: https://arvados.org/projects/arvados/wiki/LobSTR_tutorial 

 p(. 
 Link to Arvados Project: https://workbench.qr1hi.arvadosapi.com/projects/qr1hi-j7d0g-up6qgpqz5ie2vfq 

 h2. Platypus 

 Platypus is a tool designed for efficient and accurate variant detection in high-throughput sequencing data. Platypus can detect SNPs, MNPs, short indels, replacements, and (using the assembly option) deletions up to several kb. 

 p(. Source Documentation: http://www.well.ox.ac.uk/platypus 

 p(. 
 Arvados Documentation: https://arvados.org/projects/arvados/wiki/Platypus_tutorial 

 p(. 
 Link to Arvados Project: https://workbench.qr1hi.arvadosapi.com/projects/qr1hi-j7d0g-xcntt7rtz251isk 

 h2. Pathomap -- Ancestry Mapper 

 By: Chris Mason 

 PathoMap is a research project by Weill Cornell Medical College to study the microbiome and metagenome of the built environment of NYC. The Ancestry Mapper pipeline is one of the pipelines in the PathoMap suite. It takes human high-throughput sequencing data and creates an ancestry map. 

 p(. Source Documentation: http://www.pathomap.org/about/ 

 p(. 
 Arvados Documentation: https://arvados.org/projects/arvados/wiki/pathomap_tutorial/ 

 p(. 
 Link to Arvados Project: https://workbench.qr1hi.arvadosapi.com/projects/qr1hi-j7d0g-662ij1pcw6bj8uj 

 h2. Tuxedo Suite (BTC) 

 The Tuxedo Suite (Bowtie, TopHat, Cufflinks) takes high-throughput RNA sequencing data, assembles transcripts, and estimates their relative abundances without using a reference annotation. 

 p(. Source Documentation: 

 p(((. 
 http://bowtie-bio.sourceforge.net/bowtie2/index.shtml 

 p(((.  
 http://ccb.jhu.edu/software/tophat/index.shtml 

 p(((. 
 http://cole-trapnell-lab.github.io/cufflinks/ 

 p(. 
 Arvados Documentation: https://arvados.org/projects/arvados/wiki/Tuxedo_tutorial 

 p(. 
 Link to Arvados Project: https://workbench.qr1hi.arvadosapi.com/projects/qr1hi-j7d0g-ziejtm5kaik7ndl 

 h2. VCF-compare 

 Compares positions in two VCF files and outputs concordance and discordance rates. 

 p(. Source Documentation: http://vcftools.sourceforge.net/perl_module.html#vcf-compare 

 p(. 
 Arvados Documentation: https://arvados.org/projects/arvados/wiki/Vcf-compare_tutorial 

 p(. 
 Link to Arvados Project: https://workbench.qr1hi.arvadosapi.com/projects/qr1hi-j7d0g-qhf708xb2s7cow2 
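
 As a rough illustration of what a position-level comparison involves, the sketch below collects the (CHROM, POS) pairs from two sets of VCF lines and reports how many sites are shared. It is not the vcf-compare implementation. The function names @read_sites@ and @compare_sites@ are hypothetical, and defining concordance as shared sites divided by all sites seen in either file is an assumption made for this example; vcf-compare may report its rates differently.

```python
from typing import Dict, Iterable, Set, Tuple

Site = Tuple[str, int]  # (chromosome, 1-based position)


def read_sites(lines: Iterable[str]) -> Set[Site]:
    """Collect (CHROM, POS) pairs from VCF text lines, skipping header lines."""
    sites: Set[Site] = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # blank line, meta-information, or column header
        chrom, pos = line.split("\t")[:2]
        sites.add((chrom, int(pos)))
    return sites


def compare_sites(a: Set[Site], b: Set[Site]) -> Dict[str, float]:
    """Summarize site overlap between two call sets.

    Assumption for this sketch: concordance = shared / union,
    and discordance = 1 - concordance.
    """
    shared = len(a & b)
    total = len(a | b)
    concordance = shared / total if total else 0.0
    return {
        "shared": shared,
        "only_a": len(a - b),
        "only_b": len(b - a),
        "concordance": concordance,
        "discordance": (1.0 - concordance) if total else 0.0,
    }


# Tiny in-memory example; real input would be lines read from two VCF files.
vcf_a = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT",
    "chr1\t100\t.\tA\tG",
    "chr1\t200\t.\tC\tT",
    "chr2\t50\t.\tG\tA",
]
vcf_b = [
    "#CHROM\tPOS\tID\tREF\tALT",
    "chr1\t100\t.\tA\tG",
    "chr2\t50\t.\tG\tA",
    "chr2\t75\t.\tT\tC",
]

summary = compare_sites(read_sites(vcf_a), read_sites(vcf_b))
print(summary)
```

 The sketch compares positions only. It ignores alleles and genotypes, so two files that disagree on the called allele at the same position still count as concordant here.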

 h2. All the tutorials 

 Each of the tutorials listed below introduces these Arvados features: 

 * How to run the pipeline using Arvados 
 * How to access your pipeline results 
 * How to browse and select your input data and re-run the pipeline 

 p(. https://arvados.org/projects/arvados/wiki/Platypus_tutorial 

 p(. 
 https://arvados.org/projects/arvados/wiki/gatk2_tutorial 

 p(. 
 https://arvados.org/projects/arvados/wiki/gatk3_tutorial 

 p(. 
 https://arvados.org/projects/arvados/wiki/bcbio-nextgen_tutorial 

 p(. 
 https://arvados.org/projects/arvados/wiki/Tuxedo_tutorial 

 p(. 
 https://arvados.org/projects/arvados/wiki/Pathomap_tutorial 


 h1. Public Datasets (TBD)