Feature #8381
closed
bcbio support for variant calling in CWL
Description
Add support for variant calling with bcbio to CWL generation. We currently support parallel alignment; adding variant calling would enable GATK best-practice pipelines and VarDict/somatic integration, in coordination with current work from Tom and Sally.
Requirements:
- Batching of samples to allow pooled or tumor/normal calling. These batches need to be represented in CWL.
- Parallel runs of batches across genomic regions with subsequent merging.
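As a rough illustration of the batching requirement, the sketch below groups sample records by a shared batch name so that a tumor/normal pair or a pooled set can be passed to a caller together. The record fields (`name`, `batch`, `phenotype`) and the function `group_into_batches` are hypothetical names chosen for this example, not bcbio's actual data structures or its CWL representation.

```python
from collections import defaultdict


def group_into_batches(samples):
    """Group sample records by batch name, preserving input order.

    `samples` is a list of dicts with hypothetical keys "name", "batch"
    and "phenotype"; the real bcbio/CWL records may look different.
    """
    batches = defaultdict(list)
    for sample in samples:
        batches[sample["batch"]].append(sample)
    return dict(batches)


# Illustrative input: one tumor/normal pair and one single-sample batch.
samples = [
    {"name": "T1", "batch": "pair1", "phenotype": "tumor"},
    {"name": "N1", "batch": "pair1", "phenotype": "normal"},
    {"name": "S3", "batch": "pool1", "phenotype": "germline"},
]
batches = group_into_batches(samples)
```

Each value in `batches` is the list of records that would be called jointly; in CWL this grouping would be carried as an array of records rather than a Python dict.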
Updated by Brad Chapman almost 9 years ago
- Status changed from New to In Progress
Progress so far:
- bcbio batches samples together into groups (tumor/normal or family calling) using CWL records.
- Submitted PR to cwltool that enables grouping after discussion with Peter: https://github.com/common-workflow-language/cwltool/pull/40
- We can split batches based on genomic regions to run in parallel.
- Variant calling runs in parallel on defined regions.
- Merge variant calls back into single VCF.
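The split, parallel call, and merge steps above follow a scatter/gather pattern, which can be sketched in a few lines. This is a simplified illustration under stated assumptions, not bcbio's implementation: regions are fixed-size half-open chunks, calls are plain `(contig, position, ref, alt)` tuples instead of VCF records, and the names `split_regions` and `merge_calls` are hypothetical.

```python
def split_regions(contig_lengths, chunk_size):
    """Return (contig, start, end) half-open chunks covering every contig."""
    regions = []
    for contig, length in contig_lengths.items():
        for start in range(0, length, chunk_size):
            regions.append((contig, start, min(start + chunk_size, length)))
    return regions


def merge_calls(per_region_calls, contig_order):
    """Combine per-region call lists into one list sorted by contig, then position.

    Each call is a hypothetical (contig, position, ref, alt) tuple.
    """
    rank = {contig: i for i, contig in enumerate(contig_order)}
    merged = [call for calls in per_region_calls for call in calls]
    return sorted(merged, key=lambda call: (rank[call[0]], call[1]))


# Scatter: made-up contig lengths split into chunks of 100 bases.
contigs = {"chr1": 250, "chr2": 100}
regions = split_regions(contigs, 100)

# Gather: per-region results may arrive in any order and are sorted on merge.
per_region_calls = [
    [("chr2", 10, "A", "T")],
    [("chr1", 150, "G", "C"), ("chr1", 120, "T", "A")],
]
merged = merge_calls(per_region_calls, list(contigs))
```

Sorting on merge means the parallel region jobs can finish in any order while the combined output stays in reference order, as a single VCF would require.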
Remaining to-do steps, which could become a new story:
- Post-calling filtering of VCFs
- Additional VCF annotations for effects (snpEff)
Updated by Brad Chapman almost 9 years ago
- Status changed from In Progress to Resolved
Finalized post-call filtering of VCFs. I am punting on snpEff annotations for now, since linking in all of the associated snpEff data files is a bit more complex.
This puts us in a position to have a more complete demonstration CWL with variant calling for #8176. I will put together documentation and a new example file as part of that story.
This allows us to move on to validation (#8382), which we could schedule for the next sprint.