GATK Queue support » History » Version 2

Brett Smith, 03/10/2016 03:53 PM

h1. GATK Queue support

h2. Use

We have code that integrates GATK Queue with Arvados by creating a child job for each work unit that GATK Queue spawns.  You invoke it following the usual GATK Queue documentation; just make sure you're using a jar that includes the Arvados integration.

For example, to run IndelRealigner with GATK Queue using run-command, your @command@ parameter will look like:
<pre><code class="javascript">["java", "-jar", "$(dir $(queue))/Queue.jar", "--script", "$(dir $(scala_script))/ArvadosIndelRealigner.scala", more...]
</code></pre>

The GATK Queue integration creates and monitors child jobs for the chunks of work that GATK Queue organizes.  It reads a *script parameter* named @runtime_constraints@ to get the runtime constraints for each GATK component.  (Do *not* put these in the pipeline component's main @runtime_constraints@; they may confuse Arvados.)  The relevant part of your pipeline template will look like this:
<pre><code class="javascript">"script_parameters": {
  "runtime_constraints": {
    "value": {
      "HaplotypeCaller":        { "min_cores_per_node": N1, more... },
      "RealignerTargetCreator": { "min_cores_per_node": N2, more... },
      "IndelRealigner":         { "min_cores_per_node": N3, more... },
      "CatVariants":            { "min_cores_per_node": N4, more... },
      "MergeSamFiles":          { "min_cores_per_node": N5, more... },
      "GenotypeGVCFs":          { "min_cores_per_node": N6, more... },
      "SelectVariants":         { "min_cores_per_node": N7, more... },
      "VariantFiltration":      { "min_cores_per_node": N8, more... },
      "CombineVariants":        { "min_cores_per_node": N9, more... }
    }
  },
  more...
}
</code></pre>
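If you generate pipeline templates from a script, the fragment above can be assembled and sanity-checked before submission. The following is a minimal Python sketch, not part of the integration: the helper name @build_runtime_constraints@ and the core counts are hypothetical, and rejecting unlisted names is a choice made here to catch typos early, not documented behavior of the integration. The nine component names are the ones shown in the template above.

```python
import json

# Component names taken from the template on this page.
SUPPORTED_COMPONENTS = frozenset([
    "HaplotypeCaller", "RealignerTargetCreator", "IndelRealigner",
    "CatVariants", "MergeSamFiles", "GenotypeGVCFs",
    "SelectVariants", "VariantFiltration", "CombineVariants",
])


def build_runtime_constraints(per_component):
    """Return the script_parameters entry holding per-component constraints.

    per_component maps a GATK component name to its constraint dict.
    Hypothetical helper: names outside SUPPORTED_COMPONENTS raise
    ValueError so that a misspelling is noticed before the job runs.
    """
    unknown = sorted(set(per_component) - SUPPORTED_COMPONENTS)
    if unknown:
        raise ValueError("unrecognized component(s): " + ", ".join(unknown))
    # Same nesting as the template: runtime_constraints -> value -> component.
    return {"runtime_constraints": {"value": dict(per_component)}}


if __name__ == "__main__":
    # Placeholder core counts, for illustration only.
    params = build_runtime_constraints({
        "HaplotypeCaller": {"min_cores_per_node": 4},
        "IndelRealigner": {"min_cores_per_node": 2},
    })
    print(json.dumps({"script_parameters": params}, indent=2))
```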
Not every component is relevant to every job, so you don't need to specify each one; the template above lists every component that our GATK Queue integration currently supports and recognizes.  If you don't specify runtime constraints for a component, the GATK Queue integration won't set any either, so that component's jobs will usually get the smallest node size.

h2. Development

"Source is on GitHub":https://github.com/curoverse/gatk-protected

Build process:

# Install Maven, dependencies, and plug-ins: @sudo aptitude install openjdk-7-jdk maven libmaven-jar-plugin-java libmaven-shared-jar-java libmaven-compiler-plugin-java@
# Build and install the Arvados Java SDK.  In @arvados/sdk/java@, run: @mvn package -Dmaven.test.skip=true && mvn install -Dmaven.test.skip=true@
# Build GATK Queue.  In @gatk-protected@, run: @mvn package@

If the build succeeds, you'll have @target/Queue.jar@.  Use @arv keep put@ to store it in a collection where you need it.