Bug #12593

Unable to resolve localhost in GATK4 log4j setup

Added by Brad Chapman over 6 years ago. Updated over 6 years ago.

Status:
Closed
Priority:
Normal
Assigned To:
-
Category:
-
Target version:
-
Story points:
-

Description

I'm running into an issue running bcbio workflows with GATK4, where they fail in a localhost lookup when setting up logging. This is an example run with a failed job:

https://cloud.curoverse.com/container_requests/qr1hi-xvhdp-r6aq106llhpnfg5#Log

Erroring out with:

Using GATK jar /usr/local/share/bcbio-nextgen/anaconda/share/gatk4-4.0b6-0/gatk-package-4.beta.6-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 -Xms500m -Xmx45864m -Djava.io.tmpdir=/var/spool/cwl/bcbiotx/tmpJVk4FE -jar /usr/local/share/bcbio-nextgen/anaconda/share/gatk4-4.0b6-0/gatk-package-4.beta.6-local.jar BaseRecalibratorSpark -I /keep/4261647a740c6eb75fea9494de0bf667+5868/NA12878-sort.bam --sparkMaster local[16] --output /var/spool/cwl/bcbiotx/tmpJVk4FE/NA12878-sort-recal.grp --reference /keep/38a3166acddf30ff581c249ece68e7f5+47411/collections/hg38/ucsc/hg38.2bit --conf spark.local.dir=/var/spool/cwl/bcbiotx/tmpJVk4FE --knownSites /keep/349e8c8ef6d90edc7a4a43153d160950+2339/dbsnp-147.vcf.gz -L /keep/0211312cfad709cd84f418f8749671a5+1388/bedprep/Exome-AZ_V2_pluschr20-hg38.bed --interval_set_rule INTERSECTION
ERROR Could not determine local host name java.net.UnknownHostException: 8b29e178684a: 8b29e178684a: Temporary failure in name resolution
at java.net.InetAddress.getLocalHost(InetAddress.java:1505)
at org.apache.logging.log4j.core.util.NetUtils.getLocalHostname(NetUtils.java:53)
at org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:486)
at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:562)
at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:578)
at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:214)
at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:145)
at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:41)
at org.apache.logging.log4j.LogManager.getContext(LogManager.java:182)
at org.apache.logging.log4j.LogManager.getLogger(LogManager.java:455)
at org.broadinstitute.hellbender.utils.Utils.<clinit>(Utils.java:72)
at org.broadinstitute.hellbender.Main.<clinit>(Main.java:43)
Caused by: java.net.UnknownHostException: 8b29e178684a: Temporary failure in name resolution
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
at java.net.InetAddress.getLocalHost(InetAddress.java:1500)

Thanks for any tips or tricks to avoid the issue.

Actions #1

Updated by Nico César over 6 years ago

I see that the current version of log4j does a try/catch on that particular line (it looks like Android devices fail too):

https://logging.apache.org/log4j/2.0/log4j-core/apidocs/src-html/org/apache/logging/log4j/core/LoggerContext.html#line.539

(with a different exception BTW: https://issues.apache.org/jira/browse/LOG4J2-719 )

but the config variable seems to be named "hostName" and, according to http://logging.apache.org/log4j/2.x/manual/lookups.html#Log4jConfigLookup, this is retrieved as ${log4j:hostName}. I wonder if appending -Dlog4j.hostName=localhost to the execution will make log4j use localhost as the hostname.

Actions #2

Updated by Tom Clegg over 6 years ago

I suspect this is a side effect of disabling networking in the container environment, which Arvados does by default.

I tried re-running this job with networking enabled (by adding "API":true to runtime_constraints; can be done via arv:APIRequirement: {} in cwl) and it seems to have progressed well past this failure point.

FWIW, a newer (current?) version of log4j seems to warn about this situation and continue, rather than crashing. https://logging.apache.org/log4j/2.x/log4j-core/apidocs/src-html/org/apache/logging/log4j/core/util/NetUtils.html
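To illustrate the "warn and continue" behaviour described above, here is a minimal Python sketch of a hostname lookup that falls back to a default instead of failing. It is not the log4j code; the function name `local_hostname`, the `default` value, and the injectable `resolver` parameter are assumptions made for illustration.

```python
import socket


def local_hostname(default="localhost", resolver=socket.gethostbyname):
    """Return the local hostname if it resolves, otherwise a fallback.

    `default` and `resolver` are hypothetical parameters for this sketch;
    `resolver` is injectable so the failure path can be exercised without
    actually disabling networking.
    """
    name = socket.gethostname()
    try:
        # In a container without networking, this lookup is the step
        # that would fail with a name resolution error.
        resolver(name)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        print(f"WARN could not resolve {name!r}: {exc}; using {default!r}")
        return default
    return name
```

With a resolver that always raises, the function logs a warning and returns the fallback rather than propagating the exception.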

Actions #3

Updated by Brad Chapman over 6 years ago

Nico and Tom -- thanks so much for digging into this. This is really helpful. log4j is bundled with the GATK jar, so I don't have an easy way to update it, but it's good to know that a newer version would handle this better at some point.

Is it possible to enable networking from CWL right now? That seems like the fastest workaround if it's possible. If not, I could explore adjusting log4j.hostName on the GATK command line.

Actions #4

Updated by Brad Chapman over 6 years ago

Sorry, I shouldn't read this in a meeting: add `arv:APIRequirement` to the `requirements`. Got it. I'll give that a try and report back. Thanks again.

Actions #5

Updated by Brad Chapman over 6 years ago

  • Status changed from New to Closed

Nico and Tom -- thanks again for the help. The `arv:APIRequirement` trick seems to have done it and I can now progress past the point I was failing. I've added this into bcbio CWL generation and will keep working on getting the pipeline running. Thank you again for the help.
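For anyone generating CWL programmatically, a rough sketch of adding the requirement to a tool description held as a Python dict might look like the following. This is not the actual bcbio code; the helper name `with_api_requirement` is hypothetical, the list form of `requirements` is assumed, and the `arv` namespace URL is an assumption that should be checked against the Arvados CWL documentation.

```python
import copy

# Assumption: namespace URL for Arvados CWL extensions.
ARV_NAMESPACE = "http://arvados.org/cwl#"


def with_api_requirement(tool):
    """Return a copy of a CWL tool dict with arv:APIRequirement added.

    Hypothetical helper: assumes `requirements` is a list of
    {"class": ...} entries, and leaves the input dict unmodified.
    """
    out = copy.deepcopy(tool)
    out.setdefault("$namespaces", {})["arv"] = ARV_NAMESPACE
    reqs = out.setdefault("requirements", [])
    # Avoid adding the requirement twice if it is already present.
    if not any(r.get("class") == "arv:APIRequirement" for r in reqs):
        reqs.append({"class": "arv:APIRequirement"})
    return out
```

Calling it on a dict such as `{"class": "CommandLineTool", "requirements": []}` returns a new dict with the extra requirement; calling it again on the result leaves the list unchanged.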
