Bug #4049

[Crunch] typo in one of the names of a job for qr1hi-d1hrv-ze9iyb3ckerqasw causes last job to wait forever for an input. High server load ensues.

Added by Ward Vandewege over 9 years ago. Updated over 9 years ago.

Status: Resolved
Priority: Normal
Assigned To: -
Category: -
Target version: -
Story points: -

Description

Crunch-dispatch (the one in --jobs mode) is being ridiculous and keeps committing the same git commits. This also generates tons of useless entries in the logs table.

There are also absurd amounts of traffic from Workbench:

x.x.x.x - - [01/Oct/2014:02:25:09 +0000] "POST /arvados/v1/pipeline_templates HTTP/1.1" 200 41158 "-" "HTTPClient/1.0 (2.3.4.1, ruby 2.1.1 (2014-02-24))"
x.x.x.x - - [01/Oct/2014:02:25:09 +0000] "POST /arvados/v1/pipeline_instances HTTP/1.1" 200 129 "-" "HTTPClient/1.0 (2.3.4.1, ruby 2.1.1 (2014-02-24))"
x.x.x.x - - [01/Oct/2014:02:25:09 +0000] "POST /arvados/v1/collections HTTP/1.1" 200 678 "-" "HTTPClient/1.0 (2.3.4.1, ruby 2.1.1 (2014-02-24))"
x.x.x.x - - [01/Oct/2014:02:25:09 +0000] "POST /arvados/v1/links HTTP/1.1" 200 117 "-" "HTTPClient/1.0 (2.3.4.1, ruby 2.1.1 (2014-02-24))"
x.x.x.x - - [01/Oct/2014:02:25:09 +0000] "POST /arvados/v1/jobs HTTP/1.1" 200 116 "-" "HTTPClient/1.0 (2.3.4.1, ruby 2.1.1 (2014-02-24))"
x.x.x.x - - [01/Oct/2014:02:25:09 +0000] "POST /arvados/v1/collections HTTP/1.1" 200 643 "-" "HTTPClient/1.0 (2.3.4.1, ruby 2.1.1 (2014-02-24))"
x.x.x.x - - [01/Oct/2014:02:25:09 +0000] "POST /arvados/v1/links HTTP/1.1" 200 117 "-" "HTTPClient/1.0 (2.3.4.1, ruby 2.1.1 (2014-02-24))"
x.x.x.x - - [01/Oct/2014:02:25:09 +0000] "POST /arvados/v1/pipeline_templates HTTP/1.1" 200 8955 "-" "HTTPClient/1.0 (2.3.4.1, ruby 2.1.1 (2014-02-24))"
x.x.x.x - - [01/Oct/2014:02:25:09 +0000] "POST /arvados/v1/pipeline_templates HTTP/1.1" 200 8955 "-" "HTTPClient/1.0 (2.3.4.1, ruby 2.1.1 (2014-02-24))"
x.x.x.x - - [01/Oct/2014:02:25:10 +0000] "POST /arvados/v1/nodes HTTP/1.1" 200 65715 "-" "HTTPClient/1.0 (2.3.4.1, ruby 2.1.1 (2014-02-24))"
x.x.x.x - - [01/Oct/2014:02:25:10 +0000] "POST /arvados/v1/jobs/queue_size HTTP/1.1" 200 28 "-" "HTTPClient/1.0 (2.3.4.1, ruby 2.1.1 (2014-02-24))"
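
To put a number on "absurd", the requests can be tallied per endpoint. This is a minimal sketch, assuming the access log uses the combined format shown above; the function name `count_requests` and the `sample` list are illustrative only and are not part of Arvados.

```python
import re
from collections import Counter

# Matches the quoted request part of an access log line,
# e.g. "POST /arvados/v1/jobs HTTP/1.1".
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+"')


def count_requests(lines):
    """Return a Counter mapping (method, path) to the number of requests seen."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:  # silently skip lines that do not look like requests
            counts[(match.group("method"), match.group("path"))] += 1
    return counts


# Three lines copied from the excerpt above, used as sample input.
sample = [
    'x.x.x.x - - [01/Oct/2014:02:25:09 +0000] "POST /arvados/v1/pipeline_templates HTTP/1.1" 200 8955 "-" "HTTPClient/1.0 (2.3.4.1, ruby 2.1.1 (2014-02-24))"',
    'x.x.x.x - - [01/Oct/2014:02:25:09 +0000] "POST /arvados/v1/pipeline_templates HTTP/1.1" 200 8955 "-" "HTTPClient/1.0 (2.3.4.1, ruby 2.1.1 (2014-02-24))"',
    'x.x.x.x - - [01/Oct/2014:02:25:10 +0000] "POST /arvados/v1/nodes HTTP/1.1" 200 65715 "-" "HTTPClient/1.0 (2.3.4.1, ruby 2.1.1 (2014-02-24))"',
]

if __name__ == "__main__":
    # Print endpoints from most to least requested.
    for (method, path), n in count_requests(sample).most_common():
        print(n, method, path)
```

Running the same function over a full log file (for example, passing an open file object as `lines`) would show which endpoints dominate the traffic.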

Actions #1

Updated by Ward Vandewege over 9 years ago

  • Description updated (diff)
Actions #2

Updated by Ward Vandewege over 9 years ago

  • Description updated (diff)
Actions #3

Updated by Peter Amstutz over 9 years ago

The job stall caused by a misspelled output_of is already on the backlog as #3698.
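
For illustration, a misspelled output_of could be caught before any job is queued by checking that every reference names an existing component. This is only a sketch, not the fix tracked in #3698. It assumes components are a dictionary keyed by component name and that references appear as `{"output_of": "<name>"}` values under `script_parameters`. The function name and the component names below are hypothetical.

```python
def find_dangling_output_of(components):
    """Return (component, parameter, target) tuples whose output_of target
    does not name any component in the same pipeline."""
    problems = []
    for name, component in components.items():
        params = component.get("script_parameters", {})
        for param_name, value in params.items():
            # Only dict-valued parameters can carry an output_of reference.
            if isinstance(value, dict) and "output_of" in value:
                target = value["output_of"]
                if target not in components:
                    problems.append((name, param_name, target))
    return problems


# Hypothetical pipeline: "call_variants" refers to "algin" instead of "align".
components = {
    "align": {"script_parameters": {"reads": "some-collection"}},
    "call_variants": {"script_parameters": {"input": {"output_of": "algin"}}},
}

if __name__ == "__main__":
    for component, parameter, target in find_dangling_output_of(components):
        print(f"{component}.{parameter}: output_of refers to unknown component {target!r}")
```

A check along these lines would reject the pipeline up front instead of leaving the last job waiting forever for an input.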

Running git over and over is fixed in #3168.

#4004 includes fixes that reduce the number of API calls required to render the dashboard, but there is probably more work to be done there. We should check the pipeline instance page to see whether it needs similar improvements. We also briefly discussed on IRC making the dashboard reload interval a configuration parameter and lengthening the default a bit (maybe every 20 or 30 seconds instead of every 15).

Actions #4

Updated by Ward Vandewege over 9 years ago

  • Status changed from New to Resolved
  • Target version deleted (Bug Triage)

Thanks for that, Peter. Closing this bug since it is already covered by the three bugs you listed.
