Bug #11238

open

job_task creation fails with ApiError - HttpError 422 - ActiveRecord::StatementInvalid: PG::InternalError: ERROR: invalid memory alloc request size 1718630765

Added by Joshua Randall about 8 years ago. Updated about 1 year ago.

Status: New
Priority: Normal
Assigned To: -
Category: -
Target version:
Story points: -
Release:
Release relationship: Auto

Description

For some reason, some of our jobs are failing with an API server error indicating that a Postgres statement was invalid because it tried to allocate about 1.7 GB of memory (1718630765 bytes)!

From a job log:

2017-03-10_12:43:38 z8ta6-8i9sb-1u981ftt6rzlgbb 52742 151 stderr arvados.errors.ApiError: <HttpError 422 when requesting https://api.arvados.sanger.ac.uk/arvados/v1/job_tasks?alt=json returned "#<ActiveRecord::StatementInvalid: PG::InternalError: ERROR:  invalid memory alloc request size 1718630765
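The size in the error message can be unpacked a little further. The sketch below is only a rough sanity check, not a diagnosis: it converts the reported value to GiB, compares it with PostgreSQL's `MaxAllocSize` limit (0x3fffffff, one byte under 1 GiB, which is the threshold behind "invalid memory alloc request size"), and checks whether the four bytes of the value are all printable ASCII. The function name `describe_alloc_request` is made up for illustration. A printable result is at most a hint that text data may have been read as a length word; it does not prove corruption.

```python
# PostgreSQL's MaxAllocSize: palloc() rejects any single request above this.
MAX_ALLOC_SIZE = 0x3FFFFFFF  # 1 GiB - 1


def describe_alloc_request(size: int) -> dict:
    """Summarise a 32-bit alloc request size taken from an error message.

    Hypothetical helper for illustration; assumes 0 <= size < 2**32.
    """
    raw = size.to_bytes(4, "big")  # the value as four bytes, high byte first
    return {
        "gib": size / 2**30,
        "exceeds_max_alloc": size > MAX_ALLOC_SIZE,
        "hex": hex(size),
        # True when every byte falls in the printable ASCII range.
        "printable_ascii": all(0x20 <= b <= 0x7E for b in raw),
        # latin-1 maps each byte to one character, so this never fails.
        "as_text": raw.decode("latin-1"),
    }


if __name__ == "__main__":
    for key, value in describe_alloc_request(1718630765).items():
        print(f"{key}: {value}")
```

For the value in this report, the request is above `MaxAllocSize`, and its bytes (0x6670396d) all happen to be printable ASCII characters.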

Actions #1

Updated by Joshua Randall about 8 years ago

These appear to be the corresponding log entries from the API server's production.log (our Postgres logs do not appear to mention the issue at all):

[api.arvados.sanger] [42516ad406ffb11b68c6b7c720679436] WARNING: Can't verify CSRF token authenticity
[api.arvados.sanger] [42516ad406ffb11b68c6b7c720679436] #<ActiveRecord::StatementInvalid: PG::InternalError: ERROR:  invalid memory alloc request size 1718630765
[api.arvados.sanger] [42516ad406ffb11b68c6b7c720679436] /var/www/arvados-api/shared/vendor_bundle/ruby/2.1.0/gems/activerecord-3.2.22.5/lib/active_record/connection_adapters/postgresql_adapter.rb:1176:in `get_last_result'
[api.arvados.sanger] [42516ad406ffb11b68c6b7c720679436] Error 1489149789+85807e44: 422
[api.arvados.sanger] [42516ad406ffb11b68c6b7c720679436] {"method":"POST","path":"/arvados/v1/job_tasks","format":"json","controller":"arvados/v1/job_tasks","action":"create","status":422,"duration":32.56,"view":0.45,"db":18.46,"params":{"parameters":{"inputs":"5c41dcf66d94aba8b7c1553c39a4520c+23030","name":"55_of_200","interval":"chr5:91121972-109108305","interval_list":"a65feae6f5f8dc407f422586aed7dc26+96","ref":"9a15a1d495a6efa0f3e05f1e851b694e+2227","reuse_job_task":"z8ta6-ot0gb-vi32h3s1a61u01e"},"success":true,"sequence":2,"finished_at":"2017-03-09T21:32:59.000000000Z","created_by_job_task_uuid":"z8ta6-ot0gb-bovzdgts2cmm8i2","progress":1.0,"output":"d2f2c39e81e617b139da9f6ea5cf581a+3511+A8e3cc4d710260d397eea250fdb048f58f0a53c8b@58d43f08","started_at":"2017-03-09T19:07:16.000000000Z","job_uuid":"z8ta6-8i9sb-1u981ftt6rzlgbb","alt":"json","job_task":{"job_uuid":"z8ta6-8i9sb-1u981ftt6rzlgbb","sequence":2,"parameters":{"inputs":"5c41dcf66d94aba8b7c1553c39a4520c+23030","name":"55_of_200","interval":"chr5:91121972-109108305","interval_list":"a65feae6f5f8dc407f422586aed7dc26+96","ref":"9a15a1d495a6efa0f3e05f1e851b694e+2227","reuse_job_task":"z8ta6-ot0gb-vi32h3s1a61u01e"},"output":"d2f2c39e81e617b139da9f6ea5cf581a+3511+A8e3cc4d710260d397eea250fdb048f58f0a53c8b@58d43f08","progress":1.0,"success":true,"created_by_job_task_uuid":"z8ta6-ot0gb-bovzdgts2cmm8i2","started_at":"2017-03-09T19:07:16.000000000Z","finished_at":"2017-03-09T21:32:59.000000000Z"}},"@timestamp":"2017-03-10T12:43:09Z","@version":"1","message":"[422] POST /arvados/v1/job_tasks (arvados/v1/job_tasks#create)"}
 

Actions #3

Updated by Joshua Randall about 8 years ago

arvados_production=> select count(*) from jobs;
 count
-------
 49870
(1 row)

arvados_production=> select count(*) from job_tasks;
  count
---------
 4931305
(1 row)

Actions #4

Updated by Joshua Randall about 8 years ago

I've followed the directions at https://blog.dob.sk/2012/05/19/fixing-pg_dump-invalid-memory-alloc-request-size/ to check the job_tasks table for bad rows, but the check didn't find any. I guess other tables may be joined in by whatever queries run during job task creation?
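With close to 5 million rows in job_tasks, and possibly other tables to check, reading rows one at a time is slow. A bisection over id ranges is one way to narrow the search. The sketch below is not the method from the linked post, and it leaves out the database access: `find_bad_ids` and `probe` are hypothetical names, and it assumes the table has an integer `id` column. A real `probe(a, b)` might run a SELECT over rows with `a <= id < b` and return False if the statement raises an error; the exact query and driver are left to the reader.

```python
from typing import Callable, List


def find_bad_ids(probe: Callable[[int, int], bool], lo: int, hi: int) -> List[int]:
    """Return the ids in [lo, hi) whose rows cannot be read.

    probe(a, b) must return True when every row with a <= id < b reads
    cleanly, and False when reading that range fails.
    """
    if lo >= hi or probe(lo, hi):
        return []  # empty range, or the whole range reads without error
    if hi - lo == 1:
        return [lo]  # narrowed down to a single failing id
    mid = (lo + hi) // 2
    # Search both halves; a clean half costs only one probe call.
    return find_bad_ids(probe, lo, mid) + find_bad_ids(probe, mid, hi)


if __name__ == "__main__":
    # Stand-in for a real database probe: pretend two ids are unreadable.
    bad = {7, 4242}

    def fake_probe(a: int, b: int) -> bool:
        return not any(a <= i < b for i in bad)

    print(find_bad_ids(fake_probe, 0, 5_000_000))
```

If every range reads cleanly, the function returns an empty list, which would be consistent with the result reported above for job_tasks.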
