Bug #11238

job_task creation fails with ApiError - HttpError 422 - ActiveRecord::StatementInvalid: PG::InternalError: ERROR: invalid memory alloc request size 1718630765

Added by Joshua Randall 9 months ago. Updated 4 months ago.

Status: New
Priority: Normal
Assigned To: -
Category: -
Target version:
Start date: 03/10/2017
Due date:
% Done: 0%
Estimated time:
Story points: -

Description

For some reason, some of our jobs are failing with an API server error indicating that a PostgreSQL statement was invalid because it tried to allocate 1718630765 bytes (about 1.7 GB) of memory in a single request!?
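To put the number in context, here is a small sketch that converts the reported size and compares it with PostgreSQL's ordinary per-allocation cap. As far as I know that cap (MaxAllocSize) is 1 GiB minus one byte, and the "invalid memory alloc request size" error is what PostgreSQL raises when a single request exceeds it. The helper name describe_alloc_request is made up for illustration.

```python
# Assumption: PostgreSQL's regular allocator rejects any single request
# larger than MaxAllocSize, which is 0x3FFFFFFF (1 GiB - 1 byte).
MAX_ALLOC_SIZE = 0x3FFFFFFF


def describe_alloc_request(size_bytes: int) -> dict:
    """Summarise an allocation request size taken from the error message."""
    return {
        "gb": round(size_bytes / 10**9, 2),    # decimal gigabytes
        "gib": round(size_bytes / 2**30, 2),   # binary gibibytes
        "exceeds_max_alloc": size_bytes > MAX_ALLOC_SIZE,
    }


# Size copied from the error in the job log.
info = describe_alloc_request(1718630765)
print(info)
```

If that assumption holds, the request was refused outright rather than failing because the machine ran out of RAM, which would suggest a bogus length value somewhere rather than a genuinely huge row.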

From a job log:

2017-03-10_12:43:38 z8ta6-8i9sb-1u981ftt6rzlgbb 52742 151 stderr arvados.errors.ApiError: <HttpError 422 when requesting https://api.arvados.sanger.ac.uk/arvados/v1/job_tasks?alt=json returned "#<ActiveRecord::StatementInvalid: PG::InternalError: ERROR:  invalid memory alloc request size 1718630765

History

#1 Updated by Joshua Randall 9 months ago

This appears to be the corresponding entry in the API server's production.log (our PostgreSQL logs do not appear to mention the issue):

[api.arvados.sanger] [42516ad406ffb11b68c6b7c720679436] WARNING: Can't verify CSRF token authenticity
[api.arvados.sanger] [42516ad406ffb11b68c6b7c720679436] #<ActiveRecord::StatementInvalid: PG::InternalError: ERROR:  invalid memory alloc request size 1718630765
[api.arvados.sanger] [42516ad406ffb11b68c6b7c720679436] /var/www/arvados-api/shared/vendor_bundle/ruby/2.1.0/gems/activerecord-3.2.22.5/lib/active_record/connection_adapters/postgresql_adapter.rb:1176:in `get_last_result'
[api.arvados.sanger] [42516ad406ffb11b68c6b7c720679436] Error 1489149789+85807e44: 422
[api.arvados.sanger] [42516ad406ffb11b68c6b7c720679436] {"method":"POST","path":"/arvados/v1/job_tasks","format":"json","controller":"arvados/v1/job_tasks","action":"create","status":422,"duration":32.56,"view":0.45,"db":18.46,"params":{"parameters":{"inputs":"5c41dcf66d94aba8b7c1553c39a4520c+23030","name":"55_of_200","interval":"chr5:91121972-109108305","interval_list":"a65feae6f5f8dc407f422586aed7dc26+96","ref":"9a15a1d495a6efa0f3e05f1e851b694e+2227","reuse_job_task":"z8ta6-ot0gb-vi32h3s1a61u01e"},"success":true,"sequence":2,"finished_at":"2017-03-09T21:32:59.000000000Z","created_by_job_task_uuid":"z8ta6-ot0gb-bovzdgts2cmm8i2","progress":1.0,"output":"d2f2c39e81e617b139da9f6ea5cf581a+3511+A8e3cc4d710260d397eea250fdb048f58f0a53c8b@58d43f08","started_at":"2017-03-09T19:07:16.000000000Z","job_uuid":"z8ta6-8i9sb-1u981ftt6rzlgbb","alt":"json","job_task":{"job_uuid":"z8ta6-8i9sb-1u981ftt6rzlgbb","sequence":2,"parameters":{"inputs":"5c41dcf66d94aba8b7c1553c39a4520c+23030","name":"55_of_200","interval":"chr5:91121972-109108305","interval_list":"a65feae6f5f8dc407f422586aed7dc26+96","ref":"9a15a1d495a6efa0f3e05f1e851b694e+2227","reuse_job_task":"z8ta6-ot0gb-vi32h3s1a61u01e"},"output":"d2f2c39e81e617b139da9f6ea5cf581a+3511+A8e3cc4d710260d397eea250fdb048f58f0a53c8b@58d43f08","progress":1.0,"success":true,"created_by_job_task_uuid":"z8ta6-ot0gb-bovzdgts2cmm8i2","started_at":"2017-03-09T19:07:16.000000000Z","finished_at":"2017-03-09T21:32:59.000000000Z"}},"@timestamp":"2017-03-10T12:43:09Z","@version":"1","message":"[422] POST /arvados/v1/job_tasks (arvados/v1/job_tasks#create)"}
 

#2 Updated by Joshua Randall 9 months ago

  • Subject changed from ApiError - HttpError 422 - ActiveRecord::StatementInvalid: PG::InternalError: ERROR: invalid memory alloc request size 1718630765 to job_task creation fails with ApiError - HttpError 422 - ActiveRecord::StatementInvalid: PG::InternalError: ERROR: invalid memory alloc request size 1718630765

#3 Updated by Joshua Randall 9 months ago

arvados_production=> select count(*) from jobs;
 count
-------
 49870
(1 row)

arvados_production=> select count(*) from job_tasks;
  count
---------
 4931305
(1 row)

#4 Updated by Joshua Randall 9 months ago

I've followed the directions at https://blog.dob.sk/2012/05/19/fixing-pg_dump-invalid-memory-alloc-request-size/ to check the job_tasks table for bad rows, but the check didn't find any. I guess other tables may be joined in by whatever queries run during job task creation?
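For anyone wanting to repeat the check on other tables, the idea is roughly a row-by-row scan: read each row on its own and record the ones that make the server raise an error. Below is a minimal sketch of that idea, not the exact procedure from the blog post. The function find_bad_rows and the fetch_row callback are hypothetical names, and the commented psycopg2 usage assumes the table has an integer id column.

```python
from typing import Callable, Iterable, List


def find_bad_rows(row_ids: Iterable[int],
                  fetch_row: Callable[[int], object]) -> List[int]:
    """Return the ids for which reading the row raises an error."""
    bad = []
    for row_id in row_ids:
        try:
            # Reading the full row forces the server to expand every column,
            # which is where a corrupt length value would be expected to fail.
            fetch_row(row_id)
        except Exception:
            bad.append(row_id)
    return bad


# Hypothetical fetch_row for a real database (assumes psycopg2, an open
# cursor `cur` on a connection with autocommit enabled so that one failed
# query does not abort the ones after it, and an integer `id` column):
#
# def fetch_row(row_id):
#     cur.execute("SELECT * FROM job_tasks WHERE id = %s", (row_id,))
#     cur.fetchall()


# Self-contained demonstration with a fake fetcher that fails on two ids.
def fake_fetch(row_id: int) -> dict:
    if row_id in (3, 7):
        raise RuntimeError("invalid memory alloc request size 1718630765")
    return {"id": row_id}


bad_ids = find_bad_rows(range(1, 10), fake_fetch)
print(bad_ids)
```

With nearly five million rows in job_tasks, a one-query-per-row scan like this will be slow; scanning in id ranges first and only going row by row inside a failing range would cut the number of queries.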

#5 Updated by Tom Morris 4 months ago

  • Target version set to Arvados Future Sprints
