Bug #18026

Updated by Ward Vandewege over 2 years ago

 
 This is from https://dev.arvados.org/issues/17755#note-26: 
 ===================================================================================================== 

 I think the first two failures are due to a race condition between two crunch-run processes that both try to convert and cache the Singularity image. 

 This attempt found the collection, but apparently before the cached image had been created (which is not supposed to happen). 


 > 2021-08-06T21:26:16.608212665Z Using Docker image id "sha256:337550d506a3fc77e30292bba95108f1cd34a33719f0dd997d0de0540522def7" 
 > 2021-08-06T21:26:16.608240987Z Loading Docker image from keep 
 > 2021-08-06T21:26:17.188261275Z building singularity image 
 > 2021-08-06T21:26:17.223285265Z [singularity build /tmp/crunch-run.tordo-dz642-pez87oegh5fgbs7.099258167/keep207354666/by_uuid/tordo-4zz18-t0wx4utpwx4ligf/image.sif docker-archive:///tmp/crunch-run-singularity-917581112/image.tar] 
 > 2021-08-06T21:26:26.466105517Z INFO:      Starting build... 
 > 2021-08-06T21:26:26.466105517Z FATAL:     While performing build: conveyor failed to get: Error loading tar component 337550d506a3fc77e30292bba95108f1cd34a33719f0dd997d0de0540522def7.json: open /tmp/crunch-run-singularity-917581112/image.tar: no such file or directory 
 > 2021-08-06T21:26:26.466234171Z error in Run: While loading container image: exit status 255 
 > 2021-08-06T21:26:26.466268708Z error in CaptureOutput: error scanning files to copy to output: lstat "/var/spool/cwl": lstat /tmp/crunch-run.tordo-dz642-pez87oegh5fgbs7.099258167/tmp701045228: no such file or directory 
 > 2021-08-06T21:26:26.605452965Z Cancelled 


 On the second attempt, it tried to create a collection with the same temporary name (down to the exact timestamp?) and that failed. 


 > 2021-08-06T21:26:47.149336231Z Executing container 'tordo-dz642-amjt50vnz4qyn4n' 
 > ... 
 > 2021-08-06T21:26:47.972965997Z error in Run: While loading container image: error creating 'singularity image for sha256:337550d506a3fc77e30292bba95108f1cd34a33719f0dd997d0de0540522def7 2021-08-06T21:26:47Z' collection: request failed: https://tordo.arvadosapi.com/arvados/v1/collections: 422 Unprocessable Entity: //railsapi.internal/arvados/v1/collections: 422 Unprocessable Entity: #<ActiveRecord::RecordNotUnique: PG::UniqueViolation: ERROR:    duplicate key value violates unique constraint "index_collections_on_owner_uuid_and_name" 
 > 2021-08-06T21:26:47.972965997Z DETAIL:    Key (owner_uuid, name)=(tordo-j7d0g-7p82g804nk5l7gx, singularity image for sha256:337550d506a3fc77e30292bba95108f1cd34a33719f0dd997d0de0540522def7 2021-08-06T21:26:47Z) already exists. 
 > 2021-08-06T21:26:47.972965997Z : INSERT INTO "collections" ("owner_uuid", "created_at", "modified_by_user_uuid", "modified_at", "portable_data_hash", "updated_at", "uuid", "manifest_text", "name", "properties", "delete_at", "file_names", "trash_at", "current_version_uuid") VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14) RETURNING "id"> (req-8t57dqc95orqsvelydce) 
 > 2021-08-06T21:26:48.136224600Z Cancelled 
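
 To illustrate the second failure, here is a minimal sketch of how a name built from the image id plus a timestamp truncated to whole seconds can collide when two processes reach the create step within the same second. This is not crunch-run code: `cache_collection_name`, `UniqueNameIndex`, and the sub-second times are hypothetical stand-ins, and only the name format, image id, and owner UUID are taken from the log above.

```python
from datetime import datetime, timezone

# Values copied from the log excerpt above.
IMAGE_ID = "sha256:337550d506a3fc77e30292bba95108f1cd34a33719f0dd997d0de0540522def7"
OWNER_UUID = "tordo-j7d0g-7p82g804nk5l7gx"


def cache_collection_name(image_id, now):
    # Hypothetical helper: same shape as the name in the log, i.e. the
    # image id followed by a UTC timestamp with one-second resolution.
    return "singularity image for %s %s" % (
        image_id, now.strftime("%Y-%m-%dT%H:%M:%SZ"))


class UniqueNameIndex:
    """Hypothetical stand-in for the (owner_uuid, name) unique index."""

    def __init__(self):
        self._keys = set()

    def insert(self, owner_uuid, name):
        key = (owner_uuid, name)
        if key in self._keys:
            raise KeyError("duplicate key value: %r" % (key,))
        self._keys.add(key)


def simulate():
    index = UniqueNameIndex()
    # Assumed timing: two processes 800 ms apart, within the same second.
    t1 = datetime(2021, 8, 6, 21, 26, 47, 100000, tzinfo=timezone.utc)
    t2 = datetime(2021, 8, 6, 21, 26, 47, 900000, tzinfo=timezone.utc)
    name1 = cache_collection_name(IMAGE_ID, t1)
    name2 = cache_collection_name(IMAGE_ID, t2)
    index.insert(OWNER_UUID, name1)
    try:
        index.insert(OWNER_UUID, name2)
        collided = False
    except KeyError:
        collided = True
    return name1, name2, collided
```

 Under these assumptions both calls produce the same name, so the second insert is rejected, which matches the shape of the 422 response in the log. Whether the real collision happened this way is still a guess.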
