Bug #17118

Updated by Peter Amstutz over 3 years ago

A user reported that arv-put would upload a directory of files and then sometimes hang before writing the collection. However, the checkpoint file had been written, so canceling the process and re-running arv-put would create the collection without waiting for a re-upload.

 Inspect the code and see if there are any places that seem vulnerable to a deadlock. 

 Here's the follow-up (https://support.curii.com/rt/Ticket/Display.html?id=119) 

 I would like to report a possible bug/improvement for the arv-put 
 command. We ran into some issues when using arv-put where it would die 
 silently without giving any output whatsoever. We have now traced it 
 to the fact that the arv-put cmd essentially runs out of memory (or 
 uses a huge amount of memory). 

 The setup: 
 1. A folder containing a number of files (< 1500) with a total folder 
 size of 145GB. This entire folder is to be uploaded into Arvados. 
 2. We run it via Gitlab as a Runner on a Virtual Machine with 16GB of RAM. 
 3. The arv-put cmd we use: 
 arv-put --no-follow-links --no-resume --exclude 'Thumbnail_Images/*' 
 --exclude done.txt --project-uuid arkau-j7d0g-6a3em925c3yvx9q --name 
 Overnight1 /isilon/nrd_hca/Overnight1/ 

 Output: 
 1. The script silently dies, no error message, no other output. 

 We have done extensive testing and checking and initially, the arv-put 
 cmd just died silently without giving any error message whatsoever. 
 After some digging, it turns out that arv-put cmd essentially eats up 
 all the memory on the machine and is then killed. We tried to change 
 it so that arv-put can only use 1 thread but the outcome is the same. 
 See the attached images for the output from 'top' when trying to 
 upload the 145GB folder. We have plans in the future to upload folders 
 with around 750GB of data and if arv-put cannot handle this or needs a 
 huge amount of memory to do this, we will need to reconsider our 
 workflows. 

 We have a couple of questions: 
 1. What is the relationship between the size of the folder to be 
 uploaded and the amount of memory arv-put will use? 
 2. Is there a way to estimate how much memory would be needed for a 
 certain folder/size of data? 
 3. Is there a way to make arv-put fail gracefully in cases like this? 
 4. If known, what is the reason that arv-put uses so much memory? 
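
 Regarding question 3: independent of any fix in arv-put itself, a calling script can at least surface this kind of silent death by inspecting the child's exit status. The sketch below is not part of arv-put, and `describe_exit` and `run_and_report` are hypothetical helper names. It relies only on the documented behaviour of Python's `subprocess` module on POSIX, where a negative return code means the child was terminated by that signal. A SIGKILL is consistent with the kernel OOM killer but does not prove it; the kernel log would have to confirm that.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a human-readable message."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, subprocess reports death-by-signal as a negative code.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            # Signal number without a named constant on this platform.
            name = f"signal {-returncode}"
        # SIGKILL is what the OOM killer sends, but other senders are possible.
        hint = " (possibly the kernel OOM killer)" if name == "SIGKILL" else ""
        return f"killed by {name}{hint}"
    return f"failed with exit status {returncode}"


def run_and_report(cmd):
    """Run cmd (a list of arguments), report how it ended, return its code."""
    result = subprocess.run(cmd)
    print(f"{cmd[0]}: {describe_exit(result.returncode)}", file=sys.stderr)
    return result.returncode
```

 A CI job could call `run_and_report([...])` with the same arv-put argument list as in the report above and fail the job with a visible message whenever the return code is non-zero, instead of ending with no output at all.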
