Bug #20561

Updated by Peter Amstutz 10 months ago

"Maximum container memory rss usage" 

 then nothing for almost 2 hours, then finishes up with 

 "copying /file.txt (200000 bytes)" 
 "maximum keepstore memory rss" 
 ... 
 Completed 

 On further investigation:

 The output collection has ~4400 files. Except for the one file that was reported as being copied, it looks like they are staged to an intermediate collection, made to appear in the output directory, and then propagated to the output collection.

 So it seems to be doing something that iterates over each of the 4400 files. It only needs to take 1.5s per file for that to add up to nearly two hours.
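
 A quick check of that arithmetic, as a minimal sketch. The 4400 and 1.5s figures are the rough estimates above, not measured values:

```python
# Back-of-the-envelope check of the estimate above.
file_count = 4400        # approximate number of files in the output collection
seconds_per_file = 1.5   # assumed per-file cost (an estimate, not a measurement)

total_seconds = file_count * seconds_per_file
total_hours = total_seconds / 3600

print(f"{total_seconds:.0f} s = {total_hours:.2f} h")  # prints "6600 s = 1.83 h"
```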

 The input is an array of 4400 files, and each file is pulled from a different collection. So I think what is happening is that it is sequentially fetching 4400 collections, each with its manifest text.

 Things to do: 

 # Log that this is happening (print out each file being added) 
 # We don't actually _need_ these files in the output at all. We should support a regex filter on what gets collected for the output collection, and not upload or propagate files that the user doesn't want. (There was a really old ticket for this: #9964.)
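
 The two items above could look roughly like the sketch below. This is only an illustration of the idea, not the actual output-collection code: @select_output_files@, the logger name, and every example path except @file.txt@ are hypothetical.

```python
import logging
import re

# Hypothetical logger name, for illustration only.
logger = logging.getLogger("output_filter")


def select_output_files(paths, pattern):
    """Return the paths matching `pattern`, logging each file as it is handled."""
    regex = re.compile(pattern)
    kept = []
    for path in paths:
        if regex.search(path):
            # Item 1: log every file being added, so a long run is not silent.
            logger.info("adding %s to output collection", path)
            kept.append(path)
        else:
            # Item 2: files the user did not ask for are never uploaded or propagated.
            logger.debug("skipping %s (does not match %r)", path, pattern)
    return kept


# Hypothetical candidate paths; only file.txt appears in the log above.
candidates = ["file.txt", "staged/sample1.bam", "staged/sample2.bam", "logs/run.log"]
selected = select_output_files(candidates, r"\.(txt|log)$")
print(selected)  # prints ['file.txt', 'logs/run.log']
```

 With a filter like this, only the matching files would need to be fetched at all, which is where the time seems to be going.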
