Project

General

Profile

Bug #21891

Updated by Peter Amstutz about 1 month ago

Reported very slow copying files from collections to collections in output phase (~15s per file). 

 Example log: 

 <pre> 
 2024-06-11T20:59:18.902328206Z copying "PT001379858_A.dedup.flagstat" from d75eba3924f3985128d7a2dbb7f59cf6+1786208/PT001379858_A.dedup.flagstat 
 2024-06-11T20:59:35.158068026Z copying "PT003617020_A.insertsizemetricalgo.metrics.txt" from d75eba3924f3985128d7a2dbb7f59cf6+1786208/PT003617020_A.insertsizemetricalgo.metrics.txt 
 2024-06-11T20:59:50.741935364Z copying "PT005899462_A.insertsizemetricalgo.metrics.txt" from d75eba3924f3985128d7a2dbb7f59cf6+1786208/PT005899462_A.insertsizemetricalgo.metrics.txt 
 </pre> 

 We've had other reports of performance issues with the copying phase where it is copying from collections in the past, but I assumed it was because it was loading collections from the API server, and that was slow. server slowly.    In this case, these are from the same manifest, so it should be using @manifestCache@ (unless that's not working, for some reason, but monitoring also doesn't indicate substantial network traffic, so it doesn't appear to be spending most of its time fetching stuff). 

 Maybe But maybe the problem is actually the efficiency of the @Extract()@ method? 

 Also, appending strings repeatedly the way it is being done here is a notorious anti-pattern -- unless Go strings have some special buffer behavior, string appends like this require allocating a new buffer and copying the whole string in order to append text to the end, which tends to dominate runtime as the string gets larger. 

 <pre> 
		 cp.logger.Printf("copying %q from %v/%v", outputRelPath, srcMount.PortableDataHash, strings.TrimPrefix(srcRelPath, "./")) 
		 mft, err := cp.getManifest(srcMount.PortableDataHash) 
		 if err != nil { 
			 return err 
		 } 
		 cp.manifest += mft.Extract(srcRelPath, dest).Text 
 </pre> 

Back