Project

General

Profile

Actions

Bug #20561

closed

Log when files from input are being propagated to output in crunch-run finalization

Added by Peter Amstutz 9 months ago. Updated 9 months ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
Crunch
Target version:
Story points:
0.5
Release relationship:
Auto

Description

"Maximum container memory rss usage"

then nothing for almost 2 hours, then finishes up with

"copying /file.txt (200000 bytes)"
"maximum keepstore memory rss"
...
Completed

On further investigation.

The output collection has ~4400 files, but except for the one file that was reported as being copied, it looks like these are staged to an intermediate collection and then made to appear in the output directory, and then propagated to the output collection.

So it seems like it is doing something that causes it to iterate over each of the 4400 files, it only needs to take 1.5s to process each file for that to add up to nearly two hours.

The input consists of an array of 4400 files, each file is pulled from a different collection, so I think what is happening is that it is sequentially fetching 4400 collections with manifest text.

Things to do:

  1. Log that this is happening (print out each file being added)
  2. We don't actually need these files in the output at all, we should support a regex filter on what gets collected for the output collection and don't upload or propagate files that the user doesn't want. There's actually a really old ticket for this! #9964

Subtasks 1 (0 open1 closed)

Task #20562: Review 20561-file-copy-loggingResolvedPeter Amstutz05/30/2023Actions

Related issues

Related to Arvados - Feature #9964: arvados-cwl-runner limits output data to keep using output_globNewActions
Actions #1

Updated by Peter Amstutz 9 months ago

  • Status changed from New to In Progress
Actions #2

Updated by Peter Amstutz 9 months ago

  • Subject changed from Weird delay in crunch-run finalization to Unexplained delay in crunch-run finalization
Actions #3

Updated by Peter Amstutz 9 months ago

  • Description updated (diff)
Actions #4

Updated by Brett Smith 9 months ago

Is it possible during this time, arv-mount/local Keepstore is uploading data that was written out by the container process before exit?

Actions #5

Updated by Peter Amstutz 9 months ago

On further investigation.

The output collection has ~4400 files, but except for the one file that was reported as being copied, it looks like these are staged to an intermediate collection and then made to appear in the output directory, and then propagated to the output collection.

So it seems like it is doing something that causes it to iterate over each of the 4400 files, it only needs to take 1.5s to process each file for that to add up to nearly two hours.

The input consists of an array of 4400 files, each file is pulled from a different collection, so I think what is happening is that it is sequentially fetching 4400 collections with manifest text.

Actions #6

Updated by Peter Amstutz 9 months ago

  • Target version changed from Development 2023-06-07 to Future
  • Description updated (diff)
Actions #7

Updated by Peter Amstutz 9 months ago

  • Subject changed from Unexplained delay in crunch-run finalization to Log when files from input are being propagated to output in crunch-run finalization
Actions #8

Updated by Peter Amstutz 9 months ago

  • Story points set to 0.5
  • Target version changed from Future to Development 2023-06-07
Actions #9

Updated by Peter Amstutz 9 months ago

  • Assigned To set to Peter Amstutz
Actions #11

Updated by Peter Amstutz 9 months ago

  • Related to Feature #9964: arvados-cwl-runner limits output data to keep using output_glob added
Actions #12

Updated by Peter Amstutz 9 months ago

  • Description updated (diff)
Actions #13

Updated by Peter Amstutz 9 months ago

  • Category set to Crunch
Actions #15

Updated by Brett Smith 9 months ago

Peter Amstutz wrote in #note-14:

20561-file-copy-logging @ 790e737b8c020f7a339ca88ee6106ae157c465fe

Looks good to me, thanks.

Actions #16

Updated by Peter Amstutz 9 months ago

  • Release set to 64
Actions #17

Updated by Peter Amstutz 9 months ago

  • Status changed from In Progress to Resolved
Actions

Also available in: Atom PDF