Bug #10932

closed

[arv-put] job resume too slow & uninformative

Added by Tom Morris almost 8 years ago. Updated almost 8 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: -
Target version:
Story points: 0.5

Description

Completed 97% of a 636959M upload (637GB) and then attempted to resume it after arv-put was killed. It took almost 3 hours (175 minutes) of 100% CPU for arv-put to calculate/confirm the restart point (which seemed to be only 45%, not 97%) before it began any additional uploading. During this time there was no user feedback other than "0M / 636959M 0.0%" for hours on end.
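To put the discrepancy in perspective, here is a rough back-of-the-envelope calculation. It assumes the 97% and 45% figures are exact fractions of the reported 636959M total, and that 45% really was the restart point (the report only says it "seemed" to be); the function name is purely illustrative.

```python
TOTAL_M = 636959        # total upload size as reported by arv-put, in "M"
RESUME_MINUTES = 175    # time spent at 100% CPU before uploading resumed


def megabytes_at(fraction, total_m=TOTAL_M):
    """Amount of data (in M) corresponding to a progress fraction."""
    return fraction * total_m


if __name__ == "__main__":
    uploaded = megabytes_at(0.97)   # progress before arv-put was killed
    restart = megabytes_at(0.45)    # apparent restart point (assumption)
    print("uploaded before kill: %.0fM" % uploaded)
    print("apparent restart at:  %.0fM" % restart)
    print("possibly re-uploaded: %.0fM" % (uploaded - restart))
    print("silent wait:          %.1f hours" % (RESUME_MINUTES / 60))
```

If those assumptions hold, roughly half of the total upload would have to be sent again on top of the silent wait.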

Some other, possibly relevant, details:

98 MB cache file in ~/.cache/arvados/ :

97913649 Jan 18 21:54 476acf1a6d4cfb9b630a13289c4a72be

of which ~7 MB is manifest text:

$ jq .manifest 476acf1a6d4cfb9b630a13289c4a72be | wc
1 240325 6861939

and the rest is info for the 478502 files:

$ jq '.files | length' 476acf1a6d4cfb9b630a13289c4a72be
478502
$ jq -r '.files[] | .size ' 476acf1a6d4cfb9b630a13289c4a72be | wc
478502 478502 3827704
$ jq -r '.files[] | .mtime ' 476acf1a6d4cfb9b630a13289c4a72be | wc
478502 478502 8928344
$ jq -r '.files | keys ' 476acf1a6d4cfb9b630a13289c4a72be | wc
478504 480984 69204103

That's 69 MB of file names, which are compressible to 1.4 MB due to high levels of redundancy:

$ jq -r '.files | keys ' 476acf1a6d4cfb9b630a13289c4a72be | gzip | wc
10074 39234 1442694
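
The same breakdown can be sketched in Python for anyone who wants to repeat it on their own cache file. This is only an illustration: it assumes the cache is a single JSON object with a "manifest" string and a "files" object keyed by file name, which is what the jq queries above imply. The function names are made up for this example and are not part of arv-put, and the byte counts will differ slightly from the jq | wc figures because jq adds JSON quoting and brackets.

```python
import gzip
import json


def summarize_cache(state):
    """Return a size breakdown of an arv-put-style resume cache dict.

    Assumes (from the jq queries in this report) a "manifest" string and a
    "files" mapping keyed by file name.
    """
    files = state.get("files", {})
    # Join the names with newlines so the raw and gzipped sizes are comparable.
    name_blob = "\n".join(sorted(files)).encode("utf-8")
    return {
        "file_count": len(files),
        "manifest_bytes": len((state.get("manifest") or "").encode("utf-8")),
        "name_bytes": len(name_blob),
        "name_bytes_gzipped": len(gzip.compress(name_blob)),
    }


def summarize_cache_file(path):
    """Load a cache file from disk and summarize it."""
    with open(path, "r", encoding="utf-8") as f:
        return summarize_cache(json.load(f))


if __name__ == "__main__":
    # Synthetic stand-in for a real cache file: many long, similar paths.
    demo = {
        "manifest": ". d41d8cd98f00b204e9800998ecf8427e+0 0:0:empty\n",
        "files": {
            "/data/run42/sample_%06d/reads.fastq.gz" % i: {
                "size": 1024,
                "mtime": 1484776440.0,
            }
            for i in range(1000)
        },
    }
    for key, value in summarize_cache(demo).items():
        print(key, value)
```

On a real cache, calling summarize_cache_file on the file under ~/.cache/arvados/ should show the same pattern as above: file names dominate the cache and compress very well.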


Subtasks 1 (0 open, 1 closed)

Task #11057: Review 10932-arvput-slow-resuming | Resolved | Lucas Di Pentima | 02/02/2017