Story #2622

Datamanager outputs garbage collection list

Added by Misha Zatsman over 7 years ago. Updated over 7 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Misha Zatsman
Category:
-
Start date:
04/16/2014
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
1.0

Description

The Python datamanager will output a CSV file with the following columns:

block uuid, latest mtime, disk size, cumulative size, disk free

Each row is a block that exists on Keep but that no one has persisted. Rows are sorted by increasing mtime. The columns are:
block uuid: The id of the block we want to delete
latest mtime: The latest mtime of the block across all Keep servers.
disk size: The total disk space used by this block (block size multiplied by current replication level)
cumulative size: The sum of this block's disk size and the disk sizes of all the blocks listed above it
disk free: The proportion of our disk space that would be free if we deleted this block and all the blocks above it. Deleting blocks returns their space, so this is (current free disk space + cumulative size) / total disk capacity
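
The column definitions above can be sketched in Python as follows. This is a minimal illustration, not the datamanager's actual code: the Block record, the write_gc_report function, and the free_bytes and capacity_bytes parameters are all hypothetical names. It also assumes that deleting a block adds its disk size back to the free space, so the cumulative size is added to the current free space before dividing by capacity.

```python
import csv
import io
from typing import Iterable, NamedTuple, TextIO


class Block(NamedTuple):
    """Hypothetical record for one block that no one has persisted."""
    uuid: str
    latest_mtime: int   # latest mtime of the block across all Keep servers
    size: int           # size of a single copy, in bytes
    replication: int    # current replication level


def write_gc_report(blocks: Iterable[Block], free_bytes: int,
                    capacity_bytes: int, out: TextIO) -> None:
    """Write one CSV row per block, oldest mtime first."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["block uuid", "latest mtime", "disk size",
                     "cumulative size", "disk free"])
    cumulative = 0
    for block in sorted(blocks, key=lambda b: b.latest_mtime):
        # Disk size counts every replica of the block.
        disk_size = block.size * block.replication
        cumulative += disk_size
        # Assumption: deleting blocks frees their space, hence the "+".
        disk_free = (free_bytes + cumulative) / capacity_bytes
        writer.writerow([block.uuid, block.latest_mtime, disk_size,
                         cumulative, disk_free])


if __name__ == "__main__":
    # Made-up example data, only to show the call shape.
    report = io.StringIO()
    write_gc_report(
        [Block("b2", 200, 10, 2), Block("b1", 100, 5, 2)],
        free_bytes=50, capacity_bytes=100, out=report)
    print(report.getvalue())
```

Because rows are sorted by mtime before the running total is taken, reading down the file shows how much space would be reclaimed by deleting every block up to and including a given row.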


Subtasks

Task #2690: Output garbage collection list (Resolved, Misha Zatsman)

Task #2689: Move datamanager to experimental directory and submit (Resolved, Misha Zatsman)

Associated revisions

Revision a0a7b1a0 (diff)
Added by Misha Zatsman over 7 years ago

Added printing of garbage collection report to CSV file. Fixed bug in free disk space computation. Closes #2622

History

#1 Updated by Misha Zatsman over 7 years ago

  • Tracker changed from Bug to Story

#2 Updated by Misha Zatsman over 7 years ago

  • Project changed from Arvados Private to Arvados

#3 Updated by Misha Zatsman over 7 years ago

  • Target version set to 2014-05-07 Storing and Organizing Data

#4 Updated by Misha Zatsman over 7 years ago

  • Story points set to 1.0

#5 Updated by Misha Zatsman over 7 years ago

  • Description updated (diff)

#6 Updated by Misha Zatsman over 7 years ago

  • Status changed from New to Resolved
