Bug #8497

closed

[Data Manager] Small batch size makes it slow to process collections

Added by Joshua Randall about 8 years ago. Updated about 8 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: Keep
Target version:
Story points: 0.5

Description

datamanager queries the API server for collections in batches of 50. This is very slow.

Performance data on our system (running with the fix from 8485, as otherwise we can't fetch all the collections):

$ time ./datamanager -dry-run &> /tmp/datamanager-dry-run-50.log

real    72m30.514s
user    6m30.243s
sys     0m43.171s

Changing one line in datamanager.go from 'BatchSize: 50' to 'BatchSize: 1000' results in:

$ time ./datamanager -dry-run &> /tmp/datamanager-dry-run-1000.log

real    12m57.729s
user    5m16.569s
sys     0m28.488s
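
To put the two wall-clock ("real") figures side by side, here is a small sketch that computes the ratio between them. The helper name speedup is made up for this illustration and is not part of datamanager; only the two durations come from the runs above. It works out to roughly a 5.6x reduction in wall-clock time.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// speedup is a hypothetical helper (not part of datamanager). It parses two
// wall-clock durations and returns how many times longer the first one is.
func speedup(before, after string) (float64, error) {
	b, err := time.ParseDuration(before)
	if err != nil {
		return 0, err
	}
	a, err := time.ParseDuration(after)
	if err != nil {
		return 0, err
	}
	if a <= 0 {
		return 0, errors.New("after duration must be positive")
	}
	return b.Seconds() / a.Seconds(), nil
}

func main() {
	// "real" times reported above for BatchSize 50 and BatchSize 1000.
	ratio, err := speedup("72m30.514s", "12m57.729s")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("wall-clock speedup: %.2fx\n", ratio)
}
```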

I'd suggest raising the BatchSize as much as possible (or making it a configuration parameter).
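
As a rough sketch of the "configuration parameter" option, the batch size could be read from a command-line flag instead of being hard-coded. Everything named here is an assumption for illustration: the -batch-size flag, the parseBatchSize and batchCount helpers, and the collection count of 1,000,000 do not come from datamanager or from this report, and this is not necessarily how the eventual fix was implemented.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// parseBatchSize reads a hypothetical -batch-size flag from args.
// The default of 1000 mirrors the value tried in this report.
func parseBatchSize(args []string) (int, error) {
	fs := flag.NewFlagSet("datamanager", flag.ContinueOnError)
	size := fs.Int("batch-size", 1000, "collections to request per API call")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	if *size < 1 {
		return 0, fmt.Errorf("batch-size must be at least 1, got %d", *size)
	}
	return *size, nil
}

// batchCount returns how many API requests are needed to page through
// total collections when each request returns at most batchSize of them.
func batchCount(total, batchSize int) int {
	if total <= 0 || batchSize <= 0 {
		return 0
	}
	return (total + batchSize - 1) / batchSize // ceiling division
}

func main() {
	size, err := parseBatchSize(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	const total = 1000000 // illustrative collection count, not from the report
	fmt.Printf("batch size %d: %d requests\n", size, batchCount(total, size))
}
```

Fewer, larger requests mean fewer round trips to the API server, which is consistent with the timings above, although the report does not break down where the time is actually spent.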


Subtasks 1 (0 open, 1 closed)

Task #8520: Review PR #41 | Resolved | Radhika Chippada | 02/23/2016
