Bug #4309

open

[SDK] arv-copy collection copy performance

Added by Peter Amstutz over 9 years ago. Updated about 2 months ago.

Status: New
Priority: Normal
Assigned To: -
Category: -
Target version:
Story points: 2.0
Release:
Release relationship: Auto

Description

Originally https://arvados.org/issues/3699#note-43

I'm copying a pipeline with a 5990M collection. I noticed this code:

                    data = src_keep.get(word)
                    dst_locator = dst_keep.put(data)

See the attached image: throughput clearly falls off between blocks, because each block is fully downloaded before it is uploaded. Doing this sequentially isn't optimal; download and upload could proceed concurrently. We might also get better utilization by transferring multiple blocks at a time (e.g. 2x down / 2x up), talking to multiple Keep servers. Consider a producer-consumer pattern using Python queues.
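A minimal sketch of what the producer-consumer approach might look like, assuming only that the source and destination clients expose get(locator) and put(data) as in the snippet above. The names copy_blocks, max_buffered, and FakeKeep are hypothetical and not part of arv-copy or the Arvados SDK; FakeKeep is an in-memory stand-in so the example runs without a cluster.

```python
import hashlib
import queue
import threading

_DONE = object()  # sentinel: the producer has no more blocks to hand over


class FakeKeep:
    """In-memory stand-in for a Keep client (hypothetical, for illustration)."""

    def __init__(self):
        self.store = {}

    def put(self, data):
        locator = "%s+%d" % (hashlib.md5(data).hexdigest(), len(data))
        self.store[locator] = data
        return locator

    def get(self, locator):
        return self.store[locator]


def copy_blocks(locators, src_keep, dst_keep, max_buffered=2):
    """Copy blocks so that download N+1 overlaps with upload N.

    Returns the destination locators in the same order as `locators`.
    """
    # A bounded queue caps how many downloaded blocks sit in memory at once.
    blocks = queue.Queue(maxsize=max_buffered)
    errors = []

    def producer():
        try:
            for locator in locators:
                blocks.put(src_keep.get(locator))
        except Exception as exc:  # surfaced to the caller after the loop
            errors.append(exc)
        finally:
            blocks.put(_DONE)

    # Daemon thread: if the upload side raises, a producer blocked on a
    # full queue will not keep the process alive.
    downloader = threading.Thread(target=producer, daemon=True)
    downloader.start()

    dst_locators = []
    while True:
        data = blocks.get()
        if data is _DONE:
            break
        dst_locators.append(dst_keep.put(data))

    downloader.join()
    if errors:
        raise errors[0]
    return dst_locators
```

This overlaps one download stream with one upload stream. The "2x down / 2x up" idea would need several producer and consumer threads plus some bookkeeping to keep the output locators in manifest order, which this sketch does not attempt.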


Files

arv-copy-perf.png (25.8 KB) Peter Amstutz, 10/24/2014 05:22 PM

#1

Updated by Peter Amstutz over 9 years ago

#2

Updated by Peter Amstutz over 9 years ago

  • Description updated (diff)

#3

Updated by Peter Amstutz over 9 years ago

  • Description updated (diff)

#4

Updated by Peter Amstutz over 9 years ago

  • Subject changed from [SDK] arv-copy performance to [SDK] arv-copy collection copy performance
  • Description updated (diff)

#5

Updated by Radhika Chippada over 9 years ago

  • Target version set to Bug Triage

#6

Updated by Tom Clegg over 9 years ago

  • Target version changed from Bug Triage to Arvados Future Sprints

#7

Updated by Ward Vandewege over 9 years ago

  • Story points set to 2.0

#8

Updated by Ward Vandewege almost 3 years ago

  • Target version deleted (Arvados Future Sprints)

#9

Updated by Peter Amstutz about 1 year ago

  • Release set to 60

#10

Updated by Peter Amstutz about 2 months ago

  • Target version set to Future