Feature #16513


Get reference Keep performance numbers for Keep-on-S3

Added by Ward Vandewege almost 4 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assigned To: -
Category: -
Target version: -
Story points: -
Release relationship: Auto

Subtasks 1 (0 open, 1 closed)

Task #16528: review 16513-keep-exercise-improvements (Resolved, Ward Vandewege, 06/15/2020)

Related issues

Related to Arvados - Idea #10477: [keepstore] switch s3 driver from goamz to a more actively maintained client library (Resolved, Ward Vandewege, 11/08/2016)
Related to Arvados - Feature #16518: [keep] Allow clients to set a header to disable md5sum calculations in keepstore (New)
Related to Arvados - Feature #16519: [keepstore] optimize md5sum calculations (New)
Blocks Arvados Epics - Idea #16516: Run Keepstore on local compute nodes (Resolved, 10/01/2021 to 11/30/2021)
#1

Updated by Ward Vandewege almost 4 years ago

  • Related to Idea #16514: Actionable insight into keep usage added
#2

Updated by Ward Vandewege almost 4 years ago

  • Related to deleted (Idea #16514: Actionable insight into keep usage)
#3

Updated by Ward Vandewege almost 4 years ago

  • Blocks Idea #16516: Run Keepstore on local compute nodes added
#4

Updated by Ward Vandewege almost 4 years ago

e710f1b2da3095d6152ac7f6ed1ffab8bfc2c0c7 on branch 16513-keep-exercise-improvements is ready for review.

#5

Updated by Ward Vandewege almost 4 years ago

  • Target version set to 2020-06-17 Sprint
  • Status changed from New to In Progress
#6

Updated by Tom Clegg almost 4 years ago

I have a few nits / suggested improvements but you could ignore them and/or merge e710f1b in the meantime.

Repeating the expression float64(bytesOut) / elapsed.Seconds() / 1048576 is a bit crufty. Should probably compute that once as rateOut and then use it 3 times.

We probably don't need 2 different stats reporting formats. We could print the header line at start, then print a CSV row once every stats-interval plus one at the end.

Printing the final summary on SIGINT/SIGALRM would be a nice touch. (then "alarm 60 keep-exercise ..." would work well, fwiw)

endChan could be a Timer rather than a Ticker. context.WithDeadline() and <-ctx.Done() would be another way to do it.

If we send the CSV data to stdout and logs to stderr, we'll be more ... | tee stats.csv -friendly.

#7

Updated by Ward Vandewege almost 4 years ago

  • Target version changed from 2020-06-17 Sprint to 2020-07-01 Sprint
#8

Updated by Ward Vandewege almost 4 years ago

Tom Clegg wrote:

> I have a few nits / suggested improvements but you could ignore them and/or merge e710f1b in the meantime.
>
> Repeating the expression float64(bytesOut) / elapsed.Seconds() / 1048576 is a bit crufty. Should probably compute that once as rateOut and then use it 3 times.
>
> We probably don't need 2 different stats reporting formats. We could print the header line at start, then print a CSV row once every stats-interval plus one at the end.
>
> Printing the final summary on SIGINT/SIGALRM would be a nice touch. (then "alarm 60 keep-exercise ..." would work well, fwiw)
>
> endChan could be a Timer rather than a Ticker. context.WithDeadline() and <-ctx.Done() would be another way to do it.
>
> If we send the CSV data to stdout and logs to stderr, we'll be more ... | tee stats.csv -friendly.

I've implemented everything in cba1b4145e8fcc57a851839f77fd020e5aaff722, ready for another look.

#9

Updated by Tom Clegg almost 4 years ago

LGTM @ a5a6111e3, thanks!

#10

Updated by Ward Vandewege almost 4 years ago

Arvados version: 2.0.2; AWS VPC with S3 endpoint

Single-threaded write to Keep backed by S3: ~42 MiB/sec
Single-threaded read from Keep backed by S3: ~62 MiB/sec

Single-threaded write to S3 with a 3rd party client (s3-cli): ~46 MiB/sec
Single-threaded read from S3 with a 3rd party client (s3-cli): ~106 MiB/sec

It's worth noting that S3 and Keep are optimized for aggregate throughput. With X reader/writer processes, you would expect to see roughly X times the single thread performance, up to the capacity (CPU/bandwidth/memory) of the keepstores (and the clients, but these tend to be spread out over many machines).

That said, we have identified a few areas for future improvement:

a) Keep writes to S3 do not currently use multipart uploads, because the S3 library we use (goamz) does not support them; multipart uploads are recommended to increase write throughput. Our Keep S3 backend predates the official AWS S3 Go library, and we are looking into adopting it (#10477).

b) Keep's single-threaded read performance: part of the slowdown is the md5sum that Keepstore computes on every block it reads. We are considering adding an option to disable the md5sum on read in Keepstore (#16518), and we are investigating additional performance improvements as well (e.g. #16519).

#11

Updated by Ward Vandewege almost 4 years ago

  • Related to Idea #10477: [keepstore] switch s3 driver from goamz to a more actively maintained client library added
#12

Updated by Ward Vandewege almost 4 years ago

  • Related to Feature #16518: [keep] Allow clients to set a header to disable md5sum calculations in keepstore added
#13

Updated by Ward Vandewege almost 4 years ago

  • Related to Feature #16519: [keepstore] optimize md5sum calculations added
#14

Updated by Ward Vandewege almost 4 years ago

  • Status changed from In Progress to Resolved
#16

Updated by Peter Amstutz over 3 years ago

  • Release set to 25