Arvados - Idea #7393: [Keep] Prototype S3 blob storagehttps://dev.arvados.org/issues/7393?journal_id=322342015-11-11T08:24:44ZBrett Smithbrett.smith@curii.com
<ul><li><strong>Target version</strong> set to <i>2015-12-02 sprint</i></li></ul> Arvados - Idea #7393: [Keep] Prototype S3 blob storagehttps://dev.arvados.org/issues/7393?journal_id=323102015-11-12T15:27:56ZBrett Smithbrett.smith@curii.com
<ul><li><strong>Target version</strong> changed from <i>2015-12-02 sprint</i> to <i>Arvados Future Sprints</i></li></ul> Arvados - Idea #7393: [Keep] Prototype S3 blob storagehttps://dev.arvados.org/issues/7393?journal_id=325202015-11-17T19:30:03ZTom Cleggtom@curii.com
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/32520/diff?detail_id=31965">diff</a>)</li></ul> Arvados - Idea #7393: [Keep] Prototype S3 blob storagehttps://dev.arvados.org/issues/7393?journal_id=330692015-12-02T20:22:56ZTom Cleggtom@curii.com
<ul><li><strong>Assigned To</strong> set to <i>Tom Clegg</i></li><li><strong>Target version</strong> changed from <i>Arvados Future Sprints</i> to <i>2015-12-16 sprint</i></li></ul> Arvados - Idea #7393: [Keep] Prototype S3 blob storagehttps://dev.arvados.org/issues/7393?journal_id=333372015-12-09T13:43:11ZBrett Smithbrett.smith@curii.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>In Progress</i></li></ul> Arvados - Idea #7393: [Keep] Prototype S3 blob storagehttps://dev.arvados.org/issues/7393?journal_id=333572015-12-09T21:30:23ZTom Cleggtom@curii.com
<ul></ul><p>7393-s3-volume is at <a class="changeset" title="7393: Add -uuid and -url options, fix memory sharing in -vary-request." href="https://dev.arvados.org/projects/arvados/repository/arvados/revisions/069704ebbd82ff84106e228b158be0fd78fb5c89">069704e</a> with the following known issues that (I think) we can merge with:</p>
<a name="Delete-vs-write-race"></a>
<h3 >Delete-vs.-write race<a href="#Delete-vs-write-race" class="wiki-anchor">¶</a></h3>
<p>The delete-vs.-write race is not handled. It is possible to write (refresh) an existing block between the time "T0" when the delete handler confirms that the block is old and the time "T1" when the block actually gets deleted. When this happens, PUT reports success even though the block gets deleted right away.</p>
<p>(Aside: AWS does not guarantee the block actually becomes ungettable before "delete" returns, so "T1" can be even later than when keepstore finishes its delete method.)</p>
<strong>Current workarounds:</strong>
<ul>
<li>If you want to be safe and don't mind not having garbage collection, you're fine; delete is disabled by default.</li>
<li>If you want to do garbage collection and you aren't worried about the race, turn on <code>-s3-unsafe-delete</code>.</li>
</ul>
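<p>To make the interleaving concrete, here is a minimal sketch that replays it sequentially. It is not keepstore code: <code>memStore</code>, <code>isOld</code>, <code>put</code>, <code>del</code>, and <code>simulateRace</code> are hypothetical names, and an in-memory map stands in for the bucket.</p>

```go
package main

import (
	"fmt"
	"time"
)

// memStore is a hypothetical in-memory stand-in for a bucket:
// it maps block locators to last-modified times.
type memStore map[string]time.Time

// isOld is the T0 check: the block exists and is older than ttl.
func (s memStore) isOld(loc string, now time.Time, ttl time.Duration) bool {
	mtime, ok := s[loc]
	return ok && now.Sub(mtime) > ttl
}

// put writes or refreshes a block and reports success.
func (s memStore) put(loc string, now time.Time) error {
	s[loc] = now
	return nil
}

// del removes a block unconditionally, like a plain delete request.
func (s memStore) del(loc string) {
	delete(s, loc)
}

// simulateRace replays the T0 / PUT / T1 ordering described above and
// reports whether PUT succeeded and whether the block is still stored.
func simulateRace() (putOK, survives bool) {
	ttl := time.Hour
	t0 := time.Date(2015, 12, 9, 12, 0, 0, 0, time.UTC)
	store := memStore{"block": t0.Add(-2 * ttl)}

	// T0: the delete handler confirms that the block is old.
	okToDelete := store.isOld("block", t0, ttl)

	// Between T0 and T1: a client refreshes the block; PUT reports success.
	putOK = store.put("block", t0.Add(time.Second)) == nil

	// T1: the handler acts on its now-stale decision.
	if okToDelete {
		store.del("block")
	}

	_, survives = store["block"]
	return putOK, survives
}

func main() {
	putOK, survives := simulateRace()
	fmt.Printf("PUT succeeded: %v, block still stored: %v\n", putOK, survives)
}
```

<p>In this replay the PUT succeeds and the block is gone afterwards, which is the failure mode the workarounds above avoid or accept.</p>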
<a name="Odd-error-messages"></a>
<h3 >Odd error messages<a href="#Odd-error-messages" class="wiki-anchor">¶</a></h3>
AWS reports "access denied" instead of 404 when trying to read a nonexistent block during Compare and Get.
<ul>
<li><pre>
2015/12/09 20:56:07 s3-bucket:"4xphq-keep": Compare(637821cc1c31b89272a25c1a6885cc8e): Access Denied
</pre></li>
</ul>
<p>This might just be a problem in the way we've set up our test bucket permissions, though. The s3test stub server throws 404 as expected so we pass the "report notfound as ErrNotExist" tests.</p>
<a name="No-docs"></a>
<h3 >No docs<a href="#No-docs" class="wiki-anchor">¶</a></h3>
<p>...other than the <code>keepstore -help</code> message.</p>
<a name="Non-Amazon-endpoints-are-untested"></a>
<h3 >Non-Amazon endpoints are untested<a href="#Non-Amazon-endpoints-are-untested" class="wiki-anchor">¶</a></h3>
<p>The options are there (<code>-s3-endpoint</code>) for using a non-AWS S3-compatible service like Google Storage, but the only services I've tried it on are AWS and the s3test server from <a class="external" href="https://github.com/AdRoll/goamz">https://github.com/AdRoll/goamz</a>.</p> Arvados - Idea #7393: [Keep] Prototype S3 blob storagehttps://dev.arvados.org/issues/7393?journal_id=333582015-12-09T21:33:04ZTom Cleggtom@curii.com
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/33358/diff?detail_id=32771">diff</a>)</li></ul> Arvados - Idea #7393: [Keep] Prototype S3 blob storagehttps://dev.arvados.org/issues/7393?journal_id=333782015-12-10T16:17:40ZPeter Amstutzpeter.amstutz@curii.com
<ul></ul><p>Tom Clegg wrote:</p>
<blockquote>
<p>7393-s3-volume is at <a class="changeset" title="7393: Add -uuid and -url options, fix memory sharing in -vary-request." href="https://dev.arvados.org/projects/arvados/repository/arvados/revisions/069704ebbd82ff84106e228b158be0fd78fb5c89">069704e</a> with the following known issues that (I think) we can merge with:</p>
<a name="Delete-vs-write-race"></a>
<h3 >Delete-vs.-write race<a href="#Delete-vs-write-race" class="wiki-anchor">¶</a></h3>
<p>The delete-vs.-write race is not handled. It is possible to write (refresh) an existing block between the time "T0" when the delete handler confirms that the block is old and the time "T1" when the block actually gets deleted. When this happens, PUT reports success even though the block gets deleted right away.</p>
<p>(Aside: AWS does not guarantee the block actually becomes ungettable before "delete" returns, so "T1" can be even later than when keepstore finishes its delete method.)</p>
<strong>Current workarounds:</strong>
<ul>
<li>If you want to be safe and don't mind not having garbage collection, you're fine; delete is disabled by default.</li>
<li>If you want to do garbage collection and you aren't worried about the race, turn on <code>-s3-unsafe-delete</code>.</li>
</ul>
</blockquote>
<p>I spent a while reading the S3 documentation. The correct way to do this seems to be to enable versioning on the bucket. Then the head-and-delete operation will only delete the specific version of the object. This should solve the race because if there is a PUT or PUT-copy it will show up as a more recent version. As a side effect the "PUT-copy" operation used for Touch() may need to explicitly delete the old version.</p> Arvados - Idea #7393: [Keep] Prototype S3 blob storagehttps://dev.arvados.org/issues/7393?journal_id=333792015-12-10T16:58:49ZPeter Amstutzpeter.amstutz@curii.com
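<p>The version-specific delete suggested for the delete-vs.-write race can be sketched as follows. This is an in-memory model only, assuming a bucket that keeps every version of an object; <code>versionedBucket</code>, <code>put</code>, <code>head</code>, <code>deleteVersion</code>, and the <code>v1</code>/<code>v2</code> IDs are hypothetical and do not correspond to any SDK calls.</p>

```go
package main

import "fmt"

// versionedBucket is a hypothetical model of a bucket with versioning
// enabled: each locator maps to its version IDs, oldest first.
type versionedBucket struct {
	nextID   int
	versions map[string][]string
}

func newVersionedBucket() *versionedBucket {
	b := new(versionedBucket)
	b.versions = map[string][]string{}
	return b
}

// put stores a new version of loc and returns its version ID.
func (b *versionedBucket) put(loc string) string {
	b.nextID++
	id := fmt.Sprintf("v%d", b.nextID)
	b.versions[loc] = append(b.versions[loc], id)
	return id
}

// head returns the most recent version ID of loc, if any.
func (b *versionedBucket) head(loc string) (string, bool) {
	vs := b.versions[loc]
	if len(vs) == 0 {
		return "", false
	}
	return vs[len(vs)-1], true
}

// deleteVersion removes only the named version and leaves the others.
func (b *versionedBucket) deleteVersion(loc, id string) {
	kept := make([]string, 0, len(b.versions[loc]))
	for _, v := range b.versions[loc] {
		if v != id {
			kept = append(kept, v)
		}
	}
	b.versions[loc] = kept
}

// simulateVersionedDelete replays the race with a version-specific delete
// and returns the version that is current afterwards.
func simulateVersionedDelete() (string, bool) {
	b := newVersionedBucket()
	b.put("block") // original write, version v1

	// T0: the delete handler inspects the block and remembers its version.
	old, _ := b.head("block")

	// Between T0 and T1: a refresh creates a newer version, v2.
	b.put("block")

	// T1: only the version that was inspected at T0 is deleted.
	b.deleteVersion("block", old)

	return b.head("block")
}

func main() {
	current, ok := simulateVersionedDelete()
	fmt.Printf("block still readable: %v (current version %s)\n", ok, current)
}
```

<p>In this model the refreshed copy survives because the delete names the version seen at T0. It also shows why old versions accumulate unless something removes them explicitly.</p>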
<ul></ul><p>Object versioning in S3 compatible APIs:</p>
<p>Google:</p>
<p>Has a "generation" parameter that is very similar to Amazon's "versionId", except that it is a 64-bit integer where S3 uses a string.</p>
<p><a class="external" href="https://cloud.google.com/storage/docs/object-versioning?hl=en">https://cloud.google.com/storage/docs/object-versioning?hl=en</a></p>
<p>Ceph:</p>
<p>"x-amz-version-id" is listed under "Unsupported header fields", and there is no mention of versioning in the documentation.</p>
<p><a class="external" href="http://docs.ceph.com/docs/master/radosgw/s3/">http://docs.ceph.com/docs/master/radosgw/s3/</a></p> Arvados - Idea #7393: [Keep] Prototype S3 blob storagehttps://dev.arvados.org/issues/7393?journal_id=333802015-12-10T18:49:45ZPeter Amstutzpeter.amstutz@curii.com
<ul></ul><p>s3_volume_test has some commented out code.</p> Arvados - Idea #7393: [Keep] Prototype S3 blob storagehttps://dev.arvados.org/issues/7393?journal_id=333812015-12-10T18:52:36ZPeter Amstutzpeter.amstutz@curii.com
<ul></ul><pre>
(01:43:43 PM) Walex: gah, I actually came back before I forget, to say something obvious but that may be useful: the "standard" way to avoid this problem in distributed filesystems is to allow data operations to be done by any "keepstore", but to get all metadata operations to be done only by one "keepstore", e.g. the one "with the lowest IP address" as in AFS, or the one that managed first to acquire a certain "well known" lock. You could use that for Ceph but not the other syste
(01:48:33 PM) tetron_: Walex: actually, that's a great idea
(01:48:40 PM) tetron_: Walex: you're probably gone now
(01:48:58 PM) tetron_: Walex: but yea, we could have 1 writable server and N read-only servers
</pre>
<p>This would require some locking between the trash list and the PUT handler in keepstore itself (maybe another story).</p> Arvados - Idea #7393: [Keep] Prototype S3 blob storagehttps://dev.arvados.org/issues/7393?journal_id=333862015-12-10T20:38:16ZPeter Amstutzpeter.amstutz@curii.com
<ul></ul><p>One detail to check:</p>
<p>The S3 documentation for PUT-copy specifies:</p>
<pre>x-amz-copy-source: /source_bucket/sourceObject</pre>
<p>However the code constructs this string:</p>
<pre>v.Bucket.Name+"/"+loc</pre>
<p>Is the first '/' being added somewhere, or is S3 accepting it without the leading slash?</p> Arvados - Idea #7393: [Keep] Prototype S3 blob storagehttps://dev.arvados.org/issues/7393?journal_id=333932015-12-10T21:06:06ZTom Cleggtom@curii.com
<ul></ul><p>Peter Amstutz wrote:</p>
<blockquote>
<p>Is the first '/' being added somewhere, or is S3 accepting it without the leading slash?</p>
</blockquote>
<p>Interesting. The goamz s3 package leaves out the leading '/', s3test doesn't tolerate one, and Amazon seems to add it implicitly if you leave it off (keep-exercise did lots of "touch" operations without any trouble). I'd say this should be fixed in the SDK first, and then (depending on how the SDK fixes it) we should update our code.</p>
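<p>For reference, the documented form could be produced by a helper along these lines. This is only an illustration: <code>copySource</code> is a hypothetical name, not part of goamz or keepstore, and since s3test reportedly rejects the leading '/', it is not a drop-in patch.</p>

```go
package main

import (
	"fmt"
	"strings"
)

// copySource (hypothetical helper) builds an x-amz-copy-source value in the
// documented "/source_bucket/sourceObject" form. It yields exactly one
// leading slash whether or not the bucket name already starts with one.
func copySource(bucket, key string) string {
	return "/" + strings.TrimLeft(bucket+"/"+key, "/")
}

func main() {
	// Bucket name and locator taken from the log line earlier in this thread.
	fmt.Println(copySource("4xphq-keep", "637821cc1c31b89272a25c1a6885cc8e"))
}
```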
<blockquote>
<p>s3_volume_test has some commented out code.</p>
</blockquote>
<p>Whoops, removed. Thanks.</p> Arvados - Idea #7393: [Keep] Prototype S3 blob storagehttps://dev.arvados.org/issues/7393?journal_id=333972015-12-10T21:20:10ZTom Cleggtom@curii.com
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Resolved</i></li><li><strong>% Done</strong> changed from <i>50</i> to <i>100</i></li></ul><p>Applied in changeset arvados|commit:7d5d57a522489209e6b3cecfef94bab0aae4a7f5.</p>