Feature #4823

closed

[SDKs] Good Collection API for Python SDK

Added by Tim Pierce over 9 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assigned To: -
Category: SDKs
Target version: -
Story points: 1.0

Description

Goals

More enjoyable API for Python programmers to use. Something like:
  • c = Collection(...)
    with c.open('foo.txt', 'w') as f:
      f.write('foo')
    with c.open('foo.txt', 'a') as f:
      f.write('bar')
    c.rename('foo.txt', 'baz/foobar.txt')
    with c.open('baz/foobar.txt', 'r') as f:
      print(f.read())
    c.save()
    
Serialize/unserialize (manifest) code all in one place.
  • Abstract away the "manifest" encoding as much as possible to pave the way for upgrading/replacing it (say, with a richer JSON format).
  • Only one version of tokenizing/regexp parsing, string concatenation, making sure zero-length streams have a zero-length block locator, stuff like that.
In-memory data structure suitable for mutable collections.
  • Accommodate use of "data buffer" blocks for data not yet written to Keep.
  • Simplify file operations by using a distinct piece of memory for each file. (Modifying a stream in order to modify a file, without disrupting other files in the stream, is painful!)
  • See #4837
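
One way to picture these goals is the rough sketch below: each file owns its own list of blocks, and a single function turns the whole structure into manifest text. The names (FileEntry, BufferBlock, to_manifest) are hypothetical, and the manifest layout is simplified (one stream per directory, position:size:name file tokens, no escaping of special characters). It is an illustration under those assumptions, not the SDK's implementation.

```python
import hashlib
from collections import defaultdict

# md5 of the empty string with a "+0" size hint: the zero-length block locator.
EMPTY_BLOCK = hashlib.md5(b"").hexdigest() + "+0"


class BufferBlock:
    """Data not yet written to Keep (hypothetical stand-in)."""

    def __init__(self, data=b""):
        self.data = bytearray(data)

    def locator(self):
        # Assumption: a block is addressed by the md5 of its content plus its size.
        return "%s+%d" % (hashlib.md5(bytes(self.data)).hexdigest(), len(self.data))


class FileEntry:
    """One file with its own list of blocks, independent of other files."""

    def __init__(self):
        self.blocks = []

    def append(self, data):
        self.blocks.append(BufferBlock(data))

    def size(self):
        return sum(len(b.data) for b in self.blocks)


def to_manifest(files):
    """Serialize {path: FileEntry} to simplified manifest text, in one place."""
    streams = defaultdict(list)
    for path in sorted(files):
        dirname, _, name = path.rpartition("/")
        stream = "./" + dirname if dirname else "."
        streams[stream].append((name, files[path]))

    lines = []
    for stream in sorted(streams):
        locators, tokens, offset = [], [], 0
        for name, entry in streams[stream]:
            locators.extend(b.locator() for b in entry.blocks if b.data)
            tokens.append("%d:%d:%s" % (offset, entry.size(), name))
            offset += entry.size()
        if not locators:
            # A stream with no data still needs one (zero-length) block locator.
            locators = [EMPTY_BLOCK]
        lines.append(" ".join([stream] + locators + tokens))
    return "".join(line + "\n" for line in lines)
```

With this shape, renaming a file is a dictionary key move (files['baz/foobar.txt'] = files.pop('foo.txt')) and never touches the blocks of any other file; the stream layout is only decided when to_manifest() runs.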

Collection interface

Collection()
  • Create a new empty collection.
Collection(uuid)
  • Retrieve the given collection from the API server.
Collection(manifest_text)
  • Create a new collection with the given content.
modified()
  • Return True if the collection has been modified since it was last retrieved or saved to the API server, otherwise False.
manifest_text()
  • Return the "manifest" string representation of this collection. This implicitly commits all buffered data to disk.
portable_manifest_text()
  • Return the "portable manifest" string representation of this collection used to compute portable_data_hash -- i.e., the manifest with the non-portable parts (like Keep permission signatures) removed. This can always be done without flushing any data to disk.
portable_data_hash()
  • Return the portable_data_hash that would be accepted/assigned by the API server if the collection were save()d right now.
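
The two portable_* methods above can be sketched as pure text operations, which is why no data has to be flushed first. The sketch assumes that a permission signature appears as a "+A<hex>@<hex>" hint on a block locator and that the hash is the md5 of the portable manifest followed by "+" and its length in bytes; both are assumptions for illustration, and the module-level functions are hypothetical stand-ins for the methods.

```python
import hashlib
import re

# Assumption: a permission signature is a "+A<hex>@<hex>" hint on a locator.
SIGNATURE_HINT = re.compile(r"\+A[0-9a-f]+@[0-9a-f]+")
# A block locator starts with a 32-digit md5 followed by a size hint.
LOCATOR = re.compile(r"[0-9a-f]{32}\+")


def portable_manifest_text(manifest_text):
    """Drop signature hints from block locators, leaving everything else as-is."""
    out = []
    for line in manifest_text.splitlines(True):
        tokens = line.split(" ")
        out.append(" ".join(
            SIGNATURE_HINT.sub("", t) if LOCATOR.match(t) else t
            for t in tokens))
    return "".join(out)


def portable_data_hash(manifest_text):
    """Assumption: md5 of the portable manifest, plus "+" and its byte length."""
    portable = portable_manifest_text(manifest_text).encode("utf-8")
    return "%s+%d" % (hashlib.md5(portable).hexdigest(), len(portable))
```

Under these assumptions a signed and an unsigned copy of the same manifest produce the same hash, which is the property that makes the hash "portable".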
listdir(path)
  • Return a list containing the names of the entries in the subcollection given by path.
walk(path, topdown=True, onerror=None)
  • (As close as possible to os.walk().) Generate the file names in a directory tree. For each subcollection (below and including path, where '.' is the whole collection) yield a 3-tuple (dirpath, dirnames, filenames).
remove(path)
  • Remove the file or subcollection named path.
unlink(path)
  • Alias for remove.
rename(old, new)
  • Rename a file from old to new.
rename(old, new, dest_collection)
  • Move a file old (in this collection) to new (in a different collection).
  • Rejected. Use copy + remove instead.
copy(old, new)
  • Create a new file new with the same content old has right now.
copy(source_path, dest_path, source_collection=None, overwrite=False)
  • Create a new file dest_path, with the same content the file at source_path has right now. If source_collection is not None, source_path refers to a file in source_collection, which might not be self. If dest_path already exists and overwrite is False, raise an exception.
open(filename, mode)
  • Semantics as close as practicable to open(). Return an object with (some subset of) the Python "file" interface.
glob(globpattern)
  • Returns an iterator that yields successive files that match globpattern from the collection.
  • Rejected. Use fnmatch.filter(c.listdir(path), '*.o') instead (it already returns a list).
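
To make the listdir(), walk() and "fnmatch instead of glob" entries concrete, here is a small sketch built on a hypothetical in-memory stand-in. ToyCollection and find_matching are invented for illustration, the onerror argument is left out, and nothing here talks to the API server or Keep.

```python
import fnmatch
import posixpath


class ToyCollection:
    """Hypothetical in-memory stand-in; only listdir() and walk() are sketched."""

    def __init__(self, paths):
        # Nested dicts model subcollections; None marks a file.
        self.root = {}
        for path in paths:
            node = self.root
            parts = path.split("/")
            for part in parts[:-1]:
                node = node.setdefault(part, {})
            node[parts[-1]] = None

    def _find(self, path):
        node = self.root
        for part in posixpath.normpath(path).split("/"):
            if part != ".":
                node = node[part]
        return node

    def listdir(self, path="."):
        return sorted(self._find(path))

    def walk(self, path=".", topdown=True):
        # Same shape as os.walk(): yield (dirpath, dirnames, filenames).
        node = self._find(path)
        dirnames = sorted(n for n, child in node.items() if child is not None)
        filenames = sorted(n for n, child in node.items() if child is None)
        if topdown:
            yield path, dirnames, filenames
        for name in dirnames:
            for item in self.walk(posixpath.join(path, name), topdown):
                yield item
        if not topdown:
            yield path, dirnames, filenames


def find_matching(collection, pattern, path="."):
    """Recursive replacement for the rejected glob(): walk() plus fnmatch."""
    return [posixpath.join(dirpath, name)
            for dirpath, _, filenames in collection.walk(path)
            for name in fnmatch.filter(filenames, pattern)]
```

For a single subcollection, fnmatch.filter(c.listdir('src'), '*.o') is enough; find_matching() only adds recursion on top of walk().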

Subtasks 3 (0 open, 3 closed)

Task #4893: Review 4823-python-sdk-writable-collection-api (Resolved, Peter Amstutz, 12/17/2014)
Task #5103: Implement live sync, merge (Resolved, Peter Amstutz, 12/17/2014)
Task #4837: [SDKs] Define API and in-memory data structure for collections in Python SDK (Resolved, Peter Amstutz, 12/17/2014)

Related issues

Related to Arvados - Idea #3198: [FUSE] Writable streaming arv-mount (Resolved, Peter Amstutz, 04/14/2015)
Related to Arvados - Idea #4930: [FUSE] Design: specify behavior for writable arv-mount (Resolved, Peter Amstutz, 01/28/2015)