Idea #11308
closedSupport Python3 for arvados-python-client & command line utilities
Description
Python 3 has been available for many years and the Python 2/3 migration is reaching its final stages. As a first step, Arvados Python SDK + command line needs to run on both Python 3 and Python 2.
run-tests.sh needs to run tests using both Python 2.7 and Python 3.4+
Dependencies must support Python 3. May require updating dependencies and related side effects (API/behavior changes).
Use "__future__" to adopt Python 3 behavior on Python 2.7:
from __future__ import division, absolute_import, print_function, unicode_literals
Clean up usage of strings to conform to Python 3 unicode/bytes distinction.
Existing Arvados Python packages including FUSE, Node manager, arvados-cwl-runner (+ its dependencies, e.g. cwltool), and crunchstat-summary must continue to work on Python 2.7 (they will be ported in future stories). Ideally this can be resolved without divergent behavior between Py2 and Py3, however if there is a conflict between Py2 and Py3 behavior it needs to be resolved in favor of maintaining compatibility with existing dependencies on Py2.
Packages should be published to PyPi advertising compatibility with both Py2+Py3 .
Will need to build and publish separate Py2 and Py3 deb/rpm packages (due to hard dependency on Python interpreter).
Updated by Tom Morris over 7 years ago
We appear to have suffered from serious scope creep here. Somehow we went from SDK & command line tools to almost all of our Python code. This ticket is only about the Python SDK and the command line utilities. (When we do all Python code, we'll want to include crunchstat-summary & arvados-docker-cleaner in addition to nodemanager & fuse driver).
pip install caniusepython3 find . -name PKG-INFO -exec caniusepython3 -m {} \; -print
claims that we're all set on the dependencies front with all packages having Python 3 compatible versions.
The Python 2->3 porting guide is here: https://docs.python.org/3/howto/pyporting.html
Updated by Tom Clegg over 7 years ago
- Category set to SDKs
- Assigned To set to Tom Clegg
Updated by Tom Clegg over 7 years ago
virtualenv --python python3 /tmp/v3 . /tmp/v3/bin/activate git clone -b 11308-python3 https://github.com/curoverse/arvados.git cd arvados/sdk/python python setup.py install python <<EOF import arvados a=arvados.api() print("# api request") print repr(a.collections().list(count='none').execute()['items'][0]) print("# read data from keep") c = arvados.collection.Collection('4xphq-4zz18-09pu7btnmkhaxky') with c.open(c.keys()[0]) as f: f.seek(1234) print(f.readline()) # text print(f.read(1234)) # binary print("# write data to keep") c = arvados.collection.Collection() with c.open('foo.txt', 'wb') as f: f.write(b'foo') f.close() c.save_new() # '. acbd18db4cc2f85cedef654fccc4a4d8+3+A53d2e4dee0300ccd35c2b6b23d9a660cd6b41268@58f62cfb 0:3:foo.txt\n' print(c.portable_data_hash()) # 83367e8913dcec0bf3fc25ed5a27eacb+49 print(c.manifest_locator()) # 4xphq-4zz18-nzwz3de4at90zaz EOFTodo:
- Currently all files are opened in binary mode (read() returns bytes and write() expects bytes) even without 'b' in the mode argument to open(). We should implement text mode -- or at least force Python 3 callers to specify 'b' in mode, otherwise people will write Python 3 code that works now but breaks when we implement text mode.
- Build deb/rpm packages
- Improve run-tests.sh output (currently the py2sdk and py3sdk test suites are titled just "sdk/python" so it's hard to tell which version of python you're looking at).
- Update Hacking wiki with new run-tests options ("--skip python2")
Updated by Tom Clegg over 7 years ago
- raises an exception if files are opened in text mode using Python 3.
For the sake of minimizing merge conflicts, I think it makes sense to get this branch ("arvados module can be built, installed with setuptools, and tested in Python 3") reviewed/merged, then tackle deb/rpm packaging in a separate branch.
Updated by Lucas Di Pentima over 7 years ago
After lots of reading, this is my compiled list of things that may need a second look:
sdk/python/arvados/cache.py:42
- Should that beexcept (IOError, OSError):
?sdk/python/tests/test_events.py:38
- Double parenthesissdk/python/arvados/arvfile.py
- L593:
list(some_dict.values())
is said to be inefficient on Py2 (http://python-future.org/compatible_idioms.html), it seems that an efficient way to do it is:from future.utils import listvalues listvalues(some_dict)
- L706: Same as L593, but with
listitems()
- L593:
sdk/python/arvados/collection.py
- Lines 430, 684, 727, 732, 737, 770, 1128: Py2 inefficient use of
list(dict.values())
or similar - L695: Unnecessary conversion:
iter(list(iter))
- Lines 430, 684, 727, 732, 737, 770, 1128: Py2 inefficient use of
sdk/python/arvados/stream.py
- Lines 66, 101: Py2 inefficient use of
list()
- Lines 66, 101: Py2 inefficient use of
sdk/python/arvados/commands/arv_copy.py
- L158: Unnecessary use of
list()
- L158: Unnecessary use of
sdk/python/arvados/commands/put.py
- Lines 511, 703, 780, 984: Unnecessary use of
list()
for iterations
- Lines 511, 703, 780, 984: Unnecessary use of
sdk/python/tests/run_test_server.py:740
- Unnecessary use oflist()
sdk/python/tests/test_cache.py
Lines 45-47 commented code lines.
Updated by Tom Clegg over 7 years ago
Changed most of the list(...) stuff to viewkeys(), listitems(), etc.
sdk/python/tests/run_test_server.py:740
- Unnecessary use oflist()
Not unnecessary in this case -- without it, we get "dictionary changed size during iteration".
sdk/python/tests/test_cache.py
Lines 45-47 commented code lines.
Whoops, that was some half-done debugging and was preventing the test from failing on exceptions! Fixed.
11308-python3 @ f6b16b5ae75469167dbea08d69692d5f53dd9c84
Updated by Lucas Di Pentima over 7 years ago
sdk/python/tests/run_test_server.py:3
-viewkeys
is not used, is the import needed?- Why the use of
viewkeys()
? Supposedly on py2 and py3 we can iterate over dict keys usingfor key in a_dict: ....
Other than that, lgtm.
Updated by Tom Clegg over 7 years ago
- Target version changed from 2017-04-12 sprint to 2017-04-26 sprint
Updated by Tom Clegg over 7 years ago
Updated by Tom Clegg over 7 years ago
11308-python3 @ 3a0d849c08f750dca1d6a40153c0107001769c6d https://ci.curoverse.com/job/developer-run-tests/244/
This fixes a bug revealed by merging the recent truncate() tests: in append modes, the file pointer was not updated after a write.
Updated by Lucas Di Pentima over 7 years ago
Tom: The write in append mode update LGTM.
Updated by Tom Morris over 7 years ago
- Target version changed from 2017-04-26 sprint to 2017-05-10 sprint
- Story points changed from 0.5 to 0.0
Updated by Tom Clegg over 7 years ago
Updated by Peter Amstutz over 7 years ago
API incompatibility:
Collection.keys(), Collection.values() and Collection.items() return lists in Python 2.7. However in Python 3 they return dict_keys, dict_values and dict_items. Unfortunately there is code that uses Collection.keys()[0]
and Collection.value()[0]
as a way to get the item of a single-entry dict, and this is doesn't work in Python 3 because the dict_* types don't support indexing.
The Collection methods should use an explicit cast to list().
Updated by Tom Clegg over 7 years ago
11308-collection-keys @ 47f29d75759db7e4d81203e110257f1a471f82e7
Updated by Peter Amstutz over 7 years ago
11308-collection-keys LGTM, tested with arvados-cwl-runner
Updated by Tom Clegg over 7 years ago
- Status changed from In Progress to Resolved
Packaging moved to #11657
Updated by Tom Morris over 6 years ago
- Related to Idea #13306: arvados-cwl-runner supports Python 3 added
Updated by Tom Morris over 6 years ago
- Related to Idea #11419: [SDKs] support text-mode open() in Python 3 added
Updated by Nico César about 6 years ago
- Related to Bug #14326: Our custom-compiled `python-future` and `python3-future` packages can't be installed together and have precedence added