Story #14939

Update crunchstat-summary to Python 3

Added by Tom Morris 8 months ago. Updated 6 months ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
-
Target version:
Start date:
03/08/2019
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
1.0
Release relationship:
Auto

Subtasks

Task #14940: Review 14939-crunchstat-summary-python3 -> 14939-crunchstat-summary-python3ResolvedTom Morris

Task #15127: Review 14939-crunchstat-summary-python3-fixupClosedTom Morris


Related issues

Blocks Arvados - Story #14532: [Epic] Port to Python 3 to prepare for Python 2 sunsetting in December 2019In Progress

Associated revisions

Revision 5d77a55a (diff)
Added by Tom Morris 8 months ago

Update crunchstat-summary to Python 3. Refs #14939

Arvados-DCO-1.1-Signed-off-by: Tom Morris <>

Revision cd33e7da (diff)
Added by Tom Morris 8 months ago

Make iterator Python 2 compatible. refs #14939

Arvados-DCO-1.1-Signed-off-by: Tom Morris <>

Revision 22ed89be (diff)
Added by Tom Morris 8 months ago

Make log file reading Python 3 compatible. refs #14939

Arvados-DCO-1.1-Signed-off-by: Tom Morris <>

Revision 031a3005 (diff)
Added by Tom Morris 8 months ago

Fix test file handling to support Python 3. refs #14939

Arvados-DCO-1.1-Signed-off-by: Tom Morris <>

Revision 03fd14e6 (diff)
Added by Tom Morris 8 months ago

Add crunchstat-summary to Python 3 tests. refs #14939

Arvados-DCO-1.1-Signed-off-by: Tom Morris <>

Revision 6bac55d2 (diff)
Added by Tom Morris 8 months ago

Update crunchstat-summary to Python 3. Refs #14939

Arvados-DCO-1.1-Signed-off-by: Tom Morris <>

Revision c1589536 (diff)
Added by Tom Morris 8 months ago

Add crunchstat-summary to Python 3 tests. refs #14939

Arvados-DCO-1.1-Signed-off-by: Tom Morris <>

Revision 8a0bc9d2 (diff)
Added by Tom Morris 8 months ago

Update crunchstat-summary to Python 3. Refs #14939

Arvados-DCO-1.1-Signed-off-by: Tom Morris <>

Revision fdaf820a (diff)
Added by Tom Morris 8 months ago

Add crunchstat-summary to Python 3 tests. refs #14939

Arvados-DCO-1.1-Signed-off-by: Tom Morris <>

Revision 7c3413ae (diff)
Added by Tom Morris 8 months ago

Update crunchstat-summary to Python 3. Refs #14939

Arvados-DCO-1.1-Signed-off-by: Tom Morris <>

Revision 77c137e8 (diff)
Added by Tom Morris 8 months ago

Add crunchstat-summary to Python 3 tests. refs #14939

Arvados-DCO-1.1-Signed-off-by: Tom Morris <>

Revision 44aa8df7 (diff)
Added by Tom Morris 8 months ago

Update crunchstat-summary to Python 3. Refs #14939

Arvados-DCO-1.1-Signed-off-by: Tom Morris <>

Revision 0b2a6fed (diff)
Added by Tom Morris 8 months ago

Update crunchstat-summary to Python 3. Refs #14939

Arvados-DCO-1.1-Signed-off-by: Tom Morris <>

Revision d0dc4cc1 (diff)
Added by Tom Morris 8 months ago

Add crunchstat-summary to Python 3 tests. refs #14939

Arvados-DCO-1.1-Signed-off-by: Tom Morris <>

Revision 00592ebe
Added by Tom Morris 8 months ago

Merge branch '14939-crunchstat-summary-python3'

Fixes #14939

Arvados-DCO-1.1-Signed-off-by: Tom Morris <>

Revision 5950f909 (diff)
Added by Tom Clegg 8 months ago

Accept :py3 modifier for any Python suite in interactive mode.

(even ones that aren't normally run with python3 by jenkins)

refs #14939

Arvados-DCO-1.1-Signed-off-by: Tom Clegg <>

Revision dbf8c571 (diff)
Added by Tom Morris 8 months ago

Add crunchstat-summary Python 3 to all tests.

Refs #14939

Arvados-DCO-1.1-Signed-off-by: Tom Morris <>

Revision ae232084 (diff)
Added by Tom Morris 7 months ago

Add crunchstat-summary Python 3 to all tests.

Refs #14939

Arvados-DCO-1.1-Signed-off-by: Tom Morris <>

Revision 6f3c2bed (diff)
Added by Tom Morris 7 months ago

Update crunchstat-summary to Python 3. Refs #14939

Arvados-DCO-1.1-Signed-off-by: Tom Morris <>

Revision 8c345b8d (diff)
Added by Tom Morris 7 months ago

Add crunchstat-summary Python 3 to all tests.

Refs #14939

Arvados-DCO-1.1-Signed-off-by: Tom Morris <>

Revision cf8a4757 (diff)
Added by Tom Morris 7 months ago

Add crunchstat-summary Python 3 to all tests.

Refs #14939

Arvados-DCO-1.1-Signed-off-by: Tom Morris <>

Revision 5254ba76 (diff)
Added by Tom Morris 7 months ago

Fix iteration order dependency.

Causes tests to fail in Python 3.5, but not Python 3.7 due to dependency on iteration order of dicts.

Refs #14939

Arvados-DCO-1.1-Signed-off-by: Tom Morris <>

Revision 2266cd0b (diff)
Added by Tom Morris 7 months ago

Add crunchstat-summary Python 3 to all tests.

Refs #14939

Arvados-DCO-1.1-Signed-off-by: Tom Morris <>

Revision 6c1b0fb3 (diff)
Added by Tom Morris 7 months ago

Fix iteration order dependency.

Causes tests to fail in Python 3.5, but not Python 3.7 due to dependency on iteration order of dicts.

Refs #14939

Arvados-DCO-1.1-Signed-off-by: Tom Morris <>

Revision ed35acd2
Added by Tom Morris 7 months ago

Merge branch '14939-crunchstat-summary-python3-fixup'

Fixes #14939

Arvados-DCO-1.1-Signed-off-by: Tom Morris <>

History

#1 Updated by Tom Morris 8 months ago

  • Status changed from New to In Progress
  • Start date set to 03/08/2019

#2 Updated by Tom Morris 8 months ago

  • Parent task deleted (#14532)

#3 Updated by Tom Morris 8 months ago

  • Tracker changed from Task to Story

#4 Updated by Tom Morris 8 months ago

  • Blocks Story #14532: [Epic] Port to Python 3 to prepare for Python 2 sunsetting in December 2019 added

#5 Updated by Tom Clegg 8 months ago

LiveLogReader threads are crashing in the test suite, which seems to indicate two broken things:
  • Crashing threads don't cause the tests to fail, because it's Python and we don't explicitly catch exceptions in the thread and check/propagate them after calling join()... I guess we need to do that.
  • LiveLogReader.next() is still needed by Python 2

Should update run-tests.sh to test crunchstat-summary with Python 3 as well (add "tools/crunchstat-summary:py3" to pythonstuff like "sdk/cwl:py3"). It looks like there are some bytes-vs.-str things to sort out.

#6 Updated by Tom Morris 8 months ago

  • Target version changed from Arvados Future Sprints to 2019-03-13 Sprint

#7 Updated by Tom Morris 8 months ago

  • Target version changed from 2019-03-13 Sprint to 2019-03-27 Sprint

#8 Updated by Tom Morris 8 months ago

Tom Clegg wrote:

LiveLogReader threads are crashing in the test suite, which seems to indicate two broken things:
  • Crashing threads don't cause the tests to fail, because it's Python and we don't explicitly catch exceptions in the thread and check/propagate them after calling join()... I guess we need to do that.
  • LiveLogReader.next() is still needed by Python 2

Fixed the iterator and other thread crashers, but not the general thread exception catching.

Should update run-tests.sh to test crunchstat-summary with Python 3 as well (add "tools/crunchstat-summary:py3" to pythonstuff like "sdk/cwl:py3"). It looks like there are some bytes-vs.-str things to sort out.

Done.

Also fixed Python 3 log file reading, test failure caused by integer division diffs, and a couple of other things.

All tests pass locally in both Python 2 & 3 and CI job running here: https://ci.curoverse.com/view/Developer/job/developer-run-tests/1122/

#9 Updated by Tom Clegg 8 months ago

Could tests/test_examples.py use crunchstat_summary.command.GzipText instead of having its own copy of it?

Nit: usually kwargs are foo=bar but there are a couple of foo = bar here. meh

The rest LGTM, thanks.

#10 Updated by Tom Morris 8 months ago

I force pushed a new branch 14939-crunchstat-summary-python3 at 6cf111041f450c5d33665615ac915869c4279f9f which addresses your two comments as well as fixes:
  • resource leak warned by Python 3 (only)
  • deprecation warnings in Python 3

I also eliminated some unnecessary u'' additions and squashed some of the minor commits together and gave the UTF8 decoding wrapper class a better name.

If you could take a quick look to make sure it looks better, not worse, that'd be great.

#11 Updated by Tom Morris 8 months ago

I've put up another branch based on this one with a number of small fixes for #12026, #10570, & #14451 which all have Eric as the reviewer. Please coordinate with him to see if it makes more sense for one person to review all the changes as a batch.

#12 Updated by Eric Biagiotti 8 months ago

Calling .items() to iterate on a dict is inefficient in Python 2 according to the following: http://python-future.org/compatible_idioms.html#iterating-through-dict-keys-values-items. It recommends using viewitems(), though this will require a future.utils import. When working on the ACR, I ran into packaging problems after adding the future import. If you add it, its probably worth running the package building and tests locally.

Also a nit about an extra line of whitespace here: https://dev.arvados.org/projects/arvados/repository/revisions/14939-crunchstat-summary-python3/entry/tools/crunchstat-summary/tests/test_examples.py#L63

Looks like this change from cpu to cp was unintentional: https://dev.arvados.org/projects/arvados/repository/revisions/14939-crunchstat-summary-python3/entry/tools/crunchstat-summary/crunchstat_summary/webchart.py#L48

Also, do we use crunchstat-summary output to run cwl integration/conformance tests at all? If so, its worth running those as well.

#13 Updated by Tom Morris 8 months ago

Eric Biagiotti wrote:

Calling .items() to iterate on a dict is inefficient in Python 2 according to the following: http://python-future.org/compatible_idioms.html#iterating-through-dict-keys-values-items. It recommends using viewitems(), though this will require a future.utils import. When working on the ACR, I ran into packaging problems after adding the future import. If you add it, its probably worth running the package building and tests locally.

I think I'm willing to live with a few months of Python 2 only (unquantified) inefficiency in a non-performance critical section of code.

Also a nit about an extra line of whitespace here: https://dev.arvados.org/projects/arvados/repository/revisions/14939-crunchstat-summary-python3/entry/tools/crunchstat-summary/tests/test_examples.py#L63

Fixed.

Looks like this change from cpu to cp was unintentional: https://dev.arvados.org/projects/arvados/repository/revisions/14939-crunchstat-summary-python3/entry/tools/crunchstat-summary/crunchstat_summary/webchart.py#L48

Oops! Good catch. Fixed.

Also, do we use crunchstat-summary output to run cwl integration/conformance tests at all? If so, its worth running those as well.

No. I'm pretty sure there are no interactions with other components.

#14 Updated by Tom Morris 8 months ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100

#15 Updated by Tom Morris 8 months ago

  • Status changed from Resolved to In Progress
  • Target version changed from 2019-03-27 Sprint to 2019-04-10 Sprint

#16 Updated by Tom Morris 7 months ago

  • Target version changed from 2019-04-10 Sprint to 2019-04-24 Sprint

#17 Updated by Tom Morris 7 months ago

  • Story points set to 1.0

Tom noticed that I failed to include this in the Python 3 tests (each component needs to be added individually). Adding it caused it to fail in CI because there was a latent bug in the code that was dependent on dictionary iteration order. It worked fine on Python 2.7 & 3.7, but failed on Python 3.5.

CI build at https://ci.curoverse.com/view/Developer/job/developer-run-tests/1201/
(successful except for failing fuse tests)

#18 Updated by Eric Biagiotti 7 months ago

I'm trying to test locally (run crunchstat-summary with the test data) in a 3.5.7 virtualenv, and the arvados-python-client fails installation because of the ciso8601 dependency. Is there a specific way I should be testing this?

#19 Updated by Tom Morris 7 months ago

I don't think I did anything special to set up my conda environment, but it doesn't have any specific arvados-api-client dependency, so you can use whatever relatively recent version that works for you.

Here's what I'm testing with:

python                    3.5.5                h5001a0f_2    conda-forge
arvados-python-client     1.3.1.20190308210512           <pip>
ciso8601                  2.1.1                     <pip>

I can't tell what versions are in use on Jenkins, but it runs fine there too, so perhaps replicating that environment is the best way to go.

#20 Updated by Eric Biagiotti 7 months ago

If our code is relying on a dict being ordered, we should probably just use an OrderedDict.

From cf8a475732f7a62108704b6d7803400681a69eba, changing line 187 in summarizer.py to the following got tests passing in 2.7, 3.5, and 3.7

stats = collections.OrderedDict()

Since python 3.7, dicts are officially ordered, but it is still recommended to use OrderedDict if we are supporting multiple python versions.

#21 Updated by Eric Biagiotti 7 months ago

As discussed in chat, there is no real need to require the dict to be ordered in this case, and the changes in 5254ba765ef759bb59584d7304ec22b77238772c remove the order requirement. Subsequently, this LGTM. Thanks!

#22 Updated by Tom Morris 7 months ago

  • Status changed from In Progress to Resolved
  • % Done changed from 50 to 100

#23 Updated by Tom Morris 6 months ago

  • Release set to 15

Also available in: Atom PDF