Project

General

Profile

Actions

Task #20713

closed

Bug #20343: Regular expressions written as plain (not raw) strings

Review 20343-regexp-prefixes

Added by Peter Amstutz 8 months ago. Updated 7 months ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
-
Start date:
07/06/2023
Due date:
% Done:

0%

Estimated time:
Actions #1

Updated by Tom Clegg 8 months ago

  • Assigned To set to Tom Clegg
Actions #2

Updated by Brett Smith 8 months ago

  • Assigned To changed from Tom Clegg to Alex Coleman
Actions #3

Updated by Brett Smith 8 months ago

  • Start date set to 07/06/2023
  • Status changed from New to In Progress
  • Subject changed from Review to Review 20343-regexp-prefixes
Actions #4

Updated by Alex Coleman 7 months ago

Overall I thought the changes made sense and increased readability. One question I had was about file

sdk/python/arvados/_normalize_stream.py
. You replace the two lines:

path = re.sub('\\\\', lambda m: '\\134', path)
path = re.sub('[:\000-\040]', lambda m: "\\%03o" % ord(m.group(0)), path)

with the single line:

re.sub(r'[\\:\000-\040]', lambda m: "\\%03o" % ord(m.group(0)), path)

so the logic of the first regex is no longer present. I just wanted to double check that removing that logic didn't have any unintended impacts.

I looked at commit 6daa187dccdc8d7513417d5122fd145661647617.

Actions #5

Updated by Alex Coleman 7 months ago

  • Assigned To changed from Alex Coleman to Brett Smith
Actions #6

Updated by Brett Smith 7 months ago

Alex Coleman wrote in #note-4:

the logic of the first regex is no longer present. I just wanted to double check that removing that logic didn't have any unintended impacts.

I think this is probably the hairiest part of the diff, but the logic hasn't been removed, it's just been merged into a single call.

In the pattern, \\ has been added to the character group, so it now matches backslashes as the original first regexp did.

And when you call the replacement function with a backslash match, it returns the same result that was hardcoded in the original:

>>> '\\%03o' % ord('\\')
'\\134'

So the results are the same, we just get them with one regexp instead of two.

Actions #7

Updated by Brett Smith 7 months ago

  • Assigned To changed from Brett Smith to Alex Coleman
Actions #8

Updated by Alex Coleman 7 months ago

Thanks, that makes sense. I just wanted to double-check. In that case, it looks good to me to merge.

Actions #9

Updated by Alex Coleman 7 months ago

  • Assigned To changed from Alex Coleman to Brett Smith
Actions #10

Updated by Brett Smith 7 months ago

  • Remaining (hours) set to 0.0
  • Status changed from In Progress to Resolved
Actions

Also available in: Atom PDF