Idea #6473
closed[SDKs] arv-ws supports an option to show events starting from a given time
Description
arv-ws can live tail the logs for a given pipeline or job, but it won't go back in time. If you want to see logs already written for a running job, you have to use a command like:
arv log list -f '[["object_uuid", "=", "UUID"], ["event_type", "=", "stderr"]]' | jq -r '.items | map(.properties.text) | add'
This isn't very nice. The Python SDK Websockets client has support to start fetching events from a particular point in time. Add an option to arv-ws to start getting events from a specified datetime.
Updated by Brett Smith over 9 years ago
- Target version changed from Arvados Future Sprints to 2015-07-22 sprint
Updated by Radhika Chippada over 9 years ago
- Assigned To set to Radhika Chippada
Updated by Radhika Chippada over 9 years ago
- Status changed from New to In Progress
Updated by Radhika Chippada over 9 years ago
What changed:
- Added additional start_time argument to arv-ws. Expected format is listed as "YYYY-MM-DD hh:mm:ss". Should it be listed as "%Y-%m-%d %H:%M:%S" instead?
- Added test-arv-ws.py with the desire to test the arguments, at a minimum.
- One test with an unknown argument "-x" fails as expected.
- However, a test with valid arguments to the run_ws method results in subscribe and then starts to waits and blocks ('-s=2015-07-15'). Hence, I omitted such a test for the time being. Any suggestions as to how to create such a test?
Updated by Tom Clegg over 9 years ago
I think the code, the flag description, and the feature we wanted are three different things.
Flag description: "Arvados query filter to apply to log events created after this time."- This sounds to me like "give me events matching filters A,B,C that happen before/at time T; plus events matching filters A,B,C,D,E that happen after time T."
- Convenience flag, shorthand for
--filters '[["created_at",">","...."]]'
- Give me events matching A,B,C that happen after time T -- regardless of whether they were logged before or after my websocket connection is established. In other words, give me (right away) all past log events (between T and now) matching filters A,B,C; and then feed me future events matching filters A,B,C as they happen.
Apparently (at least) one of us is misreading the story. Can we clarify?
Updated by Brett Smith over 9 years ago
Tom Clegg wrote:
My understanding of the feature we want:
- Give me events matching A,B,C that happen after time T -- regardless of whether they were logged before or after my websocket connection is established. In other words, give me (right away) all past log events (between T and now) matching filters A,B,C; and then feed me future events matching filters A,B,C as they happen.
This description matches the desired functionality, yes.
My comment about the SDK currently supporting this may be… uhhhh… wrong. I think I thought that's what PollClient's poll_time
argument was for, but that's not right, and besides EventClient doesn't have a corresponding argument. One way to bridge the gap may be to write a small function that finds the first log with/after a given timestamp. EventClient could use that to seed last_log_id
, and PollClient could poll from there.
But that's just another idea off the top of my head. The point is, the functional requirements are as you described.
Updated by Radhika Chippada over 9 years ago
Branch 6473-fetch-events-starting-at at 1c6515a4445e9fd79961c9c8caf21528f16d9399
- The subscribe method now offers a "last_log_id" parameter
- Tests are updated to create an object before invoking the subscribe method and verify that that old object is also fetched when listening for objects in the past.
Updated by Peter Amstutz over 9 years ago
- Changes to EventClient look good.
- You need to implement the same last_log_id feature in PollClient. Please extend the tests so they test both EventClient and PollClient.
- The test "start time today" seems like it would fail if it runs right when the clock ticks over midnight?
Updated by Radhika Chippada over 9 years ago
17e8c59deca04137f66ece7837f96599ff82f4de
- Enhanced PollClient to support last_log_id and added additional tests.
- Updated the said test.
Thanks.
Updated by Peter Amstutz over 9 years ago
I got an error:
$ arv-ws 2015-07-21 11:31:00 arvados.arv-ws[16745] ERROR: 'Namespace' object has no attribute 'last_log_id'
It is because of this code:
if args.start_time: args.last_log_id = 1 filters += [ ['created_at', '>=', args.start_time] ]
The problem is that args.last_log_id
isn't declared on the "args" object. Either it shouldn't be attached to args
(just use last_log_id
as a local variable) or you should add another option to the parser:
parser.add_argument('--last-log-id', type=int, help="Log event counter to start at.")
Also
parser.add_argument('-s', '--start_time', ...)
The '--start_time'
should be '--start-time'
to follow convention. To access the value in Python it gets transformed by argparse into args.start_time
.
Updated by Radhika Chippada over 9 years ago
- last_log_id : corrected the issue and now using last_log_id
- fixed the input argument handling
Thanks.
Updated by Peter Amstutz over 9 years ago
Different error this time:
$ arv-ws 2015-07-21 13:35:14 arvados.arv-ws[2151] ERROR: local variable 'last_log_id' referenced before assignment
Updated by Radhika Chippada over 9 years ago
Oops. Updated (added an else block to set last_log_id to None as needed). Also tested manually with and without -s. Thanks.
Updated by Radhika Chippada over 9 years ago
Expanded start_time input argument description to say the time is server time in UTC, per Peter's suggestion.
Updated by Radhika Chippada over 9 years ago
- Status changed from In Progress to Resolved
Applied in changeset arvados|commit:dd18605e7fe7c537dc84a3e3174d18ec0320c95a.