Bug #19851


Log panel not loading all messages

Added by Peter Amstutz about 2 months ago. Updated about 2 months ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
Workbench2
Target version:
Start date:
12/13/2022
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
1.0
Release relationship:
Auto

Description

I have been seeing this a lot: the log panel doesn't load all the messages. Also, it sometimes stops adding messages while the process is running.

Example:

https://workbench2.tordo.arvadosapi.com/processes/tordo-xvhdp-0dycyfs7cdo47di

From discussion: try to timebox this to 1 point, if the investigation / fix ends up being too complicated, revisit it.


Subtasks (0 open, 1 closed)

Task #19852: Review 19851-log-panel-fix (Resolved, Lucas Di Pentima, 12/13/2022)

Actions #1

Updated by Peter Amstutz about 2 months ago

  • Description updated (diff)
Actions #2

Updated by Peter Amstutz about 2 months ago

  • Story points set to 1.0
  • Description updated (diff)
Actions #3

Updated by Lucas Di Pentima about 2 months ago

  • Assigned To set to Lucas Di Pentima
Actions #4

Updated by Lucas Di Pentima about 2 months ago

  • Status changed from New to In Progress
Actions #5

Updated by Lucas Di Pentima about 2 months ago

Updates at arvados-workbench2|039106c2 - branch 19851-log-panel-fix
Test run: developer-tests-workbench2: #1030

Take into account that this story is time-boxed to 1 point. Major refactoring may be needed for properly handling logging.

The issue was that the container had more than 1000 log records, so not all of them were included in the request's response.

Updates

  • After receiving the response, the code checks for remaining logs and does a second request if there are any.
  • If more than 1000 logs remain, it asks only for the last 1000 and adds a text indication that some logs were skipped.
    • The text indication only appears on the "Main logs" and "All logs" filters; properly adding it to every filter would require code refactoring.
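The two-request flow in the bullets above can be sketched roughly as follows. This is a simplified illustration, not the actual workbench2 code: `fetchPage`, `fetchVisibleLogs`, and the notice text are hypothetical names, and the real implementation works against the Arvados list API.

```typescript
// Simplified sketch of the two-request log fetch described above.
// All names here are illustrative.

const MAX_PAGE_SIZE = 1000;

interface LogPage {
    items: string[];
    itemsAvailable: number;
}

// A paged list request: returns up to `limit` items starting at `offset`,
// plus the total number of matching records.
type FetchPage = (offset: number, limit: number) => LogPage;

function fetchVisibleLogs(fetchPage: FetchPage): { items: string[]; skippedNotice: string | null } {
    const first = fetchPage(0, MAX_PAGE_SIZE);
    const remainingLogs = first.itemsAvailable - MAX_PAGE_SIZE;
    if (remainingLogs <= 0) {
        // Everything fit in a single page.
        return { items: first.items, skippedNotice: null };
    }
    if (remainingLogs > MAX_PAGE_SIZE) {
        // Too many records: fetch only the last MAX_PAGE_SIZE and flag the gap.
        const last = fetchPage(first.itemsAvailable - MAX_PAGE_SIZE, MAX_PAGE_SIZE);
        return {
            items: first.items.concat(last.items),
            skippedNotice: `... ${remainingLogs - MAX_PAGE_SIZE} log entries skipped ...`,
        };
    }
    // A single follow-up request covers the rest.
    const second = fetchPage(MAX_PAGE_SIZE, MAX_PAGE_SIZE);
    return { items: first.items.concat(second.items), skippedNotice: null };
}
```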

Notes/thoughts

  • Just asking for the last 1000 log entries would make the code A LOT clearer. Given that this feature covers the "live logs" part of process viewing, it would make sense, but it could be confusing when the user inspects a finished process (until we implement the log collection view feature).
  • Handling 2000 log entries like we currently do might not be ideal for performance reasons (I've seen some serious slowdown when testing manually).
Actions #6

Updated by Tom Clegg about 2 months ago

I think these should use items.length rather than MAX_PAGE_SIZE because the API doesn't necessarily fill the requested page size (especially since MAX_PAGE_SIZE is hardcoded to 1000 in wb2 but MaxItemsPerResponse is configurable on the server side):

+    const remainingLogs = itemsAvailable - MAX_PAGE_SIZE;

...

+        if (remainingLogs > MAX_PAGE_SIZE) {

Can we set count=none on the 2nd request to save some work on the server side?

It might be even faster to set count=none on the 1st request, and always do a 2nd request using the newest 1st-page timestamp as a filter in the 2nd request, but this is already timeboxed so maybe punt.

Rest LGTM.

Longer term, wb2 should be using the log collection once the container is complete.

Actions #7

Updated by Lucas Di Pentima about 2 months ago

Updates at arvados-workbench2|fea546c
Test run: developer-tests-workbench2: #1031

  • Uses the controller's API.MaxItemsPerResponse config instead of a hardcoded value that could be bigger than what's allowed.
  • Adds count='none' to the 2nd request for better performance.

Comment regarding using count='none' on the first request: without knowing how many log entries exist, we couldn't detect the case where a 2nd request would retrieve all the remaining logs. Is my reasoning correct? If so, would the performance gain be worth always showing the "...some logs were skipped..." line whenever there are more than API.MaxItemsPerResponse log entries, instead of handling that edge case correctly?

Actions #8

Updated by Peter Amstutz about 2 months ago

  • Release set to 47
Actions #9

Updated by Tom Clegg about 2 months ago

It might be more efficient to do 2x requests with count=none than 1x request with the default count=exact. But I don't think we need to get too excited about optimizing this right now.

Given that we're using itemsAvailable, I still think it is better to subtract items.length rather than maxPageSize to decide whether all records have been retrieved in the first request, because the API does not guarantee that min(maxPageSize, itemsAvailable) items will be returned (e.g., we have MaxIndexDatabaseRead too).

Actions #10

Updated by Lucas Di Pentima about 2 months ago

Tom Clegg wrote in #note-9:

Given that we're using itemsAvailable, I still think it is better to subtract items.length rather than maxPageSize to decide whether all records have been retrieved in the first request, because the API does not guarantee that min(maxPageSize, itemsAvailable) items will be returned (e.g., we have MaxIndexDatabaseRead too).

Thanks for clarifying this, I was not aware of it.

Updates at arvados-workbench2|8b6cc8ee
Test run: developer-tests-workbench2: #1032

  • Don't rely on API.MaxItemsPerResponse for calculating remaining log records.
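A hypothetical sketch of the corrected logic: subtract the number of items actually returned (`items.length`) rather than the requested page size, since the server may return fewer items than requested (e.g. due to MaxItemsPerResponse or MaxIndexDatabaseRead), and use count='none' on the follow-up request. The `list()` signature and names are illustrative, not the actual workbench2 code.

```typescript
// Illustrative sketch only; names do not match the workbench2 source.

interface LogPage {
    items: string[];
    itemsAvailable: number; // only meaningful when count is 'exact'
}

type ListFn = (offset: number, limit: number, count: 'exact' | 'none') => LogPage;

function fetchLogs(list: ListFn, maxPageSize: number): { items: string[]; skipped: number } {
    const first = list(0, maxPageSize, 'exact');
    // items.length, not maxPageSize: the server may have returned fewer items
    // than requested.
    const remaining = first.itemsAvailable - first.items.length;
    if (remaining <= 0) {
        return { items: first.items, skipped: 0 };
    }
    if (remaining > maxPageSize) {
        // Skip the middle; fetch only the newest maxPageSize entries.
        const last = list(first.itemsAvailable - maxPageSize, maxPageSize, 'none');
        return { items: first.items.concat(last.items), skipped: remaining - maxPageSize };
    }
    // count='none' on the follow-up request saves the server a count query.
    const second = list(first.items.length, remaining, 'none');
    return { items: first.items.concat(second.items), skipped: 0 };
}
```

Note the first test scenario below: with a server that caps responses at 800 items, subtracting the page size (1000) instead of `items.length` (800) would silently drop 200 records.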
Actions #11

Updated by Tom Clegg about 2 months ago

LGTM, thanks.

I had to look up const { items: itemsLast } = await ... in the typescript language reference ... and I was reassured to see two uses of the word "confusing" in the relevant section. :) https://www.typescriptlang.org/docs/handbook/variable-declarations.html#property-renaming
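For reference, a tiny example of the property-renaming destructuring mentioned above: the left-hand side reads the `items` property from the object but binds it to a new local name (the object literal here is made up for illustration).

```typescript
// Destructuring with renaming: read the `items` property, but bind it to the
// local variable `itemsLast`; no variable named `items` is created.
const response = { items: ['log-1', 'log-2'], itemsAvailable: 2 };
const { items: itemsLast } = response;
```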

Actions #12

Updated by Lucas Di Pentima about 2 months ago

  • Status changed from In Progress to Resolved
