Message queue » History » Version 11

Tom Clegg, 11/02/2017 01:54 PM

h1. Message queue

Arvados needs a (better) message-passing facility for internal use (e.g., container dispatch) and as a way to offer pubsub-like APIs (e.g., live container logs).

h2. Motivation / background

Sometimes Arvados clients and system components need to pass messages to one another. Often, REST makes sense. But in some cases (e.g., it's not practical for the sender to know the message destination(s), or it's not efficient for the sender to initiate an HTTP request to each destination) something more like pubsub is needed.

Currently (2017) we use PostgreSQL notify/listen as a pubsub device.
* This is very inefficient, primarily because PostgreSQL notification payloads are limited to a short length. We pass longer messages by inserting the message into a "logs" table on disk and sending the resulting row ID to subscribers, who then select the real message. Eventually (depending on server config) we delete old rows.
* Arvados userspace processes cannot connect directly to PostgreSQL; this approach requires an intermediary (arvados-ws) to implement the Arvados permission model.
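
The "store the message, notify with the row ID" pattern described above can be sketched roughly as follows. This is an in-memory illustration only, not the actual Arvados code: @LogsTable@, @NotifyChannel@ and @publish@ are hypothetical names, and the 8000-byte limit reflects PostgreSQL's default NOTIFY payload limit.

```python
import itertools

# In the default PostgreSQL configuration a NOTIFY payload must be
# shorter than 8000 bytes.
MAX_PAYLOAD = 8000


class LogsTable:
    """Hypothetical in-memory stand-in for the on-disk "logs" table."""

    def __init__(self):
        self._rows = {}
        self._ids = itertools.count(1)

    def insert(self, message):
        row_id = next(self._ids)
        self._rows[row_id] = message
        return row_id

    def select(self, row_id):
        return self._rows[row_id]


class NotifyChannel:
    """Hypothetical stand-in for a LISTEN/NOTIFY channel (short payloads only)."""

    def __init__(self):
        self._listeners = []

    def listen(self, callback):
        self._listeners.append(callback)

    def notify(self, payload):
        if len(payload.encode("utf-8")) >= MAX_PAYLOAD:
            raise ValueError("payload too long for NOTIFY")
        for callback in self._listeners:
            callback(payload)


def publish(table, channel, message):
    """Store the full message, then notify subscribers with only its row ID."""
    row_id = table.insert(message)
    channel.notify(str(row_id))
    return row_id


table = LogsTable()
channel = NotifyChannel()
received = []

# The subscriber gets only the ID and has to go back to the table
# for the real message: one extra query per subscriber per message.
channel.listen(lambda payload: received.append(table.select(int(payload))))

big_message = "x" * 20000  # too large to send as a notification payload
publish(table, channel, big_message)
```

Every subscriber performs its own select for every message, which is where much of the inefficiency comes from.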

h2. Components/features that need a message queue

* showing live container logs in workbench/terminal
* crunch stderr logs
* system logs
* container added to queue → noticed by dispatch
* container cancelled → noticed by dispatch
* container state/progress has changed, workbench should update page
* container finished, here is the output (maybe REST API is still best for this)
* get current state of object and notify me when it changes relative to that
* arvados-ws (“cache invalidation”) -- workbench, fuse
* (future) "latest state of object" service, similar to arvados-ws but more convenient for the client
* permission graph is updated (?)

h2. Desired capabilities of a message queue

* Support for WebSocket transport to enable use by browser
* Good client libraries available for Go, Python, JavaScript, Ruby
* Can apply Arvados permission model
** Cancel/silence subscription if relevant permission is revoked while subscription is open
* Pub/sub of live change events on records
** Transactional subscribe request that returns most recent record + subscribes to subsequent changes (alternatively, this might be a separate service built on the message queue + REST API)
* Pub/sub of live Container logs
** Transactional subscribe request that returns recent log history + subscribes to subsequent logs (optional?) (alternatively, this might be a separate service built on the message queue + Keep, if crunch-run checkpoints logs periodically)
** In-order delivery of logs
* Recover from transient network problems
** Transparently catch up on missed events
* Easy to deploy (fits into our stack)
* Scales to thousands of topics/subscriptions per router process
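
To make the "transactional subscribe" and "catch up on missed events" items more concrete, here is a minimal single-process sketch. It assumes each topic numbers its events and keeps a bounded history; the @Topic@ class and its method names are hypothetical, not an existing Arvados or message queue API.

```python
import threading
from collections import deque


class Topic:
    """Hypothetical topic that numbers events and keeps a bounded history."""

    def __init__(self, history_size=100):
        self._lock = threading.Lock()
        self._history = deque(maxlen=history_size)  # (seq, event) pairs
        self._next_seq = 1
        self._subscribers = []

    def publish(self, event):
        with self._lock:
            seq = self._next_seq
            self._next_seq += 1
            self._history.append((seq, event))
            # Delivering under the lock keeps events in order, but means
            # callbacks must not call back into this topic.
            for callback in list(self._subscribers):
                callback(seq, event)
            return seq

    def subscribe(self, callback, after_seq=0):
        """Replay retained events newer than after_seq, then go live.

        Both steps happen under one lock, so no event can fall between
        the replayed history and the live subscription.
        """
        with self._lock:
            for seq, event in self._history:
                if seq > after_seq:
                    callback(seq, event)
            self._subscribers.append(callback)

    def unsubscribe(self, callback):
        with self._lock:
            self._subscribers.remove(callback)


topic = Topic()
topic.publish("line 1")
topic.publish("line 2")

seen = []


def on_log(seq, event):
    seen.append((seq, event))


topic.subscribe(on_log)        # receives the existing history first
topic.publish("line 3")        # delivered live
topic.unsubscribe(on_log)      # simulate a dropped connection
topic.publish("line 4")        # missed while disconnected
topic.subscribe(on_log, after_seq=seen[-1][0])  # reconnect and catch up
```

The client only has to remember the last sequence number it saw. Events older than the retained history cannot be replayed, so a real service would need a fallback for those (e.g., the REST API or Keep, as suggested above).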

h2. Implementation

Investigate WAMP:

http://wamp-proto.org/

It uses WebSockets as its principal transport, and has client library implementations for the languages we care about (Go, Python, JavaScript):

* Go: https://github.com/gammazero/nexus
* Python: https://github.com/crossbario/autobahn-python
* JavaScript: https://github.com/crossbario/autobahn-js

We probably want to run a "WAMP router" that sits between the log producers, the logging microservice, and the log listeners (e.g., browsers).
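
For a rough idea of what travels through such a router, the sketch below builds the JSON form of the three WAMP messages involved in a simple publish/subscribe exchange. The message type codes and field order are as I understand them from the WAMP basic profile and should be checked against the spec; the topic URI and helper function names are made up for illustration.

```python
import json

# Message type codes from the WAMP basic profile (verify against the spec).
PUBLISH = 16
SUBSCRIBE = 32
EVENT = 36

# Hypothetical topic URI, for illustration only.
TOPIC = "org.arvados.example.container_log"


def subscribe_msg(request_id, topic):
    # [SUBSCRIBE, Request|id, Options|dict, Topic|uri]
    return json.dumps([SUBSCRIBE, request_id, {}, topic])


def publish_msg(request_id, topic, *args):
    # [PUBLISH, Request|id, Options|dict, Topic|uri, Arguments|list]
    return json.dumps([PUBLISH, request_id, {}, topic, list(args)])


def event_msg(subscription_id, publication_id, *args):
    # [EVENT, Subscription|id, Publication|id, Details|dict, Arguments|list]
    return json.dumps([EVENT, subscription_id, publication_id, {}, list(args)])


# A log listener (e.g., a browser) subscribes via the router.
listener_to_router = subscribe_msg(1, TOPIC)

# A log producer publishes one log line to the router.
producer_to_router = publish_msg(2, TOPIC, "stderr: starting container")

# The router forwards it to each subscriber as an EVENT. The subscription
# and publication IDs are assigned by the router; these are placeholders.
router_to_listener = event_msg(1001, 2002, "stderr: starting container")
```

In practice the client libraries listed above build and parse these messages; the point is only that producers and listeners talk to the router, never to each other.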