Message queue » History » Version 10

Tom Clegg, 11/01/2017 09:20 PM

h1. Message queue

Arvados needs a (better) message-passing facility for system components to use.

h2. Motivation / background

Sometimes Arvados clients and system components need to pass messages to one another. Often, REST makes sense. But in some cases (e.g., when it is not practical for the sender to know the message destinations, or not efficient for the sender to initiate an HTTP request to each destination), something more like pub/sub is needed.

Currently (2017) we use PostgreSQL NOTIFY/LISTEN as a pub/sub device. This is very inefficient, primarily because PostgreSQL notification payloads must be very short. We pass longer messages by inserting the message into a "logs" table on disk and sending the resulting ID to subscribers, who then select the real message. Eventually (depending on server config) we delete old rows.
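
The indirection described above can be sketched as follows. This is a minimal, self-contained simulation and not Arvados code: SQLite stands in for PostgreSQL, a list of callbacks stands in for NOTIFY/LISTEN, and the names "publish", "make_subscriber" and "stats" are hypothetical. It only illustrates why the pattern is costly: each message needs one write plus one extra SELECT per subscriber.

```python
import sqlite3

# Stand-in for the on-disk "logs" table (SQLite here; the real system uses PostgreSQL).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, body TEXT)")

subscribers = []          # stand-in for LISTEN: callbacks that receive only a row ID
stats = {"selects": 0}    # counts the extra round trips made by subscribers


def publish(body):
    """Store the full message, then notify subscribers with just its row ID."""
    cur = db.execute("INSERT INTO logs (body) VALUES (?)", (body,))
    db.commit()
    for notify in subscribers:
        notify(cur.lastrowid)   # stand-in for NOTIFY with a short payload
    return cur.lastrowid


def make_subscriber(inbox):
    """Return a callback that fetches the real message for each notified ID."""
    def on_notify(row_id):
        stats["selects"] += 1
        row = db.execute("SELECT body FROM logs WHERE id = ?", (row_id,)).fetchone()
        inbox.append(row[0])
    return on_notify


inbox_a, inbox_b = [], []
subscribers.append(make_subscriber(inbox_a))
subscribers.append(make_subscriber(inbox_b))

publish("container stderr: " + "x" * 10000)   # too long for a short notification payload
publish("container finished")
```

With two subscribers and two messages, the sketch performs two inserts and four selects, and the rows stay in the table until something deletes them.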

h2. Components/features that need a message queue

* showing live container logs in workbench/terminal
* crunch stderr logs
* system logs
* container added to queue → noticed by dispatch
* container cancelled → noticed by dispatch
* container state/progress has changed, workbench should update page
* container finished, here is the output (maybe REST API is still best for this)
* get current state of object and notify me when it changes relative to that
* arvados-ws (“cache invalidation”) -- workbench, fuse
* (future) "latest state of object" service, similar to arvados-ws but more convenient for the client
* permission graph is updated (?)

h2. Desired capabilities of a message queue

* Support for WebSocket transport to enable use by browsers
* Client support for languages used in Arvados (Go, Python, Ruby, JavaScript)
* Can apply Arvados permission model
* Pub/sub of live change events on records
** Transactional subscribe request that returns most recent record + subscribes to subsequent changes
* Pub/sub of live container logs
** Transactional subscribe request that returns recent log history + subscribes to subsequent logs
** In-order delivery of logs
* Handle temporary disconnects
* Easy to deploy (fits into our stack)
* Scales to at least 1000 streams without trouble
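
As an illustration of the "transactional subscribe", in-order delivery, and reconnect items above, here is a minimal in-process sketch. It is not a proposed implementation: the "Stream" class, its methods, and the sequence-number scheme are assumptions made for this example. The idea shown is that returning the backlog and registering the subscriber under one lock leaves no window in which an event can be missed or duplicated.

```python
import queue
import threading


class Stream:
    """One ordered event stream, e.g. the logs of a single container."""

    def __init__(self, history_limit=100):
        self._lock = threading.Lock()
        self._history = []            # retained (seq, message) pairs, oldest first
        self._history_limit = history_limit
        self._next_seq = 1
        self._subscribers = []        # one queue.Queue per live subscriber

    def publish(self, message):
        # Assigning the sequence number and fanning out under the same lock
        # means every subscriber sees events in the same order.
        with self._lock:
            event = (self._next_seq, message)
            self._next_seq += 1
            self._history.append(event)
            if len(self._history) > self._history_limit:
                self._history.pop(0)
            for q in self._subscribers:
                q.put(event)
            return event[0]

    def subscribe(self, after_seq=0):
        """Atomically return (backlog, live_queue).

        backlog holds retained events with seq > after_seq; live_queue
        receives every event published after this call returns.
        """
        with self._lock:
            backlog = [e for e in self._history if e[0] > after_seq]
            q = queue.Queue()
            self._subscribers.append(q)
            return backlog, q

    def unsubscribe(self, q):
        with self._lock:
            self._subscribers.remove(q)


logs = Stream()
logs.publish("line 1")
logs.publish("line 2")

# New subscriber: gets recent history plus a queue for subsequent events.
backlog, live = logs.subscribe()
logs.publish("line 3")
last_seen = live.get_nowait()[0]

# Simulated temporary disconnect: the client resubscribes from the last
# sequence number it saw, so "line 4" is recovered from the backlog.
logs.unsubscribe(live)
logs.publish("line 4")
missed, live = logs.subscribe(after_seq=last_seen)
```

A real service would also need to apply the permission model before returning anything, and to decide how much history to retain per stream; both are left out here.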

Investigate WAMP:

http://wamp-proto.org/

It uses WebSockets as its principal transport, and has client library implementations for the languages we care about (Go, Python, JavaScript).

https://github.com/gammazero/nexus

https://github.com/crossbario/autobahn-python

https://github.com/crossbario/autobahn-js

Probably what we want to do is run a "WAMP router" that sits between the log producers, the logging microservice, and the log listeners (browser).
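
The topology in the last paragraph can be sketched with a toy in-process router. This shows only the shape of the arrangement, not the WAMP protocol or any of the libraries linked above: the "Router" class and the topic name are hypothetical. The producer publishes once, and the router delivers the event to both the logging microservice and a browser listener.

```python
from collections import defaultdict


class Router:
    """Toy stand-in for a WAMP router (topology only, no network or protocol)."""

    def __init__(self):
        self._handlers = defaultdict(list)   # topic -> subscriber callbacks

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, event):
        # The producer sends one message; the router fans it out.
        handlers = list(self._handlers[topic])
        for handler in handlers:
            handler(event)
        return len(handlers)


router = Router()
topic = "container.example-uuid.logs"   # hypothetical topic name

stored = []    # stands in for the logging microservice's persistent store
browser = []   # stands in for log lines shown to a browser listener

router.subscribe(topic, stored.append)
router.subscribe(topic, browser.append)

# A log producer publishes a single event without knowing who is listening.
delivered = router.publish(topic, "stderr: starting")
```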