Support #21995
closed
Prototype log aggregation using Grafana Loki
Added by Peter Amstutz 4 months ago.
Updated 2 months ago.
Description
It would be very helpful to have centralized, searchable logging.
After a lengthy discussion on 2024-07-10, we identified Grafana Loki as a candidate.
For this task, we'd like to deploy Loki on a dev cluster (probably tordo) to get some operational insight into setup, configuration, and performance, and to see what it's like to use.
Use Grafana Alloy (OpenTelemetry Collector) to collect logs and send them to the Loki service. Configure Grafana so we can view, browse and search logs from Loki.
- Description updated (diff)
- Target version changed from Development 2024-08-07 sprint to Development 2024-08-28 sprint
- Assigned To set to Lucas Di Pentima
- Status changed from New to In Progress
- Target version changed from Development 2024-08-28 sprint to Development 2024-09-11 sprint
Here are my notes about what I think is important to know about Loki & Alloy.
- Loki
  - Authentication: Loki doesn't provide any authentication on its own; it needs to be handled by a reverse proxy such as nginx (a sketch follows these notes).
  - Data storage
    - Only the labels/metadata attached to log entries are indexed; the log content itself is not.
    - Log content is just compressed into chunks and saved in object stores such as S3, GCS, the local filesystem, and others.
    - Since Loki 2.0, both the index and the log chunks can be stored in a single store; the current recommendation is the TSDB index, as BoltDB is deprecated.
    - The configuration allows for seamless format and storage changes by supporting multiple schema configs, each applying from a given date (a Loki config sketch follows these notes).
  - Data retention
    - By default, Loki stores everything under /tmp/loki, so data doesn't survive reboots.
    - Also, per the documentation, it doesn't delete old data by default, even when pointed at a persistent filesystem.
    - For object stores, the limits_config.retention_period setting should be lower than the object store's lifecycle expiration settings.
- Alloy
  - Could allow us to replace several exporters, in addition to ingesting logs.
  - Configuration is expressed as a series of pipelines that take data from a source, optionally process it, and then send it to its final destination (Prometheus or Loki).
  - Log ingestion
    - Can monitor files and journal logs.
    - Using the loki.source.journal component we can ingest all of the system's journal logs, but additional processing is needed to identify which service unit each log line comes from, or to collect only the ones we're interested in (see the Alloy sketch after these notes).
Also, I'm attaching example config files for both services.
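As an illustration of the authentication point above (not a copy of the attached files), a minimal sketch of fronting Loki with nginx basic auth could look like this; the hostname, certificate paths, and htpasswd file are placeholders:

```
# Hypothetical nginx virtual host adding basic auth in front of Loki.
# Loki itself only listens on localhost; names and paths below are
# placeholders for this sketch.
server {
    listen 443 ssl;
    server_name loki.tordo.example;

    ssl_certificate     /etc/ssl/certs/loki.pem;
    ssl_certificate_key /etc/ssl/private/loki.key;

    auth_basic           "Loki";
    auth_basic_user_file /etc/nginx/loki.htpasswd;

    location / {
        proxy_pass http://127.0.0.1:3100;
    }
}
```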
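Along the same lines, a minimal single-binary Loki configuration matching the storage and retention notes (TSDB index, filesystem object store, retention enabled) might look roughly like this; the paths, dates, and retention period are assumptions for the sketch, not necessarily what ends up on tordo:

```
# Sketch of a Loki config; values are illustrative only.
auth_enabled: false                     # auth is left to the reverse proxy

server:
  http_listen_port: 3100

common:
  path_prefix: /var/lib/loki            # persistent location instead of /tmp/loki
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory
  storage:
    filesystem:
      chunks_directory: /var/lib/loki/chunks
      rules_directory: /var/lib/loki/rules

schema_config:
  configs:
    # Multiple entries can coexist, each applying from its "from" date,
    # which is what allows changing the index format or storage later on.
    - from: 2024-08-01
      store: tsdb                       # recommended index; BoltDB is deprecated
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

limits_config:
  retention_period: 744h                # ~31 days; keep below any object store lifecycle rules

compactor:
  working_directory: /var/lib/loki/compactor
  retention_enabled: true               # old data isn't deleted unless this is on
  delete_request_store: filesystem
```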
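For Alloy, here is a sketch of a journal ingestion pipeline along the lines of the last bullet, keeping the systemd unit as a label so lines can be filtered per service; the component names and Loki URL are placeholders:

```
// Relabel rules applied at the journal source: keep the systemd unit
// name as a "unit" label so each log line can be filtered by service.
loki.relabel "journal" {
  forward_to = []

  rule {
    source_labels = ["__journal__systemd_unit"]
    target_label  = "unit"
  }
}

// Read everything from the systemd journal and apply the rules above.
loki.source.journal "system" {
  forward_to    = [loki.write.default.receiver]
  relabel_rules = loki.relabel.journal.rules
  labels        = {job = "systemd-journal"}
}

// Push the resulting log streams to the local Loki instance.
loki.write "default" {
  endpoint {
    url = "http://127.0.0.1:3100/loki/api/v1/push"
  }
}
```

With something like this in place, logs can then be queried in Grafana by unit, e.g. {unit="nginx.service"}.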
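Finally, for the "configure Grafana" part of the description, pointing Grafana at Loki can be done with a data source provisioning file like the one below (assuming file-based provisioning; the same thing can be set up through the UI):

```
# Sketch of a Grafana data source provisioning file for Loki.
apiVersion: 1

datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://127.0.0.1:3100
```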
- Status changed from In Progress to Resolved
Prototype is ready to be demoed. Marking this as resolved.