Bug #12720

closed

systemd unit files should be compatible with older systemd versions

Added by Tom Clegg over 6 years ago. Updated over 6 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: -
Target version: -
Story points: -

Description

Our systemd unit files say this:

# systemd<230
StartLimitInterval=0
# systemd>=230
StartLimitIntervalSec=0

This works on systemd≥230 (it ignores StartLimitInterval) but fails on systemd<230.

These systemd files get installed automatically by our packages, so anyone using an older systemd needs to un-break them after each package install/upgrade.

Propose using StartLimitBurst instead:

StartLimitBurst=12

This option didn't get renamed like StartLimitInterval[Sec] did, so it should work on all systems. We already specify RestartSec=1 and the default configuration is StartLimitInterval[Sec]=10 (unless the OS vendor or sysadmin has changed it of course), so StartLimitBurst=12 should prevent systemd from reaching the "stay down until manual intervention" state.
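A minimal sketch of what this proposal might look like in a unit file. The section placement and the Restart=always line are assumptions for illustration, not taken from the actual unit files; a later comment in this thread notes that StartLimitBurst moved between sections across systemd versions.

```ini
# Hypothetical fragment, not the project's actual unit file.
[Service]
# Assumed restart policy; only RestartSec=1 is stated in the description.
Restart=always
RestartSec=1

# With a 1-second delay before each restart, roughly 10 starts fit into
# the default 10-second start limit interval, so a burst limit of 12
# should not be reached under default settings.
StartLimitBurst=12
```

This relies on the default interval staying at 10 seconds; a shorter RestartSec or a longer interval set by the OS vendor or sysadmin would change the arithmetic.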

A different (less trivial) solution would be for all services to implement "pause and retry if dependencies fail" logic internally, rather than relying on systemd to supervise them. This way systemd would report the services as "alive" the whole time, and the sysadmin could no longer rely on generic systemd-based tools for logging/alerting about restarts, fwiw.

Actions #1

Updated by Nico César over 6 years ago

We should be able to work on the supported distros with the systemd versions they ship.

add table here

Actions #2

Updated by Tom Clegg over 6 years ago

os:version     systemd version    StartLimitInterval?  StartLimitIntervalSec?  StartLimitBurst?  according to
centos:7       219-42.el7_4.4     yes                  -                       yes               man systemd.service
debian:8       215-17+deb8u7      yes                  -                       yes               man systemd.service
debian:9       232-25+deb9u1      -                    yes                     yes               man systemd.unit
ubuntu:trusty  204-5ubuntu20.25   yes                  -                       yes               man systemd.service
ubuntu:xenial  229-4ubuntu21      yes                  -                       yes               man systemd.unit

Actions #3

Updated by Tom Clegg over 6 years ago

  • Status changed from New to In Progress
  • Assigned To set to Tom Clegg

The problem wasn't that the wrong spelling caused systemd to reject the service. It was just that sufficiently old versions were ignoring both spellings -- there is an even older spelling!

The solution is to include three different spellings:
  • Service → StartLimitInterval (versions up to 219, incl. ubuntu:trusty)
  • Unit → StartLimitInterval (version 229, incl. ubuntu:xenial)
  • Unit → StartLimitIntervalSec (version 230+)

(StartLimitBurst also moved from Service to Unit so it would merely create a differently messy service file, and StartLimitInterval=0 is the thing we really want, so I stuck with that.)
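The three spellings listed above could sit together in one unit file roughly as follows. This is a sketch under stated assumptions, not the content of the referenced commit: the description, binary path, After= target, and Restart=always line are placeholders. It relies on the behavior described in this thread, where a systemd version that does not recognize a spelling ignores it.

```ini
# Hypothetical unit file; names and paths are placeholders.
[Unit]
Description=Example service (placeholder)
After=network.target

# systemd 229 (e.g. ubuntu:xenial): read from the [Unit] section
StartLimitInterval=0

# systemd 230 and later (e.g. debian:9): renamed, still in [Unit]
StartLimitIntervalSec=0

[Service]
ExecStart=/usr/bin/example-service
# Assumed restart policy; the thread only states RestartSec=1.
Restart=always
RestartSec=1

# systemd up to 219 (e.g. centos:7, debian:8, ubuntu:trusty): read from [Service]
StartLimitInterval=0

[Install]
WantedBy=multi-user.target
```

Each systemd version should pick up the one spelling it understands. The others may show up as unknown-setting warnings in the log, which is the "messy" cost mentioned above.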

12720-systemd-compat @ 888b9bc108ae4297e35e6741904ac37ac68b2259

Actions #4

Updated by Tom Clegg over 6 years ago

12720-systemd-compat @ 4b9a74f8ce269ebd19b8cfa77c7ebb74df125429 (rebased & applied fix to new arvados-health unit)

Actions #5

Updated by Ward Vandewege over 6 years ago

Tom Clegg wrote:

12720-systemd-compat @ 4b9a74f8ce269ebd19b8cfa77c7ebb74df125429 (rebased & applied fix to new arvados-health unit)

LGTM, please merge.

Actions #6

Updated by Anonymous over 6 years ago

  • % Done changed from 0 to 100
  • Status changed from In Progress to Resolved