Bug #7228
Updated by Brett Smith over 9 years ago
h2. Problem

We have seen crunch-dispatch break API server as follows:

* Run as root, as described in docs
* Call some part of the API server's code base that uses Rails.cache
* Create files and directories in @{Rails.root}/tmp/cache@ with owner=root and permissions that prohibit www-data from writing

After this has happened, API server (running as www-data) crashes when trying to update cached data.

This isn't very common because API server usually creates/updates a given cache item before crunch-dispatch does. But when it does happen, it's bad: for example, new groups can't be created because the group cache can't be updated.

The condition can be fixed temporarily by running @arvados-api-server-upgrade.sh@ (it does @chown -R@ on the tmp dir, among other things). However, this doesn't prevent it from happening again.

The real solution is #5162: refactor crunch-dispatch as an API client so it can't touch the API server Rails project at all.

h2. Ideas

In the meantime, there might be an effective workaround, like running crunch-dispatch with umask=002 and the same GID as the API server process.

(Running crunch-dispatch with the same _UID_ as the API server process would fix the cache permission issue, but at the cost of introducing other problems: crunch-dispatch needs to use sudo, and giving www-data passwordless sudo undermines the security benefit of running the web service as non-root in the first place.)

h2. Immediate fix

* @arvados-api-server-upgrade.sh@ already makes $WWW_OWNER the owner of @tmp/@ recursively. Extend it to chmod @tmp/cache/@ 2775.
* Extend crunch-dispatch to run with a 002 umask. The only other file it opens is its own lockfile, and it sets a specific 0644 mode for that, so this should only affect Rails cache files.
* Test using the procedure in note-5.
* Make sure the arvados-dev branch gets merged before the arvados branch, so we build a new package that includes both the new upgrade script and the new crunch-dispatch.
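
For anyone who wants to see why the 2775 directory mode and the 002 umask work together, here is a minimal Python sketch of the mechanism. It is not crunch-dispatch or the upgrade script: the function names @make_shared_cache_dir@ and @write_cache_file@ are hypothetical, and it uses a throwaway temporary directory instead of the real @tmp/cache@. With a 002 umask, new files come out group-writable (0664); with the common 022 default they come out 0644, which is what locks out a second user who only shares the group.

```python
import os
import stat
import tempfile


def make_shared_cache_dir(path):
    """Create a cache directory and request mode 2775 (setgid, group-writable).

    Hypothetical helper. The explicit chmod is needed because the mode
    passed to mkdir is filtered by the process umask.
    """
    os.makedirs(path, exist_ok=True)
    # Setgid on a directory makes new entries inherit the directory's group.
    # Some systems silently drop the bit if the caller is not in that group.
    os.chmod(path, 0o2775)


def write_cache_file(path, data, umask=0o002):
    """Create a file under the given umask and return its permission bits."""
    old_umask = os.umask(umask)
    try:
        # open() asks for 0666; the umask removes bits from that request.
        with open(path, "w") as f:
            f.write(data)
    finally:
        os.umask(old_umask)  # always restore the previous umask
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        cache = os.path.join(root, "tmp", "cache")
        make_shared_cache_dir(cache)
        shared = write_cache_file(os.path.join(cache, "shared"), "x", umask=0o002)
        private = write_cache_file(os.path.join(cache, "private"), "x", umask=0o022)
        print("umask 002:", oct(shared))
        print("umask 022:", oct(private))
```

The umask only matters when a file is created, so cache entries that already exist with owner=root still need the recursive chown from the upgrade script.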