Bug #16425

[keepstore] should not scan directories it doesn't write in

Added by Ward Vandewege 21 days ago. Updated 13 days ago.

Status: New
Priority: Normal
Assigned To: -
Category: -
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Story points: -

Description

Keepstore walks all directories in its mount points, even those it never writes to. This can cause keepstore to do a large amount of unnecessary I/O, e.g. when .snapshot directories are present because GPFS makes daily snapshots.

Adding an option to exclude certain directories from keepstore's scans would be a good solution.

History

#1 Updated by Ward Vandewege 21 days ago

  • Description updated (diff)

#2 Updated by Ward Vandewege 21 days ago

  • Description updated (diff)

#3 Updated by Tom Clegg 13 days ago

When indexing, it looks like we're already skipping dirs that don't conform to keepstore's storage layout: source:services/keepstore/unix_volume.go#L355

var blockDirRe = regexp.MustCompile(`^[0-9a-f]+$`)
var blockFileRe = regexp.MustCompile(`^[0-9a-f]{32}$`)
...
                if !blockDirRe.MatchString(names[0]) {
                        continue
                }
                ...
                blockdir, err := v.os.Open(blockdirpath)

However, the "empty trash" goroutine does walk the entire tree:

        err := filepath.Walk(v.Root, func(path string, info os.FileInfo, err error) error {
                if err != nil {
                        v.logger.WithError(err).Errorf("EmptyTrash: filepath.Walk(%q) failed", path)
                        return nil
                }
                todo <- dirent{path, info}
                return nil
        })

That walk func should check directory names and return filepath.SkipDir for directories that don't conform to keepstore's storage layout.
