perf: avoid eager fingerprint variable evaluation #2883
Draft
Napolitain wants to merge 5 commits into
Add Issue go-task#2853 benchmarks comparing checksum, timestamp, and uncached tasks across many-small and few-large sparse YAML source sets.

Baseline on Intel i7-14700K:
go test -run '^$' -bench 'BenchmarkIssue2853.*SparseYAMLFiles' -benchtime=3x -count=3 ./

Many small sparse YAML files (20,000 x 5 bytes): checksum 440-451 ms/op, timestamp 140-148 ms/op, none 1.1-1.3 ms/op.
Few large sparse YAML files (4 x 128 MiB): checksum 60-61 ms/op, timestamp 213-239 us/op, none 1.1-1.3 ms/op.

Sparse files avoid bulk data writes while preserving logical file size for checksum/timestamp comparisons.
Add an OS-native mtime reference point for the Issue go-task#2853 filesystem benchmarks. The reference walks the same sparse YAML source tree with filepath.WalkDir, stats YAML files through DirEntry.Info, and compares mtimes against a generated output file. The benchmark is available under the fsbench build tag alongside the Task checksum, timestamp, and uncached cases.
Rename the fsbench benchmark entry points from Issue-2853-specific names to BenchmarkManySmallFiles and BenchmarkFewLargeFiles. The benchmark output is now easier to scan while the PR and commit history still carry the issue context. Helper and constant names were updated to match; benchmark behavior is unchanged.
Only compute CHECKSUM or TIMESTAMP template variables when the raw task references the corresponding variable in commands, deps, preconditions, or status. This keeps the up-to-date source check unchanged while avoiding a duplicate fingerprint pass for tasks that only use sources/generates caching.

Before this branch, the many-small benchmark measured about 420 ms/op for checksum and 137 ms/op for timestamp with 20,000 tiny files. After this change, repeated local runs measured about 144-148 ms/op for checksum and 46-48 ms/op for timestamp, with allocations dropping from about 742k to 248k for checksum and from about 562k to 188k for timestamp.

Verification:
go test ./...
go test -tags fsbench -run '^$' -bench 'BenchmarkManySmallFiles/(checksum|timestamp)$' -benchtime=5x -count=3 -benchmem ./
This was referenced Jun 18, 2026
This is a stacked optimization PR on top of the filesystem benchmark work in #2881. Please review or merge #2881 first; this branch intentionally depends on those benchmarks so that the performance impact can be evaluated with the same many-small and few-large fixtures. Stacking also makes it easy to revert a commit if needed without affecting the benchmarks, which are harmless on their own.
This PR avoids duplicate fingerprint evaluation during task compilation.
(I think that if I set the base to my own branch, I can't switch it back to origin.)