perf: avoid eager fingerprint variable evaluation #2883

Draft
Napolitain wants to merge 5 commits into go-task:main from Napolitain:issue-2853-lazy-fingerprint-vars

@Napolitain Napolitain commented Jun 18, 2026

This is a stacked optimization PR on top of the filesystem benchmark work in #2881. Please review or merge #2881 first: this branch intentionally depends on those benchmarks so that the performance impact can be evaluated with the same many-small and few-large fixtures. Stacking also makes it easy to revert the optimization commit if needed without touching the benchmarks, which are harmless on their own.

This PR avoids duplicate fingerprint evaluation during task compilation.

(I think that if I set the base to my own branch, I can't switch it back to origin afterwards.)

BenchmarkManySmallFiles/checksum-28                    3         139978116 ns/op           0.71 MB/s             0.09537 source_MiB/op     20000 source_files/op      679207408 B/op    248202 allocs/op
BenchmarkManySmallFiles/timestamp-28                   3          42650460 ns/op           2.34 MB/s             0.09537 source_MiB/op     20000 source_files/op      25159722 B/op     188222 allocs/op
BenchmarkManySmallFiles/native-mtime-28                3          18077746 ns/op           5.53 MB/s             0.09537 source_MiB/op     20000 source_files/op      11558392 B/op     123036 allocs/op
BenchmarkManySmallFiles/none-28                        3            701787 ns/op         2598624 B/op       3193 allocs/op

BenchmarkFewLargeFiles/checksum-28                     3          19877230 ns/op        27009.34 MB/s          512.0 source_MiB/op            4.000 source_files/op     517832 B/op       1717 allocs/op
BenchmarkFewLargeFiles/timestamp-28                    3            147128 ns/op        3648997.44 MB/s        512.0 source_MiB/op            4.000 source_files/op     261288 B/op       1730 allocs/op
BenchmarkFewLargeFiles/native-mtime-28                 3             17511 ns/op        30659066.42 MB/s               512.0 source_MiB/op             4.000 source_files/op      4701 B/op         44 allocs/op
BenchmarkFewLargeFiles/none-28                         3            967514 ns/op         2599981 B/op       3193 allocs/op
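For readers unfamiliar with the extra columns above: in Go benchmarks, an MB/s column appears once `b.SetBytes` is called, and custom units such as `source_files/op` are reported through `b.ReportMetric`. The sketch below is an illustration only. The function name `benchmarkWithMetrics` and the empty loop body are assumptions, not the code from this branch; the constants mirror the many-small fixture (20,000 files of 5 bytes).

```go
package main

import (
	"fmt"
	"testing"
)

// benchmarkWithMetrics is a hypothetical benchmark body showing how the
// MB/s, source_MiB/op and source_files/op columns can be produced.
func benchmarkWithMetrics(b *testing.B) {
	const files = 20000
	const bytesPerFile = 5

	b.SetBytes(files * bytesPerFile) // enables the MB/s column
	b.ReportAllocs()                 // enables B/op and allocs/op

	for i := 0; i < b.N; i++ {
		// The work under test (for example a fingerprint check) would go here.
	}

	// Custom metrics: the unit string becomes the column suffix.
	b.ReportMetric(float64(files), "source_files/op")
	b.ReportMetric(float64(files*bytesPerFile)/(1<<20), "source_MiB/op")
}

func main() {
	// testing.Init registers the testing flags when running outside "go test".
	testing.Init()
	res := testing.Benchmark(benchmarkWithMetrics)
	fmt.Println(res.Bytes, res.Extra["source_files/op"], res.Extra["source_MiB/op"])
}
```

With these constants the reported size is 100,000 bytes per iteration, which is where the 0.09537 source_MiB/op figure in the many-small rows comes from.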

Add Issue go-task#2853 benchmarks comparing checksum, timestamp, and uncached tasks across many-small and few-large sparse YAML source sets.

Baseline on Intel i7-14700K, go test -run '^$' -bench 'BenchmarkIssue2853.*SparseYAMLFiles' -benchtime=3x -count=3 ./

Many small sparse YAML files (20,000 x 5 bytes): checksum 440-451 ms/op, timestamp 140-148 ms/op, none 1.1-1.3 ms/op.

Few large sparse YAML files (4 x 128 MiB): checksum 60-61 ms/op, timestamp 213-239 us/op, none 1.1-1.3 ms/op.

Sparse files avoid bulk data writes while preserving logical file size for checksum/timestamp comparisons.

Add an OS-native mtime reference point for the Issue go-task#2853 filesystem benchmarks. The reference walks the same sparse YAML source tree with filepath.WalkDir, stats YAML files through DirEntry.Info, and compares mtimes against a generated output file.

The benchmark is available under the fsbench build tag alongside the Task checksum, timestamp, and uncached cases.

Rename the fsbench benchmark entry points from Issue-2853-specific names to BenchmarkManySmallFiles and BenchmarkFewLargeFiles.

The benchmark output is now easier to scan while the PR and commit history still carry the issue context. Helper and constant names were updated to match; benchmark behavior is unchanged.

Only compute CHECKSUM or TIMESTAMP template variables when the raw task references the corresponding variable in commands, deps, preconditions, or status. This keeps the up-to-date source check unchanged while avoiding a duplicate fingerprint pass for tasks that only use sources/generates caching.
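
To make the idea concrete, here is a minimal sketch of computing a fingerprint variable only when the raw task text references it. Everything in it is hypothetical: `rawTask`, `references`, and `fingerprintVars` are illustrative names, the struct is a simplified stand-in for Task's real types, and a regular-expression scan of the raw strings is just one possible way to detect a reference.

```go
package main

import (
	"fmt"
	"regexp"
)

// rawTask is a hypothetical, simplified stand-in for an uncompiled task.
type rawTask struct {
	Cmds, Deps, Preconditions, Status []string
}

var (
	checksumRef  = regexp.MustCompile(`\.CHECKSUM\b`)
	timestampRef = regexp.MustCompile(`\.TIMESTAMP\b`)
)

// references reports whether any raw template string in the task matches re.
func references(t rawTask, re *regexp.Regexp) bool {
	for _, group := range [][]string{t.Cmds, t.Deps, t.Preconditions, t.Status} {
		for _, s := range group {
			if re.MatchString(s) {
				return true
			}
		}
	}
	return false
}

// fingerprintVars calls the (potentially expensive) checksum and timestamp
// functions only for variables that the task actually references.
func fingerprintVars(t rawTask, checksum, timestamp func() string) map[string]string {
	vars := map[string]string{}
	if references(t, checksumRef) {
		vars["CHECKSUM"] = checksum()
	}
	if references(t, timestampRef) {
		vars["TIMESTAMP"] = timestamp()
	}
	return vars
}

func main() {
	calls := 0
	expensive := func() string { calls++; return "abc123" }

	// A task that only relies on sources/generates caching: nothing is computed.
	cached := rawTask{Cmds: []string{"go build ./..."}}
	vars := fingerprintVars(cached, expensive, expensive)
	fmt.Println(len(vars), calls)

	// A task whose status check uses the checksum: only CHECKSUM is computed.
	withStatus := rawTask{Status: []string{`test "{{.CHECKSUM}}" = "$(cat .sum)"`}}
	vars = fingerprintVars(withStatus, expensive, expensive)
	fmt.Println(vars["CHECKSUM"], calls)
}
```

A text scan like this can over-match (for example inside a comment), which only costs an unnecessary computation; the case to guard against is under-matching, where a template would see an empty variable.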

Before this branch, the many-small benchmark measured about 420 ms/op for checksum and 137 ms/op for timestamp across 20,000 tiny files. After this change, repeated local runs measured about 144-148 ms/op for checksum and 46-48 ms/op for timestamp, with allocations dropping from about 742k to 248k for checksum and from about 562k to 188k for timestamp.

Verification: go test ./...; go test -tags fsbench -run '^$' -bench 'BenchmarkManySmallFiles/(checksum|timestamp)$' -benchtime=5x -count=3 -benchmem ./