🐛 Serve current snapshot when fetching by timestamp after latest op #709

Merged
alecgibson merged 1 commit into master from fix/fetch-snapshot-by-timestamp-missing-ops
Jun 23, 2026

Conversation

@alecgibson

Collaborator

At the moment, `fetchSnapshotByTimestamp` only fetches the current snapshot directly when the requested timestamp is `null`. For any other timestamp it rebuilds the snapshot from the milestone snapshot plus ops.

This means that if the requested timestamp is after the document's latest op (i.e. after the current snapshot's `mtime`), and the ops have since been deleted or TTLed away, the fetch fails with a "Missing ops" error - even though the current snapshot is intact and is exactly the snapshot we should be serving.

This change always fetches the current snapshot first, and serves it directly when the requested timestamp is after its `mtime`. As well as fixing the missing-ops case, this avoids replaying ops whenever the timestamp is newer than the current version, at the cost of one extra snapshot lookup in the cases that do still need ops.
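
The control flow described above can be sketched with an in-memory stand-in. This is a minimal illustration under stated assumptions, not ShareDB's actual code: `fakeDb`, `rebuildFromOps`, and `fetchByTimestamp` are hypothetical names, ops are modelled as simple "replace the data" records, and the op store is deliberately empty to mimic ops that have been TTLed away.

```javascript
// Hypothetical in-memory stand-in for a database adapter (not ShareDB's API).
var fakeDb = {
  // Current snapshot; m.mtime is the time of the latest op.
  current: {id: 'doc1', v: 3, data: {title: 'Hello'}, m: {mtime: 1000}},
  // The ops have been deleted or TTLed away.
  ops: []
};

// Replay path: rebuild the snapshot from ops, failing if any are missing.
function rebuildFromOps(db, timestamp, callback) {
  if (db.ops.length < db.current.v) return callback(new Error('Missing ops'));
  var snapshot = {id: db.current.id, v: 0, data: undefined, m: null};
  db.ops.forEach(function(op) {
    if (op.ts > timestamp) return;
    // Toy op model: each op simply replaces the data.
    snapshot.v = op.v + 1;
    snapshot.data = op.data;
  });
  callback(null, snapshot);
}

// Path sketched from the description: look at the current snapshot first.
function fetchByTimestamp(db, timestamp, callback) {
  var current = db.current;
  var mtime = current.m && current.m.mtime;
  if (timestamp === null || (mtime != null && timestamp > mtime)) {
    // Return a copy without metadata so the stored snapshot is not mutated.
    return callback(null, {id: current.id, v: current.v, data: current.data, m: null});
  }
  rebuildFromOps(db, timestamp, callback);
}

// Timestamp after the latest op: served from the current snapshot.
fetchByTimestamp(fakeDb, 2000, function(error, snapshot) {
  console.log(error, snapshot.v);
});

// Timestamp before the latest op: still needs ops, so it still fails here.
fetchByTimestamp(fakeDb, 500, function(error) {
  console.log(error.message);
});
```

In this sketch, only the second request reaches the replay path, which is the case the change does not (and cannot) fix once the ops are gone.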

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

@alecgibson alecgibson marked this pull request as ready for review June 19, 2026 20:07
@alecgibson

Collaborator Author

Build will be fixed in share/sharedb-mongo#171

Comment thread lib/backend.js
Comment on lines +892 to +899
var mtime = currentSnapshot.m && currentSnapshot.m.mtime;
var shouldGetLatestSnapshot = timestamp === null || (mtime != null && timestamp > mtime);
if (shouldGetLatestSnapshot) {
  // Strip the metadata that we only fetched in order to compare the mtime,
  // so that the returned snapshot is consistent with the op-replayed path.
  currentSnapshot.m = null;
  return callback(null, currentSnapshot);
}
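
To make the edge cases of the condition above easier to see, here is a small standalone sketch that isolates it into a function. The name `shouldServeCurrent` and the sample values are hypothetical; only the boolean expression is taken from the excerpt.

```javascript
// Hypothetical helper wrapping the condition from the excerpt above.
// `meta` stands in for currentSnapshot.m.
function shouldServeCurrent(timestamp, meta) {
  var mtime = meta && meta.mtime;
  return timestamp === null || (mtime != null && timestamp > mtime);
}

console.log(shouldServeCurrent(null, {mtime: 1000})); // true: no timestamp requested
console.log(shouldServeCurrent(1500, {mtime: 1000})); // true: after the latest op
console.log(shouldServeCurrent(1000, {mtime: 1000})); // false: not strictly after
console.log(shouldServeCurrent(1500, null));          // false: no metadata to compare
console.log(shouldServeCurrent(1500, {}));            // false: mtime is undefined
```

The `mtime != null` guard means a snapshot without a recorded `mtime` falls through to the op-replay path rather than being served as current.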

Contributor

Could this be a performance hit? It looks like we now do an extra round trip (getSnapshot for the current version) on every timestamp request, including the cases where we still end up replaying ops.

I guess it depends on what the expected access pattern is. My intuition (numbers pulled from a hat 🎩) is that ~90% of timestamp requests are for an older point in time, where we'll want to fetch and replay the older ops anyway — so for those we now pay for the current-snapshot fetch on top of the op replay, with no benefit.

That said, I don't have a better alternative for the case this is solving (rebuilding when older ops have been TTLed), so this might just be the necessary trade-off. Mostly flagging it to check the assumption — do we have a sense of how often the requested timestamp is actually after the latest snapshot?
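
The cost being discussed can be put into rough numbers with a toy model. This is only a sketch under loud assumptions: it counts the current-snapshot fetch, the milestone-snapshot fetch, and the op fetch as one lookup each, it uses the "90% older" split purely as the guess quoted above, and `countLookups` is a hypothetical name. It also ignores that, before the change, the request after the latest op could fail outright if the ops were gone.

```javascript
// Hypothetical cost model: number of database lookups for one request.
function countLookups(timestamp, mtime, fetchCurrentFirst) {
  var servesCurrent = timestamp === null || (mtime != null && timestamp > mtime);
  if (!fetchCurrentFirst) {
    // Before the change: only a null timestamp reads the current snapshot;
    // everything else reads the milestone snapshot plus ops.
    return timestamp === null ? 1 : 2;
  }
  // After the change: always read the current snapshot, then the
  // milestone snapshot plus ops only when they are still needed.
  return servesCurrent ? 1 : 3;
}

var mtime = 1000;
// Nine requests before the latest op, one after it.
var requests = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1500];

function total(fetchCurrentFirst) {
  return requests.reduce(function(sum, timestamp) {
    return sum + countLookups(timestamp, mtime, fetchCurrentFirst);
  }, 0);
}

console.log(total(false)); // 20
console.log(total(true)); // 28
```

Under these assumptions the change adds one lookup to each request that still replays ops and saves one on each request that no longer does, so the net effect depends entirely on the real access pattern.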

Collaborator Author

Yes, this is the trade-off we're making. I think it's basically impossible to get numbers on this, since we have no idea how other consumers are using this.

My feeling is that because this is a historic snapshot fetch:

  1. You probably don't care too much about speed (fetching arbitrary numbers of ops and replaying them is already quite a slow path)
  2. Current snapshot fetch should be pretty optimized, and we already do it on every op submission

If we wanted to be super conservative about this change, I guess I could hide it behind an opt-in flag that would leave existing performance untouched, but allow users to fetch snapshots in projects where ops are TTLed. We've done that in the past with sharedb-mongo and strict op linking. The downside of this approach is that ShareDB won't work quite as smoothly out of the box, and consumers will have to rummage through documentation to find this flag, which doesn't feel like great developer experience to me.

Collaborator Author

We discussed this over a call and we'll go ahead and release without a flag: this is a bugfix and performance may get better or worse depending on use case.

If any consumers find this impacts them badly, please raise an issue describing your use case and we can add a flag for this (or improve it in some other way).

@alecgibson alecgibson requested a review from dawidreedsy June 23, 2026 08:09
@coveralls


Coverage Status

coverage: 97.467% (+0.002%) from 97.465% — fix/fetch-snapshot-by-timestamp-missing-ops into master

@alecgibson alecgibson merged commit 1066804 into master Jun 23, 2026
10 of 16 checks passed
@alecgibson alecgibson deleted the fix/fetch-snapshot-by-timestamp-missing-ops branch June 23, 2026 08:21
3 participants