improvement(mothership): user_table speed parity — limit bounds, keyset paging, background import/delete jobs#5012
Conversation
… + tenant-scoped query performance (#4915) * feat(tables): paginated background row-delete jobs via table_jobs * fix(tables): address review on async row-delete (filtered count, scoped optimistic clear, Cmd+A select-all, hide delete from tray) * improvement(tables): filter-aware select-all runs, delete-job read mask, keyset index + autovacuum tuning * feat(tables): run import/delete/export/backfill jobs on trigger.dev with in-process fallback * improvement(tables): raise delete page to 10k and export batch to 5k * improvement(tables): raise CSV import batch to 5k rows (param-cap bounded) * feat(tables): surface export jobs in the header tray with progress, cancel, and download * improvement(tables): surface exports as derived tables-scoped toasts instead of the import tray * Revert "improvement(tables): surface exports as derived tables-scoped toasts instead of the import tray" This reverts commit 1ea5871. * fix(tables): preserve export storage key (NoSuchKey) and unify jobs in one spinner tray * improvement(tables): jobs tray icon reflects aggregate state (spinner/check/alert) * fix(tables): restore jobs tray on the tables list (dropped in staging merge) * improvement(tables): keyset-paginate export row reads (offset paging was O(n^2) over large tables) * perf(tables): keyset pagination for grid infinite scroll Default-order row pages now cursor on (order_key, id) instead of OFFSET — each page is an index seek on tableOrderKeyIdx, where OFFSET re-scans and discards every prior row (O(N²) across deep scrolls and full drains like select-all/export-to-clipboard). Sorted views keep offset paging; the contract refines after+sort as mutually exclusive. v1 public rows API is unchanged (extends the unrefined base, omits after). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(tables): show export in job tray immediately on kickoff The export-jobs query's poll only self-sustains once a running job is already in the cache, so a freshly kicked export stayed invisible until an SSE event or page refresh. Invalidate the tray query on kickoff success so the icon appears right away. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(tables): surface real row/column write errors in toasts Drizzle wraps DB errors in DrizzleQueryError whose message is the failed SQL — the real cause (e.g. the row-limit trigger's RAISE) sits on .cause, so the routes' substring classification never matched and everything fell through to generic 500s ("Failed to insert row"). Add rootErrorMessage (cause-chain unwrap) and a shared rowWriteErrorResponse classifier that consolidates the per-route pattern lists and rewrites the trigger message into a friendly "Row limit exceeded — capped at N rows". Applied across the app and v1 row-write routes and the columns route. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * perf(tables): tenant-bound filtered row counts (12.7s -> 0.6s) JSONB filter predicates (->> ILIKE / range casts) are opaque to the planner: it estimates a handful of matches and picks a parallel seq scan over the entire shared user_table_rows relation — every tenant's rows — for the page-0 COUNT(*), so any non-equality filter on a large table cost 10s+ regardless of how few rows matched. Run filtered counts in a transaction with SET LOCAL enable_seqscan = off, forcing the tenant-bounded bitmap plan. Unfiltered counts keep their index-only scan. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * perf(tables): tenant-bound Cmd+F search and stream its window (75s -> 2s) Same planner trap as the filtered count, compounded: the lateral jsonb_each_text ILIKE is unestimatable, so findRowMatches on a 1M-row table seq-scanned the whole 12M-row shared relation and disk-sorted ~120MB of window input (75s measured). SET LOCAL enable_seqscan=off bounds the scan to the tenant; on the default order, additionally penalizing bitmap/sort/parallel steers the planner onto the already- sorted (table_id, order_key, id) index walk so row_number() streams with no sort at all (2s measured). Flags only penalize plan shapes — a custom sort still sorts. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * perf(tables): tenant-bound sorted pages and filtered write selections Extends the seqscan fix to every remaining jsonb-predicate path, all measured on a 1M-row table in a 12M-row shared relation: - sorted page query (ORDER BY data->>'col'): 9.7s/page -> 0.76s, and deep pages stop spilling ~130MB sorts to disk - updateRowsByFilter / deleteRowsByFilter row selection: 14.4s -> bounded - delete-job worker selectRowIdPage with a filter: 12.6s/page -> bounded - dispatcher filtered-scope window walk: same shape, same fix Shared withSeqscanOff helper moves to lib/table/planner.ts (service + dispatcher both consume it; dispatcher can't import service). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * perf(tables): tenant-scoped containment index (migration 0232) The plain GIN on user_table_rows.data matched @> candidates across every tenant sharing the relation — a hot value in someone else's table inflated everyone's equality filters (1.07M candidates fetched for a 33k-row match, lossy bitmap, 1.1s). Replace it with btree_gin (table_id, data jsonb_path_ops): the tenant intersection happens inside the index and paths are single hashed entries. Rare-equality probe 326ms -> 17ms with zero wasted candidates; unique-constraint checks and upsert conflict lookups ride the same index. The new index is smaller than the one it replaces (529MB vs 694MB on the 12M-row dev relation). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * perf(tables): tenant-bound unique-constraint checks (3.5s -> <1s per write) The unique check runs lower(data->>'col') = $1 LIMIT 1 on every insert and cell edit touching a unique column. The predicate is unestimatable and a unique (non-conflicting) value never exits early, so the planner seq-scanned all 12.3M shared-relation rows per check — 3.5s measured. Tenant-bound both the single and batch variants; the batch path sets the flag on the caller's transaction when one is supplied (SET LOCAL dies at its commit, and the statements that follow are tenant-scoped writes). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * perf(tables): tenant-bound upsert conflict lookup Same unestimatable data->>key predicate as the unique checks; an insert-path upsert has no existing match so the lookup can't exit early and seq-scans the whole shared relation. The upsert already runs in a transaction — set the planner flag on it. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(tables): consolidate executor types onto planner exports service.ts kept a private DbTransaction alias and two inline typeof db | DbTransaction unions after planner.ts began exporting the canonical DbTransaction/DbExecutor — import those instead. From the /simplify review of the perf series; no behavior change. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(tests): drop narrow schema mock override in process-contents test The local vi.mock('@sim/db/schema') stubbed only document/knowledgeBase, but the file's import graph reaches lib/table/service whose module scope now references tableJobs. The global schema mock already covers all of it — rely on it per the testing rules instead of re-mocking. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(tables): scope cancels and counts to the filtered selection (review) Addresses the open Bugbot/Greptile findings on filtered select-all: - Filtered runs no longer cancel the whole table: cancelWorkflowGroupRuns takes a filter — it stops only dispatches with that exact filter scope and only in-flight cells on matching rows (semi-join); whole-table and differently-scoped dispatches keep running, their cancelled cells skipped via cancelledAt > requestedAt. - Stop on a filtered select-all sends the filter through cancel-runs (contract + route + mutation) instead of a table-wide cancel. - runColumnBodySchema rejects rowIds + filter together (mirrors deleteTableRowsBodySchema). - Select-all delete clears the selection in onSuccess, not at click, so a failed kickoff restores both rows and selection. - Clipboard copy/cut estimates use the filter-aware total (rowTotal) instead of the whole-table rowCount. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * chore: retrigger CI (Actions dropped the previous push events) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * chore: bump api-validation route baseline to 807 (staging route + merge) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(tables): release job claim when trigger.dev dispatch fails If tasks.trigger (or its dynamic imports) throws after markTableJobRunning, the ghost running row held the table's one-write-job slot until the stale-job janitor fired (~15-20 min of 409s). All four kickoff routes now release the claim and rethrow; the backfill runner releases and warns (a failed backfill never fails the schema change). Greptile P1. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * chore(db): squash 0232 into 0231 (one migration for the PR) Both are branch-only — no environment has applied them through the migration ledger yet — so the tenant-scoped GIN (btree_gin extension, index swap) folds into 0231_table_jobs_and_keyset. Snapshot chain re-pointed; drizzle-kit generate confirms zero drift. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(tables): context-menu delete label shows the true select-all count Under select-all the context menu counted only the loaded page ("Delete 1000 rows" on a 999k-row table) while the action correctly deletes every matching row via the background job. Delete now gets its own count from the filter-aware total minus deselections; the run-action labels keep the loaded-row count since those actions act on loaded rows only. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(tables): context-menu bulk actions act on the full select-all scope Follow-up to the label fix: under select-all the context menu's Run / Re-run / Stop only acted on the loaded page of rows. They now route through the same scopes as the action bar — runs dispatch by filter (whole table when unfiltered), Stop uses the filter-scoped cancel — and all labels share one true count (filter-aware total minus deselections, locale-formatted). Like the action bar, filter-scoped runs ignore deselections (the run API has no exclusion set). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * feat(tables): exclusion set for select-all runs and stops Select-all minus deselected rows now means exactly that for every bulk action, not just delete. runColumnBodySchema and cancelTableRunsBodySchema accept excludeRowIds (bounded by MAX_EXCLUDE_ROW_IDS, select-all scope only); the dispatch scope persists it and the dispatcher window walk, eager bulk-clear, pre-run cancel, and filter/table-scoped cancel all skip excluded rows. Client threads exclusions from the selection through the action bar and the grid context menu, including the optimistic stamps. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(tables): spare excluded-row dispatches on Stop; no orphan placeholder table Two Bugbot findings on the exclusion work: - Select-all-minus-deselections Stop (no filter) cancelled every active dispatch table-wide, killing row-scoped runs on deselected rows. markActiveDispatchesCancelled now spares dispatches whose scope.rowIds are fully contained in the exclusion set (coalesce(false) keeps table-wide dispatches cancellable). - Create-mode import: a failed trigger.dev dispatch released the job claim but left the just-created placeholder table in the workspace. Archive it on the failure path (no hard-delete surface exists). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(tables): row counts reflect a running delete everywhere A mid-delete refresh resurrected the old counts: the optimistic update stripped cached rows but left page-0 totalCount (footer / select-all label) at the old total, and list/detail counts reported raw row_count including doomed-but-not-yet-deleted rows. - onMutate now sets the active view's totalCount to the kept rows and decrements the cached detail rowCount by the doomed estimate - the kickoff persists that estimate on the job (payload.doomedCount, clamped server-side); getTableById/listTables subtract the not-yet-deleted remainder (doomedCount - rows_processed) while the delete runs, so refetched counts match the read path's delete mask Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * copy(tables): drop background mention from delete confirm Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(tables): clear select-all immediately when a delete kicks off The header checkbox lingered as a minus over the optimistically-emptied grid: rowSelectionCoversAll treats zero rows as not-covered, and the selection clear waited for the kickoff's onSuccess. Clear at click (failed kickoffs visibly restore rows + toast; re-selecting is cheap) and render an empty grid's header checkbox unchecked regardless — a selection over zero rows is vacuous. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(tables): export takes the async path while a delete job runs The sync/async export choice reads rowCount, which is a doomed-estimate- adjusted number during a running delete (and the estimate is client- supplied) — an overstated estimate could route a still-large masked set through the synchronous stream. Mid-delete exports now always run as a job: safe at any size, and exports bypass the one-job-per-table gate. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(build): stop uploads setup from sweeping the project into route graphs next build (Turbopack) failed with "Two or more assets with different content were emitted to the same output path" on the server-root chunk. Root cause: setup.server.ts's unscoped path.resolve(process.cwd()) made node-file-tracing sweep the entire project — next.config.ts included — into every route graph reaching lib/uploads (the files/upload route and, since the export job, the export-async path). Two producers emitted the swept config into same-named chunks; staging's latest commits made their contents diverge and the names collided. Annotate the path derivation with turbopackIgnore per the NFT warning's own remediation — the build passes and all ~390 "unexpected file in NFT list" warnings disappear. Also inline the releaseJobClaim dynamic imports in the kickoff routes to plain static imports — service is already statically imported there. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
…post-index ANALYZE guard (#4997) * fix(tables): per-batch delete-job commits, real trigger.dev retries, post-index ANALYZE guard * fix(tables): resume job progress across retries, rethrow root cause for clean failure messages
… up tables feature (#4995) * improvement(tables): migrate inputs to emcn chip components and clean up tables feature * fix(tables): address review feedback — stale filter column label, shared FieldError in enrichment config * improvement(tables): scope create-table callback to stable mutateAsync
…me/drop prompt `drizzle-kit push --force` only suppresses the data-loss confirm, not the rename-vs-drop disambiguation prompt. That prompt fires whenever a diff both adds and drops tables/columns at once (e.g. migration 0231 created sim_trigger_state while dropping the workspace_notification_* tables), and in CI it crashes with a bare "Interactive prompts require a TTY" stack trace. Catch that specific failure in the dev push step and emit a GitHub error annotation explaining the cause and the fix (drop the stale objects on the dev DB to match schema.ts — the same DROPs the versioned migration already applied to staging/prod), instead of leaving an opaque trace. Exit status is preserved either way. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…nds, background import/delete jobs Mothership's table operations lagged the fast paths the tables feature already has: - function_execute mounted inputTables via queryRows with defaults, silently truncating the sandbox CSV to 100 rows and paying for a count and execution metadata it never used. Mounts now drain the keyset export reader page by page, remap stored column-id keys to display names (headers previously didn't match id-keyed cell data), mount tables in parallel, and count toward the 50MB sandbox mount budget. - user_table query_rows accepted unbounded limits (whole table into one tool result) and only offset paging. It now enforces the contracts' MAX_QUERY_LIMIT, skips execution metadata, accepts the keyset `after` cursor, and returns `nextCursor` when a default-order page fills. - import_file / create_from_file / delete_rows_by_filter mutated without claiming the per-table job slot, racing running background jobs, and ran whole imports inline in the chat request. CSV/TSV imports ≥8MB and unbounded filter-deletes matching >1000 rows now dispatch the same trigger.dev jobs the UI routes use (slot claim, release-on-dispatch- failure, delete mask); inline imports claim and release the slot. A new get_job operation surfaces the table's derived job state for polling. - TableImportPayload grows deleteSourceFile (default true) so Mothership imports of persistent workspace files survive the worker's single-use source cleanup. Generated tool catalog artifacts regenerated from simstudioai/copilot#289. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
|
@greptile review |
|
The latest updates on your projects. Learn more about Vercel for GitHub. |
PR SummaryHigh Risk Overview Bulk workflow APIs accept Grid UX treats select-all as “all rows matching the filter” with an exclusion set, kicks off async delete instead of loading every id, and routes Play/Refresh/Stop/context-menu actions through filter-scoped dispatch when appropriate. Exports above a threshold (or during a running delete) go async; SSE CI: dev Reviewed by Cursor Bugbot for commit 0e47d6e. Bugbot is set up for automated code reviews on this repo. Configure here. |
| sort: validated.sort ? wire.sortIn(validated.sort) : undefined, | ||
| limit: validated.limit, | ||
| offset: validated.offset, | ||
| after: validated.after, |
There was a problem hiding this comment.
HTTP rows missing next cursor
Medium Severity
The GET rows handler now accepts an after keyset cursor and passes it to queryRows, but the JSON response never includes a nextCursor when a full page is returned. REST clients cannot advance past the first default-order page without guessing offsets or reusing Mothership-only tooling.
Reviewed by Cursor Bugbot for commit 7310c06. Configure here.
Greptile SummaryThis PR migrates Mothership's
Confidence Score: 5/5Safe to merge. The ownership gate (partial-unique index + The three core changes — limit enforcement, keyset pagination, and async job dispatch — are all narrowly scoped and well-tested. The null- The keyset cursor logic in Important Files Changed
Sequence Diagram%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
participant M as Mothership Model
participant UT as user-table tool
participant SVC as table/service
participant TJ as table_jobs (DB)
participant W as Background Worker
Note over M,W: Large CSV import via import_file
M->>UT: import_file(fileId, tableId)
UT->>SVC: markTableJobRunning(tableId, importId, 'import')
SVC->>TJ: INSERT job row (partial-unique index gate)
TJ-->>SVC: "claimed=true"
SVC-->>UT: true
UT->>W: dispatchImportJob(payload)
UT-->>M: success + jobId
M->>UT: "query_rows(tableId, after=cursor)"
UT->>SVC: queryRows + pendingDeleteMask check
SVC-->>UT: rows + nextCursor
UT-->>M: rows + nextCursor
W->>SVC: markJobReady(tableId, importId)
Note over M,W: Unbounded filter-delete via delete_rows_by_filter
M->>UT: delete_rows_by_filter(tableId, filter)
UT->>SVC: "queryRows count — totalCount > 1000"
UT->>SVC: markTableJobRunning(tableId, jobId, 'delete', payload)
SVC->>TJ: INSERT job row with filter+cutoff payload
UT->>W: dispatchDeleteJob(jobId, filter, cutoff)
UT-->>M: success + jobId, rows hidden immediately
W->>SVC: selectRowIdPage keyset by id
W->>SVC: deletePageByIds per-batch transactions
W->>SVC: markJobReady — mask lifts
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
participant M as Mothership Model
participant UT as user-table tool
participant SVC as table/service
participant TJ as table_jobs (DB)
participant W as Background Worker
Note over M,W: Large CSV import via import_file
M->>UT: import_file(fileId, tableId)
UT->>SVC: markTableJobRunning(tableId, importId, 'import')
SVC->>TJ: INSERT job row (partial-unique index gate)
TJ-->>SVC: "claimed=true"
SVC-->>UT: true
UT->>W: dispatchImportJob(payload)
UT-->>M: success + jobId
M->>UT: "query_rows(tableId, after=cursor)"
UT->>SVC: queryRows + pendingDeleteMask check
SVC-->>UT: rows + nextCursor
UT-->>M: rows + nextCursor
W->>SVC: markJobReady(tableId, importId)
Note over M,W: Unbounded filter-delete via delete_rows_by_filter
M->>UT: delete_rows_by_filter(tableId, filter)
UT->>SVC: "queryRows count — totalCount > 1000"
UT->>SVC: markTableJobRunning(tableId, jobId, 'delete', payload)
SVC->>TJ: INSERT job row with filter+cutoff payload
UT->>W: dispatchDeleteJob(jobId, filter, cutoff)
UT-->>M: success + jobId, rows hidden immediately
W->>SVC: selectRowIdPage keyset by id
W->>SVC: deletePageByIds per-batch transactions
W->>SVC: markJobReady — mask lifts
Reviews (4): Last reviewed commit: "improvement(mothership): drop get_job op..." | Re-trigger Greptile |
| const tableMounts = await Promise.all( | ||
| inputTables.map(async (tableRef) => { | ||
| const tableId = | ||
| typeof tableRef === 'string' | ||
| ? tableRef | ||
| : tableRef && typeof tableRef === 'object' | ||
| ? (tableRef as CanonicalTableInput).tableId || (tableRef as CanonicalTableInput).path | ||
| : undefined | ||
| if (!tableId) return null | ||
| const table = await resolveTableRef(tableId, tablePathLookup) | ||
| if (!table || table.workspaceId !== workspaceId) { | ||
| throw new Error( | ||
| `Input table not found: "${tableId}". Pass the table id (tbl_...) from tables/{name}/meta.json, or a tables/{name}/meta.json path.` | ||
| ) | ||
| } | ||
| const csvContent = await buildTableCsvForMount(table) | ||
| const sandboxPath = | ||
| typeof tableRef === 'object' && tableRef !== null | ||
| ? (tableRef as CanonicalTableInput).sandboxPath | ||
| : undefined | ||
| if (!tableId) continue | ||
| const table = await resolveTableRef(tableId, tablePathLookup) | ||
| if (!table || table.workspaceId !== workspaceId) { | ||
| throw new Error( | ||
| `Input table not found: "${tableId}". Pass the table id (tbl_...) from tables/{name}/meta.json, or a tables/{name}/meta.json path.` | ||
| ) | ||
| } | ||
| const rows = await queryRows(table, {}, 'copilot-fn-exec') | ||
|
|
||
| const allKeys = new Set(table.schema.columns.map((column) => column.name)) | ||
| for (const row of rows.rows ?? []) { | ||
| if (row.data && typeof row.data === 'object') { | ||
| for (const key of Object.keys(row.data as Record<string, unknown>)) { | ||
| allKeys.add(key) | ||
| } | ||
| return { | ||
| path: sandboxPath || `/home/user/tables/${table.id}.csv`, | ||
| content: csvContent, | ||
| } | ||
| } | ||
| const headers = Array.from(allKeys) | ||
| const csvLines = [headers.join(',')] | ||
| for (const row of rows.rows ?? []) { | ||
| const data = (row.data || {}) as Record<string, unknown> | ||
| csvLines.push( | ||
| headers | ||
| .map((h) => { | ||
| const val = data[h] | ||
| const str = val === null || val === undefined ? '' : String(val) | ||
| return str.includes(',') || str.includes('"') || str.includes('\n') | ||
| ? `"${str.replace(/"/g, '""')}"` | ||
| : str | ||
| }) | ||
| .join(',') | ||
| }) | ||
| ) | ||
| for (const mount of tableMounts) { | ||
| if (!mount) continue | ||
| if (totalSize + mount.content.length > MAX_TOTAL_SIZE) { | ||
| throw new Error( | ||
| `Mounting table "${mount.path}" would exceed the ${MAX_TOTAL_SIZE / 1024 / 1024}MB total mount limit. Mount fewer or smaller tables.` | ||
| ) | ||
| } | ||
| const csvContent = csvLines.join('\n') | ||
| const sandboxPath = | ||
| typeof tableRef === 'object' && tableRef !== null | ||
| ? (tableRef as CanonicalTableInput).sandboxPath | ||
| : undefined | ||
| sandboxFiles.push({ | ||
| path: sandboxPath || `/home/user/tables/${table.id}.csv`, | ||
| content: csvContent, | ||
| }) | ||
| totalSize += mount.content.length | ||
| sandboxFiles.push(mount) | ||
| } |
There was a problem hiding this comment.
Full table CSVs built before budget enforcement
All input tables are serialized in parallel via Promise.all before any size check runs. For each table the buildTableCsvForMount loop fetches every row page-by-page and accumulates the entire CSV in memory — no size gate inside the loop. The outer budget check (totalSize + mount.content.length > MAX_TOTAL_SIZE) only fires after every table has been fully loaded.
Concrete failure: a user mounts two enterprise-plan tables each at ~40MB of rows. Both CSVs are built (80MB in memory) before the loop sees that the second one exceeds the 50MB cap and throws. File and directory mounts check record.size before fetching; tables lack that pre-flight guard. Under traffic, several concurrent requests doing this can exhaust the web container's heap.
| if (args.limit === undefined) { | ||
| const { totalCount } = await queryRows( | ||
| table, | ||
| { filter: idFilter, limit: 1, withExecutions: false }, | ||
| requestId | ||
| ) | ||
| const matchCount = totalCount ?? 0 | ||
| if (matchCount > TABLE_LIMITS.MAX_BULK_OPERATION_SIZE) { | ||
| const doomedCount = Math.min(matchCount, table.rowCount) | ||
| const cutoff = new Date() | ||
| const jobId = generateId() | ||
| const payload: TableDeleteJobPayload = { | ||
| filter: idFilter, | ||
| cutoff: cutoff.toISOString(), | ||
| doomedCount, | ||
| } | ||
| assertNotAborted() | ||
| const claimed = await markTableJobRunning(table.id, jobId, 'delete', payload) | ||
| if (!claimed) { | ||
| return { success: false, message: 'A job is already in progress for this table' } | ||
| } | ||
| await dispatchDeleteJob({ | ||
| jobId, | ||
| tableId: table.id, | ||
| workspaceId, | ||
| filter: idFilter, | ||
| cutoff, | ||
| }) | ||
| return { | ||
| success: true, | ||
| message: `Started background delete of ${doomedCount} matching rows (job ${jobId}). The rows are hidden from reads immediately; call get_job to track progress.`, | ||
| data: { jobId, doomedCount }, | ||
| } | ||
| } |
There was a problem hiding this comment.
doomedCount can overestimate when rows are inserted between count and dispatch
matchCount comes from queryRows(..., { limit: 1, withExecutions: false }) and reflects the delete-mask-filtered row count. Between that count query and markTableJobRunning, rows inserted after the cutoff timestamp could match the filter, making matchCount stale. The cutoff passed to the background worker correctly caps new-row inclusion, but doomedCount stored in the job payload — and surfaced in the get_job response message — would over-report the number of rows being removed. Not a correctness issue for the worker, but the user-visible estimate is an upper bound, not an exact figure.
Greptile SummaryThis PR closes three performance gaps in Mothership's table operations: sandbox table mounts now drain all rows via keyset pagination instead of silently truncating to 100,
Confidence Score: 3/5Safe to merge for most workloads; enterprise tables with hundreds of thousands of rows could exhaust Vercel function memory during a sandbox mount before the 50 MB budget check fires. The core job-slot migration, keyset pagination, background import/delete dispatch, and schema change are all well-structured and tested. The one concern worth resolving before a wide enterprise rollout is buildTableCsvForMount: all table CSVs are fully assembled in parallel memory before the budget check, so a user who mounts a 500k-row enterprise table gets a function OOM rather than a clean rejection. apps/sim/lib/copilot/tools/handlers/function-execute.ts — the buildTableCsvForMount loop and the parallel Promise.all mount assembly need a per-page budget check to avoid OOM on large enterprise tables. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Mothership: import_file / create_from_file] --> B{CSV/TSV 8 MB?}
B -- No --> C[markTableJobRunning claim slot]
C --> D[Inline import parseFileRows + batchInsert]
D --> E[releaseJobClaim]
B -- Yes --> F[createTable with jobStatus=running or markTableJobRunning]
F --> G{isTriggerDevEnabled?}
G -- Yes --> H[tasks.trigger tableImportTask]
G -- No --> I[runDetached runTableImport]
H -- dispatch error --> J[releaseJobClaim / deleteTable]
I --> K[runTableImport streaming keyset import]
H --> K
L[Mothership: delete_rows_by_filter] --> M{limit === undefined?}
M -- Yes --> N[queryRows count with pendingDeleteMask]
N --> O{matchCount > 1000?}
O -- No --> P[deleteRowsByFilter inline]
O -- Yes --> Q[markTableJobRunning claim]
Q --> R[dispatchDeleteJob]
R --> S[runTableDelete keyset page walk]
M -- No --> P
T[Mothership: query_rows] --> U{limit > 1000?}
U -- Yes --> V[return error]
U -- No --> W[queryRows with pendingDeleteMask + after cursor]
W --> X[return rows + nextCursor]
Y[Mothership: get_job] --> Z[getTableById latestJobForTable]
Z --> AA[return job status / rowCount]
Reviews (2): Last reviewed commit: "improvement(mothership): table speed par..." | Re-trigger Greptile |
| * would corrupt values. | ||
| */ | ||
| function formatMountCsvValue(value: unknown): string { | ||
| if (value === null || value === undefined) return '' | ||
| if (value instanceof Date) return value.toISOString() | ||
| if (typeof value === 'object') return JSON.stringify(value) | ||
| return String(value) | ||
| } | ||
|
|
||
| /** | ||
| * Serializes a full table to CSV for a sandbox mount. Walks the keyset export | ||
| * reader page by page so every row is included (`queryRows` with defaults | ||
| * silently truncated mounts to its 100-row page and paid for a count and | ||
| * execution metadata the CSV never used), and remaps stored column-id keys | ||
| * back to display names so headers line up with cell values. | ||
| */ | ||
| async function buildTableCsvForMount(table: TableDefinition): Promise<string> { | ||
| const nameById = buildNameById(table.schema) | ||
| const headers = table.schema.columns.map((c) => c.name) | ||
| const lines = [toCsvRow(headers)] | ||
| let after: { position: number; id: string } | null = null | ||
| while (true) { | ||
| const page = await selectExportRowPage(table, after, TABLE_MOUNT_PAGE_SIZE) | ||
| for (const row of page) { | ||
| const data = rowDataIdToName(row.data, nameById) | ||
| lines.push(toCsvRow(headers.map((header) => formatMountCsvValue(data[header])))) | ||
| } | ||
| if (page.length < TABLE_MOUNT_PAGE_SIZE) return lines.join('\n') | ||
| const last = page[page.length - 1] | ||
| after = { position: last.position, id: last.id } | ||
| } | ||
| } | ||
|
|
||
| async function resolveInputFiles( | ||
| workspaceId: string, | ||
| inputFiles?: unknown[], |
There was a problem hiding this comment.
Full CSV materialized in memory before the 50 MB budget check
buildTableCsvForMount drains every row into a lines[] array and only returns after the full table is assembled. The budget check in the caller happens after all table CSVs are built via Promise.all. For an enterprise table with 100k+ rows, the CSV string can be hundreds of megabytes or more before the check ever fires — the budget guard prevents the file from reaching the sandbox, but the memory damage is already done (and in the parallel case, all tables are materialized simultaneously).
The old queryRows path implicitly bounded this to 100 rows. A straightforward fix is to track a running byte count inside the drain loop and throw early once the caller's budget would be exceeded — or pass maxBytes as a parameter and abort the page walk.
| * would corrupt values. | ||
| */ | ||
| function formatMountCsvValue(value: unknown): string { | ||
| if (value === null || value === undefined) return '' | ||
| if (value instanceof Date) return value.toISOString() | ||
| if (typeof value === 'object') return JSON.stringify(value) | ||
| return String(value) | ||
| } | ||
|
|
||
| /** | ||
| * Serializes a full table to CSV for a sandbox mount. Walks the keyset export | ||
| * reader page by page so every row is included (`queryRows` with defaults | ||
| * silently truncated mounts to its 100-row page and paid for a count and | ||
| * execution metadata the CSV never used), and remaps stored column-id keys | ||
| * back to display names so headers line up with cell values. | ||
| */ | ||
| async function buildTableCsvForMount(table: TableDefinition): Promise<string> { | ||
| const nameById = buildNameById(table.schema) | ||
| const headers = table.schema.columns.map((c) => c.name) | ||
| const lines = [toCsvRow(headers)] | ||
| let after: { position: number; id: string } | null = null | ||
| while (true) { | ||
| const page = await selectExportRowPage(table, after, TABLE_MOUNT_PAGE_SIZE) | ||
| for (const row of page) { | ||
| const data = rowDataIdToName(row.data, nameById) | ||
| lines.push(toCsvRow(headers.map((header) => formatMountCsvValue(data[header])))) | ||
| } | ||
| if (page.length < TABLE_MOUNT_PAGE_SIZE) return lines.join('\n') | ||
| const last = page[page.length - 1] | ||
| after = { position: last.position, id: last.id } | ||
| } | ||
| } | ||
|
|
||
| async function resolveInputFiles( | ||
| workspaceId: string, | ||
| inputFiles?: unknown[], |
There was a problem hiding this comment.
pendingDeleteMask fetched on every page — N+1 DB round-trips during drain
selectExportRowPage calls pendingDeleteMask(table) unconditionally, so a 100k-row table drained in 20 pages makes 20 extra DB round-trips just to check whether a delete job is running. The mask is derived from table.id which is constant, and the delete job status won't change meaningfully mid-drain. Fetching it once before the loop and threading it through would eliminate the repeated queries.
…er to follow-up The keyset CSV drain for inputTables fixed the 100-row truncation but introduced an unbounded in-memory materialization: a large table (e.g. enterprise 1M-row) is drained whole into the web-container heap before the 50MB mount check runs, risking OOM. Reverting to main's behavior (100-row mount) for now; the proper fix (incremental byte-bound + filter/limit/columns selection, or by-reference signed-URL fetch) is deferred to a separate effort so this PR can ship the user_table speed work (C/D/E) without the memory regression. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
|
@greptile review |
| const nextCursor = | ||
| !args.sort && result.rows.length === result.limit && lastRow?.orderKey | ||
| ? { orderKey: lastRow.orderKey, id: lastRow.id } | ||
| : undefined |
There was a problem hiding this comment.
nextCursor requires truthy orderKey
Low Severity
query_rows only returns nextCursor when the last row has a truthy orderKey. A full page whose last row lacks order_key (e.g. not yet backfilled) omits the cursor even though more rows may exist.
Reviewed by Cursor Bugbot for commit c1c4160. Configure here.
…s via query_rows Per review, get_job is unnecessary: the model observes background import/delete progress with query_rows (rows appearing as an import lands; the delete mask already makes query_rows reflect the post-delete view). Removes the operation and repoints the dispatch messages at query_rows. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
|
@greptile review |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
There are 4 total unresolved issues (including 2 from previous reviews).
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 0e47d6e. Configure here.
| if (excluded.has(targetId)) excluded.delete(targetId) | ||
| else excluded.add(targetId) | ||
| return { kind: 'all', excluded: excluded.size === 0 ? undefined : excluded } | ||
| } |
There was a problem hiding this comment.
Shift-click shrinks select-all scope
Medium Severity
After a filter-wide select-all (rowSelection.kind === 'all'), shift-clicking row checkboxes only updates the loaded page: the handler materializes selection to in-memory row ids instead of keeping the exclusion-set model. Unloaded matching rows drop out of delete, run, and stop actions that rely on select-all semantics.
Reviewed by Cursor Bugbot for commit 0e47d6e. Configure here.
| onStopAllRows(queryOptions.filter ?? undefined, excluded) | ||
| } else { | ||
| onStopRows(contextMenuRowIds) | ||
| } |
There was a problem hiding this comment.
Stop count ignores select-all scope
Low Severity
For a filter-wide select-all, the context menu stop action calls onStopAllRows with the filter, but runningInContextSelection still sums runningByRowId only over contextMenuRowIds (loaded rows). The Stop item can stay hidden or show the wrong count when running workflows sit on unloaded matching rows.
Reviewed by Cursor Bugbot for commit 0e47d6e. Configure here.


Context
Mothership's
user_tabletool lagged the fast/safe paths the tables feature already has. This PR lands three of the four findings from the audit; the fourth (thefunction_executelarge-table CSV mount) is deferred — see note at the bottom.Changes
C — limit bounds.
query_rowstook an unboundedlimit(whole table into one tool result) and the filter ops took unbounded limits (every matching row's id+data into memory). Now rejected aboveMAX_QUERY_LIMIT(1000) /MAX_BULK_OPERATION_SIZE(1000) with the contract's message text.D —
query_rowswaste + paging. Dropped the always-on executions-metadata load (withExecutions: false), and added keyset pagination: accepts anaftercursor, returnsnextCursorwhen a default-order page fills (rejectsafter+sort). Offset paging was O(n²) walking a big table.E — async job parity for imports/deletes.
import_file/create_from_file(CSV/TSV ≥8MB) anddelete_rows_by_filter(>1000 matches, no explicit limit) now dispatch the same trigger.dev jobs the UI routes use, claiming the per-table job slot so they can't race a running background job. Smaller imports stay inline but claim/release the slot. The model polls progress viaquery_rows(rows appear as an import lands; the delete mask makesquery_rowsreflect the post-delete view immediately) — no dedicated job-status op.TableImportPayload.deleteSourceFileflag (default true) so Mothership imports of persistent workspace files don't delete the source.Tests
user-table.test.ts(limit clamps, keysetafter/nextCursor,after+sortrejection, slot claim/release, async dispatch payloads) andimport-runner.test.ts(source-file cleanup default vsdeleteSourceFile: false).bun run lint:check,type-check, andcheck:api-validationgreen.Deferred (not in this PR)
filter/limit/columnsselection, or by-reference signed-URL fetch — is a separate effort.Model-facing docs
Catalog change documenting
after/ limit maxes / background behavior is in simstudioai/copilot#312.🤖 Generated with Claude Code