feat(integrations): hosted email-enrichment providers + cascade wiring #5087

Open
TheodoreSpeaks wants to merge 3 commits into staging from feat/enrichment-providers

Conversation

@TheodoreSpeaks

Collaborator

Summary

  • Add Datagma, Dropcontact, LeadMagic, Icypeas, and Enrow integrations — tools, blocks, brand icons, and BYOK + metered hosted-key support
  • Wire the new finders/verifiers into all five enrichment cascades: work-email, phone-number, email-verification, company-info, company-domain
  • Add hosting tests for all five providers plus cascade tests (new test files for email-verification, company-info, company-domain)

Type of Change

  • New feature

Testing

Tested manually. bun run lint, bun run check:api-validation:strict, and tsc --noEmit all pass; 161 unit tests pass (hosting + cascade + blocks). Live-API verification of per-credit pricing and provider response shapes still pending.

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

Add Datagma, Dropcontact, LeadMagic, Icypeas, and Enrow integrations —
tools, blocks, brand icons, and BYOK + metered hosted-key support — and
register each in the tool/block registries and BYOK provider list.

Wire the new finders/verifiers into the enrichment cascades:
- work-email: Datagma, LeadMagic, Dropcontact, Icypeas, Enrow
- phone-number: LeadMagic, Datagma, Dropcontact
- email-verification: Icypeas, Enrow
- company-info: Datagma, LeadMagic
- company-domain: Datagma

Add hosting tests for all five providers and cascade tests covering the
new providers (incl. new test files for email-verification, company-info,
and company-domain).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@vercel

vercel Bot commented Jun 16, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Skipped Skipped Jun 16, 2026 5:24pm

@cursor

cursor Bot commented Jun 16, 2026

PR Summary

Medium Risk
Large additive surface area (external APIs, async polling, hosted billing rates) touches enrichment execution paths; misconfigured pricing or poll timeouts could affect cost or row outcomes, but core auth paths are unchanged.

Overview
Adds five B2B sales-enrichment providers — Datagma, Dropcontact, LeadMagic, Icypeas, and Enrow — as first-class workflow tools and workspace BYOK entries, with metered hosted-key billing (per-provider credit → USD helpers) and polling where APIs are async (Dropcontact, Enrow, Icypeas).

Each vendor gets canvas blocks (operation-specific sub-blocks, hosted API key hiding) plus registration in the block registry and new brand icons. The BYOK settings UI and byok-keys contract now accept the five provider IDs under the Enrichment group.

Enrichment waterfalls are extended so spreadsheet-style enrichments can fall through to the new backends: work-email and phone-number gain multiple finders/enrichers; email-verification adds Icypeas and Enrow after existing verifiers; company-info adds Datagma and LeadMagic; company-domain adds Datagma after PDL. Cascade behavior is covered by new/updated unit tests alongside hosting and polling tests for each provider.

Reviewed by Cursor Bugbot for commit e1ef8ef. Bugbot is set up for automated code reviews on this repo.

Comment thread apps/sim/tools/icypeas/verify_email.ts
Comment thread apps/sim/tools/icypeas/verify_email.ts
@greptile-apps

greptile-apps Bot commented Jun 16, 2026

Contributor

Greptile Summary

This PR adds five new B2B data enrichment providers — Datagma, Dropcontact, LeadMagic, Icypeas, and Enrow — as tool implementations (with BYOK + hosted-key billing), UI entries in the BYOK settings panel, block definitions, and brand icons. The new providers are then wired into all five enrichment cascades (work-email, phone-number, email-verification, company-info, company-domain).

  • Tool layer: Each provider follows the existing async-poll pattern (submit → poll → map output), with per-endpoint credit billing based on the provider's documented pricing. Hosting configs, BYOK provider IDs, and the tool registry are all updated in sync.
  • Cascade wiring: New providers are appended to the end of each waterfall, with buildParams guards that skip a provider when required inputs are absent.
  • Critical mismatch in Icypeas tools: icypeas_verify_email and icypeas_find_email return success: false from postProcess for terminal non-found statuses. The cascade runner treats these as infrastructure errors, so Icypeas's definitive "invalid" verdicts are never surfaced to mapOutput, Enrow is always called unnecessarily after an Icypeas NOT_FOUND, and if Enrow also fails the cell returns blank instead of invalid.
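
The buildParams guard mentioned in the cascade-wiring bullet can be pictured roughly as follows. This is a minimal sketch, not code from the PR: `EnrichmentRow`, `CascadeProvider`, `runnableProviders`, the `datagma_find_email` tool ID, and the `company` parameter are all assumed names for illustration. The only behavior taken from the summary is that a provider is skipped when its required inputs are absent.

```typescript
// Hypothetical row and provider shapes; the real types are not shown on this page.
interface EnrichmentRow {
  fullName?: string
  companyDomain?: string
}

interface CascadeProvider<P> {
  toolId: string
  // Returns null when the row lacks an input this provider requires.
  buildParams: (row: EnrichmentRow) => P | null
}

// Assumed tool ID and parameter names, for illustration only.
const datagmaFindEmail: CascadeProvider<{ fullName: string; company: string }> = {
  toolId: 'datagma_find_email',
  buildParams: (row) => {
    if (!row.fullName || !row.companyDomain) return null
    return { fullName: row.fullName, company: row.companyDomain }
  },
}

// Providers whose guard returns null are skipped without any API call.
function runnableProviders<P>(providers: CascadeProvider<P>[], row: EnrichmentRow): string[] {
  return providers.filter((p) => p.buildParams(row) !== null).map((p) => p.toolId)
}
```

With this shape, a row that has a name but no company domain never reaches the provider, so no credits are spent on a lookup that cannot succeed.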

Confidence Score: 3/5

The phone, company-info, and company-domain cascades are safe to merge. The email-verification cascade has a correctness gap where Icypeas NOT_FOUND verdicts are silently dropped and Enrow is charged unnecessarily on every invalid-email lookup.

The Icypeas tools return success:false from postProcess for terminal non-found statuses, which the cascade runner in enrichments/run.ts converts into a thrown error rather than a clean fall-through. For email-verification this means Icypeas's definitive invalid verdicts are lost: Enrow is called on every email Icypeas flags as NOT_FOUND, adding 0.25 credits per call, and when Enrow itself errors the cell shows blank instead of invalid. The cascade tests validate mapOutput for NOT_FOUND, but that code path is unreachable in production.

apps/sim/tools/icypeas/verify_email.ts and apps/sim/tools/icypeas/find_email.ts — specifically the success flag returned from postProcess for terminal non-found statuses.

Important Files Changed

| Filename | Overview |
| --- | --- |
| apps/sim/tools/icypeas/verify_email.ts | postProcess returns success:false for NOT_FOUND/DEBITED_NOT_FOUND, causing the cascade runner to treat definitive invalid verdicts as infrastructure errors, with potential for blank cells instead of invalid results. |
| apps/sim/tools/icypeas/find_email.ts | Same success:false mismatch as verify_email for terminal non-found statuses; lower impact in work-email cascade because mapOutput would return null anyway, but error counting is inflated. |
| apps/sim/tools/datagma/find_email.ts | Correct async GET tool; API key is passed as a URL query param (Datagma's required design), so the log-exposure risk is worth documenting for operators. |
| apps/sim/tools/dropcontact/enrich_contact.ts | Well-structured async poll tool; billing charges 0 when email_found is false, which aligns with documented Dropcontact pricing. |
| apps/sim/tools/enrow/verify_email.ts | Correct async poll; always returns success:true; 0.25 credits charged per call regardless of result, consistent with Enrow pricing. |
| apps/sim/enrichments/email-verification/email-verification.ts | Cascade wiring and mapOutput logic are correct, but Icypeas success:false for NOT_FOUND means the cascade runner never calls mapOutput for invalid verdicts. |
| apps/sim/enrichments/work-email/work-email.ts | Five new providers wired correctly; Icypeas success:false mismatch has no correctness impact here since NOT_FOUND mapOutput returns null anyway. |
| apps/sim/enrichments/company-info/company-info.ts | Datagma and LeadMagic providers correctly added; employeeCount coercion to string handles range vs exact-count differences across providers. |
| apps/sim/lib/api/contracts/byok-keys.ts | Five new provider IDs added to Zod enum, in sync with tools/types.ts BYOKProviderId union. |
| apps/sim/tools/datagma/hosting.ts | Hosting config correct; pricing uses Popular-plan rate with appropriate caveats documented. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Email Verification Cascade] --> ZB[ZeroBounce]
    ZB -->|success:true unknown| NB[NeverBounce]
    ZB -->|mapOutput non-null| DONE[Result returned]
    NB -->|success:true unknown| MV[MillionVerifier]
    NB -->|mapOutput non-null| DONE
    MV -->|success:true unknown| ICY[Icypeas verify_email]
    MV -->|mapOutput non-null| DONE
    ICY -->|success:true FOUND| MAPOUT1[mapOutput valid]
    MAPOUT1 --> DONE
    ICY -->|success:false NOT_FOUND BUG| ERR[runner throws error]
    ERR --> ENROW[Enrow verify_email]
    ICY -.->|intended| MAPOUT2[mapOutput invalid never reached]
    MAPOUT2 -.-> DONE
    ENROW -->|qualification valid/invalid| MAPOUT3[result correct]
    MAPOUT3 --> DONE
    ENROW -->|throws| ERROUT[blank instead of invalid]

Reviews (1): Last reviewed commit: "feat(integrations): hosted email-enrichm..." | Re-trigger Greptile

Comment on lines +156 to +162

if (status && TERMINAL_STATUSES.has(status)) {
  return {
    success: VALID_STATUSES.has(status),
    output: mapItem(item),
  }
}

P1 success: false for terminal non-found statuses breaks the cascade

postProcess returns { success: false, output: mapItem(item) } for NOT_FOUND and DEBITED_NOT_FOUND. The enrichment cascade runner in enrichments/run.ts checks response.success first — if false, it looks for output.status === 404; otherwise it throws. Since the Icypeas string 'NOT_FOUND' never equals the number 404, the runner throws "icypeas_verify_email failed", counts Icypeas as an infrastructure error, and falls through to Enrow without ever calling the cascade's mapOutput.

Concrete failure: the cascade's mapOutput maps NOT_FOUND → { status: 'invalid', deliverable: false }, but that code path is unreachable. The waterfall always continues to Enrow; if Enrow also fails (rate-limit, timeout), the cell returns blank instead of invalid. Every other async-poll tool in this PR returns success: true from postProcess and lets mapOutput do the filtering, so icypeas_verify_email and icypeas_find_email should do the same.
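
To make the control flow concrete, here is a rough paraphrase of the runner check described in this comment. It is a sketch under assumptions, not the code in enrichments/run.ts: `ToolResponse`, `Verdict`, `handleResponse`, and `mapIcypeasVerdict` are hypothetical names, and `deliverable: true` for a FOUND result is assumed rather than taken from the PR.

```typescript
interface ToolResponse {
  success: boolean
  output: { status?: string | number | null }
}

type Verdict = { status: 'valid' | 'invalid'; deliverable: boolean } | null

// Paraphrase of the described check: success first, then a numeric 404, otherwise throw.
function handleResponse(
  toolId: string,
  response: ToolResponse,
  mapOutput: (output: ToolResponse['output']) => Verdict
): Verdict {
  if (!response.success) {
    // Only a numeric 404 counts as a clean miss; the string 'NOT_FOUND' does not match.
    if (response.output.status === 404) return null
    throw new Error(`${toolId} failed`)
  }
  return mapOutput(response.output)
}

// Assumed mapping for illustration, based on the verdicts named in this review.
const mapIcypeasVerdict = (output: ToolResponse['output']): Verdict => {
  if (output.status === 'FOUND') return { status: 'valid', deliverable: true }
  if (output.status === 'NOT_FOUND') return { status: 'invalid', deliverable: false }
  return null
}
```

Under these assumptions, a NOT_FOUND response with success: false throws before mapOutput runs, while the same response with success: true yields the invalid verdict.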

Comment on lines +163 to +169

if (status && TERMINAL_STATUSES.has(status)) {
  return {
    success: FOUND_STATUSES.has(status),
    output: mapItem(item),
  }
}

P1 Same success: false cascade-runner mismatch as in icypeas_verify_email

postProcess returns { success: false } for NOT_FOUND, BAD_INPUT, INSUFFICIENT_FUNDS, and ABORTED. The cascade runner treats any success: false with a non-404 output.status as a hard error. For work-email the end result is the same (mapOutput returns null for a null email either way), but the error counter is inflated. Return success: true for all terminal statuses, matching the pattern used by every other async-poll tool in this PR.
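
One possible shape of the suggested change is sketched below. The membership of `TERMINAL_STATUSES`, the `IcypeasItem` type, the body of `mapItem`, and the `resolveTerminal` wrapper are assumptions for illustration; only the idea of returning success: true for every terminal status comes from the comment.

```typescript
// Assumed set of terminal statuses, collected from the names used in this review.
const TERMINAL_STATUSES = new Set([
  'FOUND',
  'DEBITED',
  'NOT_FOUND',
  'DEBITED_NOT_FOUND',
  'BAD_INPUT',
  'INSUFFICIENT_FUNDS',
  'ABORTED',
])

// Hypothetical item shape and mapper; the real mapItem is not shown on this page.
interface IcypeasItem {
  status?: string
  email?: string | null
}

function mapItem(item: IcypeasItem) {
  return { status: item.status ?? null, email: item.email ?? null }
}

// Every terminal status resolves with success: true, so the cascade's mapOutput
// decides whether the result is usable. Returns null while the job is still running.
function resolveTerminal(
  item: IcypeasItem
): { success: true; output: ReturnType<typeof mapItem> } | null {
  const status = item.status
  if (status && TERMINAL_STATUSES.has(status)) {
    return { success: true, output: mapItem(item) }
  }
  return null
}
```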

Comment on lines +65 to +73
    const status = output.status as string | undefined
    if (!status) {
      throw new Error('Icypeas verify-email: cannot determine cost — status is missing')
    }
    // Billable when the status name contains DEBITED (i.e. DEBITED or DEBITED_NOT_FOUND).
    const billable = status.includes('DEBITED')
    // 0.1 credit; express as a fractional number so ICYPEAS_CREDIT_USD math works.
    return billable ? 0.1 : 0
  }),

P2 Billing function throws when status is falsy

getCost throws 'cannot determine cost — status is missing' if output.status is falsy. In normal operation postProcess always returns a terminal status, so this is unreachable in practice. But if the hosting layer ever evaluates cost on the initial transformResponse result (before postProcess runs), output.status will be null and the throw propagates as an unhandled billing error. Consider returning 0 when status is absent, matching the defensive posture of the other providers.
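
A defensive variant along the lines suggested here might look like the sketch below. The function name `icypeasVerifyCredits` is hypothetical, and the billing rule simply mirrors the reviewed snippet (0.1 credit when the status contains DEBITED); it does not attempt to reflect any later pricing change.

```typescript
// Returns the credit cost for an Icypeas verify call, or 0 when the status is absent.
function icypeasVerifyCredits(output: { status?: string | null }): number {
  const status = output.status
  // No status yet (for example, a pre-poll response): charge nothing instead of throwing.
  if (!status) return 0
  // Same rule as the reviewed snippet: billable when the status name contains DEBITED.
  return status.includes('DEBITED') ? 0.1 : 0
}
```

The trade-off is that a missing status silently bills nothing, so a team that prefers loud failures might log the case instead of swallowing it.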

Comment on lines +57 to +68
      type: 'string',
      required: true,
      visibility: 'user-only',
      description: 'Datagma API key',
    },
  },

  request: {
    url: (params) => {
      const url = new URL('https://gateway.datagma.net/api/ingress/v6/findEmail')
      url.searchParams.set('apiId', params.apiKey)
      url.searchParams.set('fullName', params.fullName)

P2 API key embedded in URL query string (?apiId=...) for all Datagma endpoints

Every Datagma tool appends url.searchParams.set('apiId', params.apiKey). This is Datagma's documented auth scheme and can't be changed client-side, but the hosted API key appears verbatim in every request URL and will be captured by server-side access logs at Datagma and any intermediary. A note in the DATAGMA_API_KEY_PREFIX doc comment that this API uses URL-parameter auth would help operators understand the risk when rotating a compromised key.
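
Separately from the doc comment, local logs can at least avoid echoing the key. The helper below is a hypothetical sketch and not part of this PR: `redactApiId` is an assumed name, and it only masks the `apiId` query parameter before a URL is written to a log. It does nothing about logs kept by Datagma or by intermediaries.

```typescript
// Masks the apiId query parameter so the request URL can be logged safely.
function redactApiId(rawUrl: string): string {
  const url = new URL(rawUrl)
  if (url.searchParams.has('apiId')) {
    url.searchParams.set('apiId', 'REDACTED')
  }
  return url.toString()
}
```

A caller would log `redactApiId(requestUrl)` rather than the raw URL.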

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Comment thread apps/sim/tools/icypeas/find_email.ts
Comment thread apps/sim/tools/datagma/enrich_person.ts
- Icypeas find_email/verify_email postProcess return success:true for all
  terminal statuses (NOT_FOUND/DEBITED_NOT_FOUND included) so the cascade
  runner calls mapOutput and records invalid/not-found verdicts instead of
  throwing and inflating the error count
- Bill Icypeas verify FOUND (not just DEBITED*) per the documented 0.1-credit
  charge
- Datagma enrich_person only applies the 30-credit phone surcharge when a
  phone lookup (phoneFull) was requested
- Note Datagma's URL-param (apiId) auth in the hosted-key doc comment
- Update hosting tests to match

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

hosting: enrowHosting<EnrowVerifyEmailParams>((_params, _output) => {
  // 0.25 credits charged per verification call regardless of result
  return 0.25
}),

Enrow verify bills incomplete jobs

Medium Severity

Hosted pricing for enrow_verify_email always returns 0.25 credits and ignores the tool output. After a successful submit, executeTool still applies hosted cost when post-processing fails and falls back to the initial response, which has no qualification. Customers can be metered for a verification that never finished and the enrichment cascade gets no verdict.
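
One way to address this, sketched under assumptions, is to make the cost depend on whether the output carries a verdict. The function name `enrowVerifyCredits` and the output shape are hypothetical; the 0.25-credit figure and the `qualification` field come from the snippet and description above.

```typescript
// Charges 0.25 credits only when the verification actually produced a qualification.
function enrowVerifyCredits(output: { qualification?: string | null }): number {
  // The initial submit response has no qualification, so an unfinished job costs 0.
  return output.qualification ? 0.25 : 0
}
```

Whether this matches how Enrow itself debits credits for unfinished jobs is not established on this page, so the rule should be checked against the provider's billing before it is adopted.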

Reviewed by Cursor Bugbot for commit e1ef8ef. Configure here.
