
feat: Geneva enterprise API V2 doc updates#274

Merged
rpgreen merged 13 commits into main from rpgreen/enterprise-new
Jun 16, 2026
@rpgreen rpgreen commented Jun 15, 2026

@rpgreen rpgreen requested a review from jmhsieh June 15, 2026 14:47
@mintlify

mintlify Bot commented Jun 15, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

| Project | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| lancedb-bcbb4faf | 🟢 Ready | View Preview | Jun 15, 2026, 2:51 PM |

@rpgreen rpgreen requested a review from prrao87 June 15, 2026 16:32
@rpgreen rpgreen changed the title feat: Enterprise API V2 doc updates feat: Geneva enterprise API V2 doc updates Jun 15, 2026

@justinrmiller justinrmiller left a comment

Just a few comments to consider.

Comment thread docs/geneva/deployment/helm.mdx Outdated
Comment thread docs/geneva/index.mdx


<Note>
Auto-backfill is an enterprise feature. On direct object-storage or local-filesystem
connections there is no managed agent, so `auto_backfill=True` has no effect and you must run

Not really a comment on this PR, but a question: do we alert the user that `auto_backfill=True` has no effect when called on a non-enterprise connection?

Contributor Author
We do not. We don't really have a good way to detect it AFAIK. We could do it based on the connection URI, but it's possible that we might want to support creating UDFs directly against object storage even when there's an enterprise deployment.
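For illustration only, the URI-based check mentioned above might look roughly like the sketch below. Everything here is an assumption: the helper names, the set of schemes treated as "direct", and the warning text are hypothetical and are not part of Geneva or LanceDB. As noted, a heuristic like this would also warn when someone deliberately creates UDFs directly against object storage alongside an enterprise deployment.

```python
import warnings
from urllib.parse import urlparse

# Assumption: URI schemes that suggest a direct object-storage or
# local-filesystem connection. This list is hypothetical, not from Geneva.
DIRECT_SCHEMES = {"", "file", "s3", "gs", "az"}


def is_direct_connection(uri: str) -> bool:
    """Return True if the URI scheme is in the assumed 'direct' set."""
    return urlparse(uri).scheme.lower() in DIRECT_SCHEMES


def warn_if_auto_backfill_ignored(uri: str, auto_backfill: bool) -> bool:
    """Emit a UserWarning when auto_backfill is requested on a direct connection.

    Returns True if a warning was emitted, False otherwise.
    """
    if auto_backfill and is_direct_connection(uri):
        warnings.warn(
            "auto_backfill=True has no effect on direct object-storage or "
            "local-filesystem connections; run the backfill manually.",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False
```

A caller would invoke `warn_if_auto_backfill_ignored(uri, auto_backfill)` at column-registration time. A bare path such as `/tmp/db` has an empty scheme and is treated as local here.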

Comment thread docs/geneva/getting-started.mdx Outdated
@rpgreen rpgreen enabled auto-merge (squash) June 16, 2026 17:32
@rpgreen rpgreen merged commit 23b51e2 into main Jun 16, 2026
2 checks passed
@rpgreen rpgreen deleted the rpgreen/enterprise-new branch June 16, 2026 17:33

@dantasse dantasse left a comment

ah shoot I'm too late, sorry - not 100% done but mostly

Comment on lines +122 to +139
## Providing a Ray cluster

The LanceDB Helm chart can be configured to deploy a static KubeRay cluster, provision KubeRay clusters on demand per job, or
use an existing Ray cluster.

### Use default LanceDB Enterprise Ray cluster (default)

By default, LanceDB Enterprise will use a shared, statically provisioned Ray cluster for job execution.

This can be enabled in the Helm chart by setting the following values.

```yaml
raycluster:
  enabled: true

global:
  rayclusterUri: "ray://raycluster-kuberay-head-svc.lancedb.svc.cluster.local:10001"
```

I'm kind of confused how this interacts with the previous section. Like, "I just said what the default cluster was in the deployment-default bit above - now why do I have to set up raycluster: and global:?"

(I mean, I know this as an experienced user, but it still made me double take, which makes me think that it might be confusing to a new user.)

I guess the point to make is something like "geneva.defaults tells your jobs what cluster to use. But we assume you don't already have a Ray cluster, so you have to deploy one there; here's how you do that."

Comment on lines +145 to +150
Set `global.rayclusterUri` to an empty value to provision ephemeral KubeRay clusters on-demand for each execution job. The default KubeRay cluster configuration
is specified in `geneva.defaults.cluster`, i.e.

```yaml
geneva:
  defaults:
```

I'm not good at Helm, but I'd love it if this were super explicit (do you mean `""`, or some other empty value?), so like this:

Suggested change

Original:

Set `global.rayclusterUri` to an empty value to provision ephemeral KubeRay clusters on-demand for each execution job. The default KubeRay cluster configuration
is specified in `geneva.defaults.cluster`, i.e.
```yaml
geneva:
  defaults:
```

Suggested:

Set `global.rayclusterUri` to an empty value to provision ephemeral KubeRay clusters on-demand for each execution job. The default KubeRay cluster configuration
is specified in `geneva.defaults.cluster`, i.e.
```yaml
global:
  rayclusterUri: ""
geneva:
  defaults:
```

Comment on lines +34 to +41
<Note>
**Manifests are immutable at the column / view level.** When a transform is registered, its
manifest is snapshotted onto the column (or view) metadata. Changing the deployment-default
manifest — or the `GenevaManifest` object in your code — does **not** affect existing columns
or views: they keep using the snapshot taken at creation time. To move a column or view to a
new manifest, re-point it to a new (or updated) UDF / chunker / UDTF — for example with
`alter_columns()` for a column, or by recreating the view.
</Note>

Can we move this, maybe after the @udtf(manifest=) section? I want to get the meat of the process down first (define manifest and attach it to udf) before I can understand the caveats around immutability of this registration.

Also, maybe spell out a little bit more, in code, what this means? so if I do:

@udf(manifest=manifest_a)
def myUdf(...

and run it, then tomorrow I change it to:

@udf(manifest=manifest_b)
def myUdf(...

and manifest_b has completely different dependencies, and I try to run it again, what happens? does it run with manifest_a or manifest_b? and what would the code look like if I want to change it to manifest_b?
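Going only by the note quoted above, the snapshot behavior can be pictured with a toy model. This is not Geneva's API: `Manifest`, `ToyTable`, `add_column`, `alter_column`, and `manifest_for` are hypothetical names invented for this sketch. It shows only the semantics the note describes: the manifest is copied at registration, later edits do not reach the column, and an explicit alter step re-points it.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Manifest:
    """Hypothetical stand-in for a dependency manifest."""
    name: str
    pip: list[str] = field(default_factory=list)


class ToyTable:
    """Toy model of snapshot-at-registration semantics (not the Geneva API)."""

    def __init__(self) -> None:
        self._columns: dict[str, Manifest] = {}

    def add_column(self, column: str, manifest: Manifest) -> None:
        # Snapshot: store a deep copy so later edits to `manifest`
        # cannot change what this column runs with.
        self._columns[column] = copy.deepcopy(manifest)

    def alter_column(self, column: str, manifest: Manifest) -> None:
        # Explicit re-pointing replaces the stored snapshot.
        if column not in self._columns:
            raise KeyError(column)
        self._columns[column] = copy.deepcopy(manifest)

    def manifest_for(self, column: str) -> Manifest:
        return self._columns[column]


manifest_a = Manifest("manifest_a", ["numpy==1.26.4"])
manifest_b = Manifest("manifest_b", ["pandas==2.2.2"])

table = ToyTable()
table.add_column("embedding", manifest_a)

# Editing the original object afterwards does not affect the column.
manifest_a.pip.append("torch")
print(table.manifest_for("embedding"))

# Only an explicit alter moves the column to the new manifest.
table.alter_column("embedding", manifest_b)
print(table.manifest_for("embedding"))
```

Under this reading, swapping the decorator argument to `manifest_b` in source would not by itself change an existing column. Whether that matches the real behavior is the question the docs should answer explicitly.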


When iterating locally, you often want the workers to run with the *exact* packages from your
current environment rather than a curated pip list. `Connection.capture_local_environment()`
zips your workspace (and, optionally, your site-packages), uploads the archives through the

maybe this is obvious but: does "workspace" mean "your current working directory, where you are running the script from"?


1. Install or upgrade the Geneva Helm chart (see [Helm Deployment](/geneva/deployment/helm/)).
2. Forward port 3000 from the geneva-console-ui service:
2. In your web browser, connect to the Geneva Console UI using the external ingress/load balancer URI configured in your deployment.

ooh exciting
hmm
are we doing this now?
we don't have any authentication on the console; that seems like a problem?

3 participants