
fix: raise ValueError when external_id is used for NDJSON annotation import #2059

Open
sowndappan5 wants to merge 2 commits into Labelbox:develop from sowndappan5:ss/fix-external-id-validation


sowndappan5 commented Jun 30, 2026


Description

This PR fixes a client-side validation error when external_id is used inside a Label data container during NDJSON serialization (used for annotation/prediction imports).

Previously, if a developer created a Label using only external_id, the NDJSON serializer tried to build a DataRow with both id and global_key as None, causing Pydantic to throw a generic ValidationError: Must set either id or global_key.

Because the Labelbox annotation import (MAL) API does not natively support referencing data rows using externalId, we added a descriptive validation check that raises a helpful ValueError instructing developers to use global_key or resolve their external_id to a data_row_id before uploading.
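For callers who currently hold only external IDs, the workaround described above might look roughly like the sketch below. It is an illustration under stated assumptions, not SDK code: `to_importable_ref` and the `external_to_row_id` mapping are hypothetical names, the `uid` key simply mirrors the field name used in this PR, and how the mapping gets built is left to the caller.

```python
from typing import Dict, Mapping


def to_importable_ref(
    data: Mapping[str, str],
    external_to_row_id: Mapping[str, str],
) -> Dict[str, str]:
    """Return a data reference keyed by `uid` or `global_key`.

    Hypothetical helper: `external_to_row_id` is a caller-built lookup
    from external_id to data row id; it is not part of the Labelbox SDK.
    """
    # References that are already importable pass through unchanged.
    if data.get("global_key"):
        return {"global_key": data["global_key"]}
    if data.get("uid"):
        return {"uid": data["uid"]}

    external_id = data.get("external_id")
    if external_id is None:
        raise ValueError("Data reference has no uid, global_key, or external_id.")
    if external_id not in external_to_row_id:
        raise KeyError(f"No data row id known for external_id {external_id!r}")

    # Swap the unsupported external_id for the resolved data row id.
    return {"uid": external_to_row_id[external_id]}
```

Running every label's data reference through a helper like this before building the import payload keeps the new ValueError from firing at serialization time.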

Fixes #1941

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Document change (fix typo or modifying any markdown files, code comments or anything in the examples folder only)

All Submissions

  • Have you followed the guidelines in our Contributing document?
  • Have you provided a description?
  • Are your changes properly formatted?

New Feature Submissions

  • Does your submission pass tests?
  • Have you added thorough tests for your new feature?
  • Have you commented your code, particularly in hard-to-understand areas?
  • Have you added a Docstring?

Changes to Core Features

  • Have you written new tests for your core changes, as applicable?
  • Have you successfully run tests with your changes locally?
  • Have you updated any code comments, as applicable?

Note

Low Risk
Adds pre-serialization validation and clearer errors only; valid uid/global_key import paths are unchanged.

Overview
NDJSON serialization (used for MAL/prediction and similar imports) now fails fast when a Label’s data row is referenced only via external_id, instead of surfacing a vague Pydantic DataRow validation error.

In NDJsonConverter.serialize, each label is checked before NDLabel.from_common: if label.data is present but has neither uid nor global_key, a ValueError is raised telling callers to use global_key or resolve external_id to a data row id.
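The check described above can be sketched in isolation as follows. This is a simplified approximation, not the code in the diff: `DataRef`, `LabelStub`, and `validate_labels` are stand-ins invented for illustration, and the error message wording is assumed.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class DataRef:
    """Stand-in for a Label's data container (not the SDK class)."""
    uid: Optional[str] = None
    global_key: Optional[str] = None
    external_id: Optional[str] = None


@dataclass
class LabelStub:
    """Stand-in for Label; only the `data` attribute matters here."""
    data: Optional[DataRef] = None


def validate_labels(labels: Iterable[LabelStub]) -> None:
    """Raise before serialization if a label has no importable reference."""
    for label in labels:
        data = label.data
        # Mirrors the described condition: data is present, but neither
        # uid nor global_key is set.
        if data is not None and not data.uid and not data.global_key:
            # Message text is an assumption, not copied from the PR.
            raise ValueError(
                "external_id is not supported for NDJSON annotation import; "
                "use global_key or resolve external_id to a data row id."
            )
```

Because the check runs before any NDJSON object is built, the caller sees a single actionable message rather than a model-level validation failure.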

A unit test asserts that MALPredictionImport.create_from_objects with data={"external_id": "..."} raises that message.

Reviewed by Cursor Bugbot for commit 23067a2. Bugbot is set up for automated code reviews on this repo.


Development

Successfully merging this pull request may close these issues.

Using external_id with Label container
