Skip to content

chore: allow optional whitespace around parameters similar to other language implementations#3530

Open
sungwy wants to merge 1 commit into
mainfrom
decimal-whitespace
Open

chore: allow optional whitespace around parameters similar to other language implementations#3530
sungwy wants to merge 1 commit into
mainfrom
decimal-whitespace

Conversation

@sungwy

@sungwy sungwy commented Jun 19, 2026

Copy link
Copy Markdown
Collaborator

Reference: apache/iceberg#16798

Rationale for this change

PyIceberg is a bit more strict in its parsing than the other language implementations. This allows json string representations like decimal( 9, 2 ) like the other libraries.

Are these changes tested?

Unit tested

Are there any user-facing changes?

Yes, more permissive metadata parsing by PyIceberg clients.

Copilot AI review requested due to automatic review settings June 19, 2026 18:56
@sungwy sungwy changed the title allow optional whitespace around parameters similar to other language implementations chore: allow optional whitespace around parameters similar to other language implementations Jun 19, 2026

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR makes PyIceberg’s decimal type parsing more permissive by allowing optional whitespace around decimal parameters, aligning behavior with other Iceberg language implementations.

Changes:

  • Update the decimal type parsing regex to tolerate whitespace around precision/scale and separators.
  • Add unit tests covering multiple whitespace variants for decimal(P,S) JSON deserialization.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
tests/test_types.py Adds parametrized tests to ensure DecimalType deserialization accepts optional whitespace variants.
pyiceberg/types.py Relaxes the decimal parsing regex to allow whitespace around precision/scale and punctuation.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread pyiceberg/types.py
from pyiceberg.utils.singleton import Singleton

DECIMAL_REGEX = re.compile(r"decimal\((\d+),\s*(\d+)\)")
DECIMAL_REGEX = re.compile(r"decimal\(\s*(\d+)\s*,\s*(\d+)\s*\)")
Comment thread pyiceberg/types.py
DECIMAL_REGEX = re.compile(r"decimal\((\d+),\s*(\d+)\)")
DECIMAL_REGEX = re.compile(r"decimal\(\s*(\d+)\s*,\s*(\d+)\s*\)")
FIXED = "fixed"
FIXED_PARSER = ParseNumberFromBrackets(FIXED)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we align this logic to the other regex too?

https://github.com/apache/iceberg/blob/41c8ee43b46017843b304241d250737fd10837af/api/src/main/java/org/apache/iceberg/types/Types.java#L66-L73

  private static final Pattern FIXED = Pattern.compile("fixed\\[\\s*(\\d+)\\s*\\]");
  private static final Pattern GEOMETRY_PARAMETERS =
      Pattern.compile("geometry\\s*(?:\\(\\s*([^)]*?)\\s*\\))?", Pattern.CASE_INSENSITIVE);
  private static final Pattern GEOGRAPHY_PARAMETERS =
      Pattern.compile(
          "geography\\s*(?:\\(\\s*([^,]*?)\\s*(?:,\\s*(\\w*)\\s*)?\\))?", Pattern.CASE_INSENSITIVE);
  private static final Pattern DECIMAL =
      Pattern.compile("decimal\\(\\s*(\\d+)\\s*,\\s*(\\d+)\\s*\\)");

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants