Skip to content

[CI] Add Java Format CI check for pull requests#2527

Open
Baunsgaard wants to merge 8 commits into
apache:mainfrom
Baunsgaard:format-check
Open

[CI] Add Java Format CI check for pull requests#2527
Baunsgaard wants to merge 8 commits into
apache:mainfrom
Baunsgaard:format-check

Conversation

@Baunsgaard

Copy link
Copy Markdown
Contributor

Add a workflow that enforces dev/CodeStyle_eclipse.xml on the Java files changed by a pull request. It applies the Eclipse formatter (via formatter-maven-plugin) to only the changed src/main/java and src/test/java files and fails if that produces any diff, printing the required changes.

Also add dev/format-changed.sh, a helper that runs the same scoped formatting locally on the files you changed relative to the base branch (plus staged/unstaged/untracked edits). This avoids a bare mvn formatter:format, which would reformat the entire tree.

The check is scoped to the PR diff rather than the whole tree because the existing sources are not yet fully formatter-clean; this lets the codebase converge file-by-file without failing unrelated PRs.

Add a workflow that enforces dev/CodeStyle_eclipse.xml on the Java files
changed by a pull request. It applies the Eclipse formatter (via
formatter-maven-plugin) to only the changed src/main/java and
src/test/java files and fails if that produces any diff, printing the
required changes.

Also add dev/format-changed.sh, a helper that runs the same scoped
formatting locally on the files you changed relative to the base branch
(plus staged/unstaged/untracked edits). This avoids a bare
`mvn formatter:format`, which would reformat the entire tree.

The check is scoped to the PR diff rather than the whole tree because the
existing sources are not yet fully formatter-clean; this lets the
codebase converge file-by-file without failing unrelated PRs.
Not for merge. Three scenarios to observe the check:
1. DMLRuntimeException.java: pre-existing file that is not formatter-clean,
   with a correctly-formatted method appended.
2. FormatProbeBroken.java: correctly-formatted file with one method that
   has blatant style violations.
3. FormatProbeClean.java: fully correctly-formatted new file.
@codecov

codecov Bot commented Jul 1, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 0% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 71.66%. Comparing base (c099a2f) to head (48b6eda).

Files with missing lines Patch % Lines
.../org/apache/sysds/runtime/DMLRuntimeException.java 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##               main    #2527   +/-   ##
=========================================
  Coverage     71.66%   71.66%           
- Complexity    49338    49343    +5     
=========================================
  Files          1580     1580           
  Lines        190516   190517    +1     
  Branches      37364    37364           
=========================================
+ Hits         136525   136536   +11     
+ Misses        43464    43456    -8     
+ Partials      10527    10525    -2     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@janniklinde

Copy link
Copy Markdown
Contributor

I like this approach, but I'm wondering if it is too strict. It would imply that as a contributor, any slightly modified file must be completely reformatted by the PR. Second, different IDEs may apply slightly different formatting under the same xml so the only reliable way to reformat is using the exact same formatter.

@Baunsgaard

Baunsgaard commented Jul 2, 2026

Copy link
Copy Markdown
Contributor Author

I like this approach, but I'm wondering if it is too strict. It would imply that as a contributor, any slightly modified file must be completely reformatted by the PR. Second, different IDEs may apply slightly different formatting under the same xml so the only reliable way to reformat is using the exact same formatter.

Ya, the current solution is a bit too strict, i wanted to do something where we only fail if the there is a diff in conflict with the changes proposed, and then the CI job here would give a diff back that only edits those lines, not anything else in the file.

I will try some more to make it streamlined.

About different IDEs, I think it should just be consistent and one source of truth, this seems Reliable with maven and this proposed solution.

The Eclipse formatter only operates on whole files, so checking changed
files reformatted their pre-existing (not-yet-clean) lines and produced an
over-aggressive diff on lines the PR never touched.

Add dev/format_changed.py, which formats each changed file but keeps only
the formatting changes that fall on the lines the PR actually edited (via
git diff line ranges intersected with difflib opcodes). It supports:
  --check: fail only if an edited line is mis-formatted (used by CI)
  --fix:   apply formatting to only the edited-line regions

Wire javaFormat.yml to run the --check mode and make dev/format-changed.sh
delegate to --fix, so both CI and the local helper share identical
line-scoped behavior.
Add varied scenarios to verify the Java Format check flags only PR-edited
lines:
- FormatProbeErrors: new file with four distinct code-style violations
- FormatProbeOk2: new fully-compliant file (must pass)
- FormatProbeClean: append one clean and one broken method (only the
  broken one must be flagged)
- DMLRuntimeException: break only the previously-clean describe line so its
  pre-existing unformatted lines must remain unflagged
Introduce dev/format-exclude.txt, a list of glob patterns for files whose
non-standard layout is intentional and must not be reformatted (e.g.
DMLConfig.java's hand-aligned constants). dev/format_changed.py loads the
list and skips matching files in both --check and --fix, matching each
pattern against the repo-relative path and the bare file name.
Address review findings on the line-scoped format check:

- Always restore working-tree files via try/finally so an mvn failure or
  interrupt never leaves whole-file reformatting on disk (previously the
  restore only ran on the happy path, risking loss of uncommitted/untracked
  content in --check and --fix).
- Fail loudly on git errors instead of swallowing stderr and reporting a
  false "clean" pass; validate the resolved base ref up front.
- Reject unknown flags and extra positional args instead of silently
  treating them as the base ref.
- Extract pure helpers (parse_hunks, line_scoped_result, overlaps) and add
  dev/tests/test_format_changed.py; run it in CI via a pytest step. Batch the
  per-file git diff into a single call.
- Byte-exact file IO (newline="") so restores never rewrite line endings;
  use context-managed reads/writes; dedupe the src-root prefix; emit a
  friendly message when mvn is unavailable.
- Add permissions: contents: read and expand the workflow trigger paths to
  cover the checker and its tests.
Consolidate the two Java style gates into one workflow file with two jobs
so each still reports its own status:
- java_checkstyle: whole-tree checkstyle, on push and pull_request
- java_format: line-scoped Eclipse-formatter check plus its pytest, gated to
  pull_request since it needs the PR base commit

Remove the standalone javaFormat.yml.
@Baunsgaard

Copy link
Copy Markdown
Contributor Author

@janniklinde want to take a look? now it is targeting the exact diff you have as subparts of the files you edited.

The Apache RAT license check flagged dev/format-exclude.txt for a missing
license header. Add the standard ASF header as comment lines; the exclude
parser already ignores '#' lines, so behavior is unchanged.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

2 participants