Skip to content

feat(neo4j): Neo4j graph output, schema contract, and incremental Bolt writer#152

Open
rahlk wants to merge 1 commit into
mainfrom
feature/neo4j
Open

feat(neo4j): Neo4j graph output, schema contract, and incremental Bolt writer#152
rahlk wants to merge 1 commit into
mainfrom
feature/neo4j

Conversation

@rahlk

@rahlk rahlk commented Jun 20, 2026

Copy link
Copy Markdown
Collaborator

Summary

Ports the codeanalyzer-typescript Neo4j feature to Java, with the same CLI entrypoints. codeanalyzer can now project the analysis IR into a Neo4j property graph instead of analysis.json:

  • --emit neo4j → writes a self-contained graph.cypher snapshot, or (with --neo4j-uri) pushes incrementally to a live Neo4j over Bolt.
  • --emit schema → emits the machine-readable graph contract (schema.json).

New flags: --emit, --app-name, --neo4j-uri, --neo4j-user, --neo4j-password, --neo4j-database.

What's inside

com.ibm.cldk.neo4j package

  • GraphProjector — pure projection of the symbol table (+ level-2 call graph) to graph rows. Type/Callable share a :Symbol identity; call sites, fields, parameters, variables, enum constants and record components are first-class nodes; annotations/packages are shared; entrypoints are a marker label; every unit-owned node carries a _unit provenance property.
  • CypherWriter — self-contained graph.cypher snapshot (constraints, scoped wipe, batched UNWIND … MERGE).
  • BoltWriter — live incremental push over Bolt: diffs each compilation unit's content_hash, replaces only changed units (idempotent MERGE), prunes vanished units on a full run. Uses neo4j-java-driver 4.4.x (kept JDK 11 / native-image compatible).
  • SchemaCatalog + Schema — the in-repo graph contract (labels, relationships, typed properties, DDL); --emit schema serializes it.

Tests

  • Neo4jSchemaConformanceTest (no container) — anti-drift guard: the projector never emits a label/relationship/property the catalog doesn't declare, and schema.neo4j.json is current.
  • Neo4jBoltWriterTest (opt-in, Testcontainers Neo4j) — full push, idempotent re-push, and orphan pruning against a real database. Runs only when RUN_CONTAINER_TESTS is set.

Docs / release / packaging

  • README: install one-liner + Neo4j graph output section + refreshed --help.
  • release.yml: publishes codeanalyzer.jar, schema.json and the installer as release assets, with cargo-dist-style release notes.
  • packaging/install/codeanalyzer-installer.sh: curl/wget installer that fetches the jar and drops a codeanalyzer launcher on PATH.
  • neo4j-schema.drawio: diagram of the emitted property-graph schema.
  • schema.neo4j.json: checked-in graph contract. Version bumped 2.3.8 → 2.4.0.

Verification

  • compileJava / compileTestJava clean; spotlessCheck clean.
  • --emit schema produces the contract; --emit neo4j produces a valid graph.cypher on the call-graph-test fixture.
  • Neo4jSchemaConformanceTest passes.
  • The Bolt Testcontainers suite is opt-in (needs Docker/Podman): RUN_CONTAINER_TESTS=1 ./gradlew test.

Port the codeanalyzer-typescript 0.4.0 Neo4j feature to Java with the same
arg entrypoints:

  --emit json|neo4j|schema  (default json)
  --app-name, --neo4j-uri, --neo4j-user, --neo4j-password, --neo4j-database

New com.ibm.cldk.neo4j package:
  - GraphProjector: pure projection of the symbol table (+ level-2 call graph)
    to graph rows. Type/Callable share a :Symbol identity; call sites, fields,
    parameters, variables, enum constants, record components are first-class
    nodes; annotations/packages are shared; entrypoints are a marker label;
    every unit-owned node carries a _unit provenance prop.
  - CypherWriter: self-contained graph.cypher snapshot (constraints, scoped
    wipe, batched UNWIND/MERGE).
  - BoltWriter: live incremental push over Bolt — diffs each compilation unit's
    content_hash, replaces only changed units (idempotent MERGE), prunes
    vanished units on a full run. Uses neo4j-java-driver 4.4.x (JDK 11/native).
  - SchemaCatalog + Schema: the in-repo graph contract (labels, relationships,
    typed properties, DDL); --emit schema serializes it to schema.json.

Tests:
  - Neo4jSchemaConformanceTest (no container): anti-drift guard asserting the
    projector never emits a label/rel/property the catalog doesn't declare, and
    that schema.neo4j.json is current.
  - Neo4jBoltWriterTest (opt-in, Testcontainers Neo4j): full push, idempotent
    re-push, and orphan pruning against a real database. Runs only when
    RUN_CONTAINER_TESTS is set.

Docs/release/packaging:
  - README: install one-liner + Neo4j graph output section + refreshed --help.
  - release.yml: publish codeanalyzer.jar, schema.json and the installer as
    release assets, with cargo-dist-style release notes.
  - packaging/install/codeanalyzer-installer.sh: curl/wget installer that fetches
    the jar and drops a `codeanalyzer` launcher on PATH.
  - neo4j-schema.drawio: diagram of the emitted property-graph schema.
  - schema.neo4j.json: checked-in graph contract. Bump version to 2.4.0.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant