
feat(neo4j): project analysis.json into a Neo4j property graph #33

Open
rahlk wants to merge 2 commits into main from feature/neo4j


@rahlk rahlk commented Jun 20, 2026

Contributor

Summary

Ports the codeanalyzer-typescript v0.4.0 Neo4j feature to Python, keeping the same CLI arg entrypoints. One in-memory analysis can now be emitted three ways via --emit: the canonical analysis.json (default), a Neo4j property graph, or the version-stamped schema contract.

What's included

New codeanalyzer/neo4j/ package (mirrors src/build/neo4j/):

  • catalog.py — declarative schema catalog (single source of truth), SCHEMA_VERSION = 1.0.0
  • project.py — pure projection PyApplication → graph rows
  • cypher.py — self-contained graph.cypher snapshot writer
  • bolt.py — incremental live-push writer (lazy neo4j import; content-hash module diffing, vanished-decl cleanup, full-run orphan prune)
  • rows.py / schema.py / emit.py
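
The content-hash module diffing that bolt.py is described as doing can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `module_hash`, `diff_modules`, the choice of SHA-256, and the shape of the stored hash map are all assumptions made for the example.

```python
import hashlib


def module_hash(source: str) -> str:
    """Hash a module's source text (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def diff_modules(stored: dict[str, str], current_sources: dict[str, str]):
    """Compare hashes recorded by a previous push with the current analysis.

    stored          -- hypothetical map of module path -> hash from the last push
    current_sources -- hypothetical map of module path -> current source text
    Returns (changed, unchanged, vanished) as sorted lists of module paths.
    """
    current = {path: module_hash(src) for path, src in current_sources.items()}
    # New or edited modules need their rows rewritten.
    changed = sorted(p for p, h in current.items() if stored.get(p) != h)
    # Modules with a matching hash can be skipped entirely.
    unchanged = sorted(p for p, h in current.items() if stored.get(p) == h)
    # Modules present last time but missing now are candidates for pruning.
    vanished = sorted(p for p in stored if p not in current)
    return changed, unchanged, vanished
```

Under this scheme only the `changed` modules would be re-pushed, and the `vanished` ones would have their nodes removed; how the real writer stores and compares hashes may differ.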

CLI: --emit {json,neo4j,schema}, --app-name, --neo4j-uri/-user/-password/-database. -i/--input is now optional for --emit schema. -f/--format msgpack is retained on the json path.
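
As a rough picture of how these flags fit together, here is a sketch using argparse. The real CLI may be built with a different library. The `--output` long form, the `json` choice for `-f/--format`, the defaults, and the `build_parser` / `parse_args` names are assumptions for illustration only.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="codeanalyzer")
    # -i/--input is declared optional; it is validated after parsing.
    parser.add_argument("-i", "--input", default=None)
    # "--output" as the long form of -o is an assumption.
    parser.add_argument("-o", "--output", default=None)
    parser.add_argument("--emit", choices=["json", "neo4j", "schema"], default="json")
    # Only "msgpack" is named in the PR; "json" as the other choice is assumed.
    parser.add_argument("-f", "--format", choices=["json", "msgpack"], default="json")
    parser.add_argument("--app-name", default=None)
    parser.add_argument("--neo4j-uri", default=None)
    parser.add_argument("--neo4j-user", default=None)
    parser.add_argument("--neo4j-password", default=None)
    parser.add_argument("--neo4j-database", default=None)
    return parser


def parse_args(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    # The schema contract needs no project, so input is only required otherwise.
    if args.emit != "schema" and args.input is None:
        parser.error("-i/--input is required unless --emit schema is used")
    return args
```

With this shape, `parse_args(["--emit", "schema"])` succeeds without an input path, while `parse_args(["--emit", "neo4j"])` exits with a usage error.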

Emit shapes

  • --emit neo4j -o ./out → ./out/graph.cypher (no driver needed)
  • --emit neo4j --neo4j-uri bolt://… → incremental Bolt push (needs the [neo4j] extra)
  • --emit schema → schema.json (no project required; checked in as schema.neo4j.json)
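
To make the driver-free snapshot path concrete, the sketch below renders graph rows as plain Cypher text. The actual layout of graph.cypher is not shown in this PR description, so the row shape, the `id` key property, the MERGE-then-SET pattern, and the function names are all assumptions.

```python
import json


def cypher_literal(value) -> str:
    # JSON encoding is used as an approximation of Cypher literals for
    # strings, numbers, booleans, and lists; this is a simplification.
    return json.dumps(value)


def node_statement(label: str, key: str, props: dict) -> str:
    """Render one hypothetical node row as an idempotent MERGE statement."""
    prop_map = ", ".join(
        f"{name}: {cypher_literal(value)}" for name, value in sorted(props.items())
    )
    return f"MERGE (n:{label} {{id: {cypher_literal(key)}}}) SET n += {{{prop_map}}};"


def render_snapshot(rows: list[dict]) -> str:
    """Join all statements into one text blob that needs no driver to produce."""
    return "\n".join(
        node_statement(row["label"], row["id"], row["props"]) for row in rows
    ) + "\n"
```

Because the output is only text, a file like this could be loaded later with any Cypher client, which is consistent with the "no driver needed" note above.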

Tests

  • test/test_neo4j_schema.py — always-on anti-drift conformance guard (emitter never produces a label/rel/property the catalog doesn't declare; checked-in schema.neo4j.json stays current)
  • test/test_neo4j_bolt.py — opt-in (RUN_CONTAINER_TESTS=1) Neo4j Testcontainers integration test: full push, idempotent re-push, vanished-module prune
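
The anti-drift guard boils down to checking every emitted row against the catalog. A minimal sketch of that check follows. The catalog layout (label mapped to a set of allowed property names), the row shape, and the `undeclared_items` helper are hypothetical and stand in for whatever test/test_neo4j_schema.py actually does.

```python
def undeclared_items(rows: list[dict], catalog: dict[str, set[str]]) -> list[str]:
    """Return a description of every label or property the catalog lacks.

    rows    -- hypothetical emitted rows: {"label": str, "props": dict}
    catalog -- hypothetical catalog: label -> set of declared property names
    """
    problems = []
    for row in rows:
        allowed = catalog.get(row["label"])
        if allowed is None:
            problems.append(f"undeclared label: {row['label']}")
            continue
        for prop in sorted(set(row["props"]) - allowed):
            problems.append(f"undeclared property: {row['label']}.{prop}")
    return problems


# Example catalog and rows, invented for illustration.
EXAMPLE_CATALOG = {"Decorator": {"id", "name"}, "Variable": {"id", "name", "type"}}
EXAMPLE_ROWS = [
    {"label": "Decorator", "props": {"id": "d1", "name": "cache"}},
    {"label": "Variable", "props": {"id": "v1", "scope": "local"}},
    {"label": "Interface", "props": {"id": "i1"}},
]
```

A conformance test would then assert that the returned list is empty for the rows produced from a fixture project, so any drift between emitter and catalog fails loudly.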

Packaging / release / docs

  • packaging/install/codeanalyzer-installer.sh: curl | sh installer (uv / pipx / pip)
  • release.yml — syncs schema.neo4j.json from source, publishes schema.json + the installer as Release assets
  • README consolidated (it was duplicated) with Neo4j docs; CHANGELOG 0.2.0; version bump 0.1.15 → 0.2.0
  • schema-uml.drawio — UML of the PyApplication containment tree

Schema relationship to TypeScript

Same design DNA, intentionally a structural subset driven by the Python IR: shared :Symbol merge label, schema_version stamp, first-class CallSite/Variable/Attribute/Decorator nodes, _module provenance, identical DDL and incremental algorithm. Python has 10 node labels / 11 relationships vs TS's 14 / 14 (no interfaces/enums/type-aliases/namespaces, exports, or entrypoint markers — those aren't in the IR yet).

Verification

8/8 non-container tests pass (3 existing CLI + 5 schema). --emit schema and --emit neo4j exercised end-to-end (valid graph.cypher produced). The live bolt test runs wherever Docker/Podman is available.

Port the codeanalyzer-typescript v0.4.0 Neo4j feature to Python under the
neo4j feature branch, with the same CLI arg entrypoints.

- codeanalyzer/neo4j: catalog (schema source of truth), project (pure IR ->
  graph rows), cypher snapshot writer, incremental Bolt writer, and the
  output-agnostic rows intermediate. Lazy neo4j-driver import keeps it off the
  default json path.
- CLI: --emit {json,neo4j,schema}, --app-name, --neo4j-uri/-user/-password/
  -database; -i/--input now optional for --emit schema.
- --emit neo4j writes a self-contained graph.cypher, or pushes incrementally to
  a live Neo4j over Bolt; --emit schema emits the version-stamped schema.json
  contract (checked in as schema.neo4j.json).
- Tests: schema-conformance (always runs, anti-drift guard) + opt-in Neo4j
  Testcontainers bolt integration test (RUN_CONTAINER_TESTS=1).
- packaging/install/codeanalyzer-installer.sh: curl|sh installer (uv/pipx/pip).
- release.yml: sync schema.neo4j.json + publish schema.json and installer as
  release assets. README/CHANGELOG updates; schema-uml.drawio. Version 0.2.0.

The four --neo4j-* connection options now fall back to the standard
NEO4J_URI / NEO4J_USERNAME / NEO4J_PASSWORD / NEO4J_DATABASE environment
variables when the flag is omitted (explicit flag wins). Prefer the env var
for the password so it doesn't land in shell history or the process list.
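
The flag-over-environment precedence described above can be sketched as below. The environment variable names are the ones listed in the commit message. The option keys, the `resolve_connection` helper, and the injectable `environ` parameter are assumptions for illustration.

```python
import os

# Option name (hypothetical key) -> environment variable named in the commit.
ENV_FALLBACKS = {
    "neo4j_uri": "NEO4J_URI",
    "neo4j_user": "NEO4J_USERNAME",
    "neo4j_password": "NEO4J_PASSWORD",
    "neo4j_database": "NEO4J_DATABASE",
}


def resolve_connection(flags: dict, environ=None) -> dict:
    """Return connection settings, preferring explicit flags over env vars."""
    env = os.environ if environ is None else environ
    resolved = {}
    for option, env_name in ENV_FALLBACKS.items():
        explicit = flags.get(option)
        # An explicit flag wins; otherwise fall back to the environment.
        resolved[option] = explicit if explicit is not None else env.get(env_name)
    return resolved
```

For example, with `NEO4J_PASSWORD` exported and only `--neo4j-uri` passed, the password would come from the environment and never appear on the command line.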