From 08f1c17f07de06492a5f83e6d7763e2699ff9527 Mon Sep 17 00:00:00 2001 From: stacknil Date: Fri, 3 Jul 2026 18:44:36 +0800 Subject: [PATCH] docs(release): add v0.5 evidence explainability notes --- CHANGELOG.md | 42 ++++++++++++--- README.md | 8 +-- docs/release-v0.5.0.md | 107 +++++++++++++++++++++++++++++++++++++++ docs/report-artifacts.md | 19 ++++++- docs/reviewer-path.md | 21 ++++++++ 5 files changed, 185 insertions(+), 12 deletions(-) create mode 100644 docs/release-v0.5.0.md diff --git a/CHANGELOG.md b/CHANGELOG.md index 513e38b..5f4a3f1 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -6,14 +6,7 @@ All notable user-visible changes should be recorded here. ### Added -- Added sanitized golden `report.md` / `report.json` regression fixtures to lock report contracts. -- Added `schema` and `schema_version` fields to `report.json` so downstream tooling can identify the report artifact contract. -- Added `verdict_boundary` to JSON findings and advanced the report artifact contract to `loglens.report.v2`. -- Expanded parser coverage for `Accepted publickey` and selected `pam_faillock` / `pam_sss` variants. -- Added a 150-line sanitized mixed auth corpus fixture covering Ubuntu / Debian-style `auth.log`, RHEL-family `secure`-style syslog, unknown lines, malformed source IPs, and blank-line handling. -- Added a reviewer-facing parser coverage JSON artifact for the mixed auth corpus. -- Added compact host-level summaries for multi-host reports. -- Added optional CSV export for findings and warnings when explicitly requested. +- None yet. ### Changed @@ -25,10 +18,43 @@ All notable user-visible changes should be recorded here. ### Docs +- None yet. + +## v0.5.0 + +### Added + +- Stabilized the JSON report artifact contract as `loglens.report.v2` with + `schema_version` set to `2`. 
+- Added finding explainability fields to JSON findings, including `rule_id`, + `subject_kind`, `subject`, `grouping_key`, `window_start`, `window_end`, + `threshold`, `observed_count`, `evidence_event_ids`, and `verdict_boundary`. +- Added sanitized golden `report.md`, `report.json`, `findings.csv`, and + `warnings.csv` regression fixtures to lock report contracts. +- Added a 150-line sanitized mixed auth corpus fixture covering Ubuntu / + Debian-style `auth.log`, RHEL-family `secure`-style syslog, unknown lines, + malformed source IPs, and blank-line handling. +- Added a reviewer-facing parser coverage JSON artifact for the mixed auth + corpus. + +### Changed + +- Made parser coverage telemetry and finding explainability part of the + release-facing review path instead of internal-only implementation detail. + +### Fixed + +- None. + +### Docs + +- Added release notes for the v0.5 Evidence Explainability Release. - Added a one-page incident-style case that traces raw SSH evidence through normalized events and finding fields to a bounded conclusion. - Added a rule-by-rule false-positive taxonomy for NAT, bastion, internal scanner, lab replay, scheduled admin task, and shared-account contexts. +- Added forensic-style case-study coverage for Linux auth brute-force evidence + interpretation. - Expanded the parser conformance matrix with explicit Ubuntu / Debian `auth.log`, RHEL-family `secure`, `journalctl --output=short-full`, `sshd`, `sudo`, `pam_unix`, `pam_faillock`, and `pam_sss` style coverage. 
diff --git a/README.md b/README.md index be48129..592ad21 100644 --- a/README.md +++ b/README.md @@ -9,13 +9,15 @@ It parses `auth.log` / `secure`-style syslog input and `journalctl --output=shor ## Example Finding -A compact finding summary is a bounded triage signal, not attribution: +A compact finding summary is a bounded triage signal, not a compromise verdict +or attribution: ```json { "rule_id": "brute_force", "subject_kind": "source_ip", "subject": "203.0.113.10", + "grouping_key": "source_ip", "window_start": "2026-03-10 08:11:22", "window_end": "2026-03-10 08:18:05", "threshold": 5, @@ -29,7 +31,7 @@ A compact finding summary is a bounded triage signal, not attribution: LogLens is an MVP / early release. The repository is stable enough for public review, local experimentation, and extension, but the parser and detection coverage are intentionally narrow. -Reviewing the project quickly? Start with [`docs/reviewer-path.md`](./docs/reviewer-path.md), [`docs/reviewer-brief.md`](./docs/reviewer-brief.md), and the [`quality gates map`](./docs/quality-gates.md). For detection reasoning, follow the [`one-page incident-style case`](./docs/incident-style-case.md), then use the full [`Linux auth brute-force case study`](./docs/case-study-linux-auth-bruteforce.md), [`rule catalog`](./docs/rule-catalog.md), and [`false-positive taxonomy`](./docs/false-positive-taxonomy.md) for depth. For local scale expectations, see the [`performance envelope`](./docs/performance-envelope.md). +Reviewing the project quickly? Start with [`docs/reviewer-path.md`](./docs/reviewer-path.md), [`docs/reviewer-brief.md`](./docs/reviewer-brief.md), and the [`v0.5 Evidence Explainability release note`](./docs/release-v0.5.0.md). The [`quality gates map`](./docs/quality-gates.md) links claims to tests and fixtures. 
For detection reasoning, follow the [`one-page incident-style case`](./docs/incident-style-case.md), then use the full [`Linux auth brute-force case study`](./docs/case-study-linux-auth-bruteforce.md), [`rule catalog`](./docs/rule-catalog.md), and [`false-positive taxonomy`](./docs/false-positive-taxonomy.md) for depth. For local scale expectations, see the [`performance envelope`](./docs/performance-envelope.md). ## Why This Project Exists @@ -56,7 +58,7 @@ LogLens includes two minimal GitHub Actions workflows: - `CI` builds and tests the project on `ubuntu-latest` and `windows-latest` - `CodeQL` runs GitHub code scanning for C/C++ on pushes, pull requests, and a weekly schedule -Both workflows are intended to stay stable enough to require on pull requests to `main`. Regression coverage is backed by sanitized parser fixture matrices plus golden report-contract fixtures for `report.md`, `report.json`, and optional CSV outputs. Release-facing documentation is split across [`CHANGELOG.md`](./CHANGELOG.md), [`docs/release-process.md`](./docs/release-process.md), [`docs/release-v0.1.0.md`](./docs/release-v0.1.0.md), [`docs/release-v0.3.0.md`](./docs/release-v0.3.0.md), and the repository's GitHub release notes. The repository hardening note is in [`docs/repo-hardening.md`](./docs/repo-hardening.md), and vulnerability reporting guidance is in [`SECURITY.md`](./SECURITY.md). +Both workflows are intended to stay stable enough to require on pull requests to `main`. Regression coverage is backed by sanitized parser fixture matrices plus golden report-contract fixtures for `report.md`, `report.json`, and optional CSV outputs. Release-facing documentation is split across [`CHANGELOG.md`](./CHANGELOG.md), [`docs/release-process.md`](./docs/release-process.md), [`docs/release-v0.1.0.md`](./docs/release-v0.1.0.md), [`docs/release-v0.3.0.md`](./docs/release-v0.3.0.md), [`docs/release-v0.5.0.md`](./docs/release-v0.5.0.md), and the repository's GitHub release notes. 
The repository hardening note is in [`docs/repo-hardening.md`](./docs/repo-hardening.md), and vulnerability reporting guidance is in [`SECURITY.md`](./SECURITY.md). ## Threat Model diff --git a/docs/release-v0.5.0.md b/docs/release-v0.5.0.md new file mode 100644 index 0000000..cb11ac1 --- /dev/null +++ b/docs/release-v0.5.0.md @@ -0,0 +1,107 @@ +# LogLens v0.5.0 + +LogLens v0.5.0 is the Evidence Explainability Release. + +This release makes the path from raw authentication evidence to bounded triage +findings easier to review. The focus is not adding more rules; it is making +parser behavior, report contracts, evidence IDs, and non-claims visible enough +for reviewers to verify. + +## Highlights + +- Stabilized the JSON report contract as `loglens.report.v2` with + `schema_version` set to `2`. +- Added stable finding explainability fields so a finding can be traced back to + its rule context and source-line evidence. +- Added a sanitized 150-line mixed auth corpus and checked-in parser coverage + artifact for dirty syslog-style input. +- Added false-positive taxonomy and forensic-style case-study documentation for + evidence interpretation. + +## Stable JSON contract + +`report.json` now identifies the report artifact contract with: + +- `schema`: `loglens.report.v2` +- `schema_version`: `2` + +Finding objects expose a stable explainability surface: + +- `rule_id` +- `subject_kind` +- `subject` +- `grouping_key` +- `window_start` +- `window_end` +- `threshold` +- `observed_count` +- `evidence_event_ids` +- `verdict_boundary` + +`evidence_event_ids` are deterministic local IDs such as `line:1`. They help a +reviewer trace the selected rule window back to source log lines without +claiming global event identity. + +`verdict_boundary` is a machine-readable non-claim token. It keeps report output +aligned with LogLens's triage scope instead of letting a finding read like an +incident conclusion. 
+ +The contract is backed by golden fixtures for `report.md`, `report.json`, +`findings.csv`, and `warnings.csv` in +[`tests/fixtures/report_contracts`](../tests/fixtures/report_contracts). Parser +or rule changes that alter those artifacts must update the snapshots +explicitly. + +## Parser observability artifacts + +This release adds two reviewer-facing mixed-input artifacts: + +- [`assets/mixed_auth_corpus.log`](../assets/mixed_auth_corpus.log) +- [`assets/mixed_auth_parser_coverage.json`](../assets/mixed_auth_parser_coverage.json) + +The corpus is sanitized and intentionally mixed: Ubuntu / Debian-style +`auth.log`, RHEL-family `secure`-style syslog, unsupported lines, malformed +source IPs, and blank-line handling are represented together. + +The parser coverage artifact lets reviewers inspect parser observability without +running the tool first. It exposes fields such as `total_input_lines`, +`parsed_lines`, `unparsed_lines`, `failure_categories`, and +`top_unknown_patterns`. + +## Evidence interpretation docs + +The release-facing review path now includes: + +- [`docs/parser-contract.md`](./parser-contract.md) for supported inputs, + normalized event families, parser warning categories, and detection signal + boundaries. +- [`docs/report-artifacts.md`](./report-artifacts.md) for JSON, Markdown, and + CSV artifact contracts. +- [`docs/false-positive-taxonomy.md`](./false-positive-taxonomy.md) for benign + or ambiguous contexts such as NAT, bastion, internal scanner, lab replay, + scheduled admin task, and shared account behavior. +- [`docs/incident-style-case.md`](./incident-style-case.md) for a compact trace + from raw log lines to normalized events, finding fields, and bounded + conclusion. +- [`docs/case-study-linux-auth-bruteforce.md`](./case-study-linux-auth-bruteforce.md) + for a longer forensic-style evidence explanation. + +## Non-claims + +LogLens findings remain bounded triage signals. 
This release does not claim: + +- a compromise verdict +- attribution +- a blocking recommendation +- cross-host correlation + +In practical terms, a finding can show that supported evidence met a configured +rule threshold. It does not decide whether a host was compromised, who operated +the source, whether an address should be blocked, or whether activity across +hosts is related. + +## Upgrade notes + +No CLI migration is required for local users. Downstream consumers of +`report.json` should key off `schema` and `schema_version`, and should update +their snapshots if they depend on the shape of finding objects. diff --git a/docs/report-artifacts.md b/docs/report-artifacts.md index 710128e..b746896 100644 --- a/docs/report-artifacts.md +++ b/docs/report-artifacts.md @@ -42,6 +42,23 @@ The JSON report keeps parser observability visible next to findings: Finding objects contain `rule_id`, `rule`, `subject_kind`, `subject`, `grouping_key`, `threshold`, `observed_count`, `event_count`, `window_start`, `window_end`, `evidence_event_ids`, `verdict_boundary`, `usernames`, and `summary`. +The stable finding explainability surface for `loglens.report.v2` is: + +- `rule_id` +- `subject_kind` +- `subject` +- `grouping_key` +- `window_start` +- `window_end` +- `threshold` +- `observed_count` +- `evidence_event_ids` +- `verdict_boundary` + +These fields are release-facing contract fields. Parser or rule changes that +alter their names, meanings, values, or presence must update the golden report +fixtures explicitly. + `evidence_event_ids` are deterministic local event identifiers derived from the source line number, formatted as `line:`. They let reviewers trace a finding back to the normalized input events that satisfied the rule window without implying global event identity. 
`verdict_boundary` is a stable token that states what the finding must not be @@ -90,7 +107,7 @@ The report contracts are backed by generated fixture artifacts: | [`multi_host_syslog_legacy`](../tests/fixtures/report_contracts/multi_host_syslog_legacy) | `report.md`, `report.json`, `findings.csv`, `warnings.csv` | | [`multi_host_journalctl_short_full`](../tests/fixtures/report_contracts/multi_host_journalctl_short_full) | `report.md`, `report.json`, `findings.csv`, `warnings.csv` | -The enforcement lives in [`tests/test_report_contracts.cpp`](../tests/test_report_contracts.cpp). Parser or rule changes that alter report artifacts must update these snapshots explicitly. The focused report writer tests live in [`tests/test_report.cpp`](../tests/test_report.cpp). +The enforcement lives in [`tests/test_report_contracts.cpp`](../tests/test_report_contracts.cpp). Parser or rule changes that alter report artifacts must update these snapshots explicitly. This includes changes to stable finding explainability fields, parser coverage fields, warning categories, CSV columns, or Markdown report layout. The focused report writer tests live in [`tests/test_report.cpp`](../tests/test_report.cpp). ## Boundaries diff --git a/docs/reviewer-path.md b/docs/reviewer-path.md index be4989e..1673c7d 100644 --- a/docs/reviewer-path.md +++ b/docs/reviewer-path.md @@ -7,6 +7,7 @@ This path is for reviewers who want to understand LogLens quickly without readin | Review question | Start here | Good stopping point | | --- | --- | --- | | What is LogLens? | [`README.md`](../README.md) and [`docs/reviewer-brief.md`](./reviewer-brief.md) | Can state scope, supported inputs, outputs, and non-goals | +| What changed in v0.5? | [`docs/release-v0.5.0.md`](./release-v0.5.0.md) | Can explain the Evidence Explainability Release theme and its non-claims | | What log formats are supported? 
| [`docs/parser-contract.md`](./parser-contract.md) | Can name `syslog_legacy` and `journalctl_short_full` behavior | | What artifacts does it produce? | [`docs/report-artifacts.md`](./report-artifacts.md) and report-contract fixtures | Can inspect Markdown, JSON, and optional CSV outputs | | How do rules use evidence? | [`docs/rule-catalog.md`](./rule-catalog.md) | Can explain grouping keys, windows, thresholds, and unsupported-evidence boundaries | @@ -23,6 +24,7 @@ Read: - [`README.md`](../README.md) - [`docs/reviewer-brief.md`](./reviewer-brief.md) +- [`docs/release-v0.5.0.md`](./release-v0.5.0.md) Confirm: @@ -35,6 +37,23 @@ Core review lens: > Parser observability > silent detection claims. +## v0.5 release-facing route + +Start with [`docs/release-v0.5.0.md`](./release-v0.5.0.md), then inspect: + +- [`docs/parser-contract.md`](./parser-contract.md) +- [`docs/report-artifacts.md`](./report-artifacts.md) +- [`tests/fixtures/report_contracts/syslog_legacy/report.json`](../tests/fixtures/report_contracts/syslog_legacy/report.json) +- [`assets/mixed_auth_corpus.log`](../assets/mixed_auth_corpus.log) +- [`assets/mixed_auth_parser_coverage.json`](../assets/mixed_auth_parser_coverage.json) +- [`docs/false-positive-taxonomy.md`](./false-positive-taxonomy.md) +- [`docs/case-study-linux-auth-bruteforce.md`](./case-study-linux-auth-bruteforce.md) + +Good stopping point: the reviewer can name the stable finding explainability +fields, explain how parser coverage remains visible for unknown lines, and state +that findings are bounded triage signals with no compromise verdict, +attribution, blocking recommendation, or cross-host correlation claim. 
+ ## 5-minute artifact review Inspect: @@ -43,8 +62,10 @@ Inspect: - [`assets/sample_journalctl_short_full.log`](../assets/sample_journalctl_short_full.log) - [`tests/fixtures/report_contracts/syslog_legacy/report.md`](../tests/fixtures/report_contracts/syslog_legacy/report.md) - [`tests/fixtures/report_contracts/syslog_legacy/report.json`](../tests/fixtures/report_contracts/syslog_legacy/report.json) +- [`docs/release-v0.5.0.md`](./release-v0.5.0.md) - [`docs/report-artifacts.md`](./report-artifacts.md) - [`docs/parser-contract.md`](./parser-contract.md) +- [`assets/mixed_auth_corpus.log`](../assets/mixed_auth_corpus.log) - [`assets/mixed_auth_parser_coverage.json`](../assets/mixed_auth_parser_coverage.json) - [`docs/quality-gates.md`](./quality-gates.md) - [`docs/incident-style-case.md`](./incident-style-case.md)