42 changes: 34 additions & 8 deletions CHANGELOG.md
@@ -6,14 +6,7 @@ All notable user-visible changes should be recorded here.

### Added

- Added sanitized golden `report.md` / `report.json` regression fixtures to lock report contracts.
- Added `schema` and `schema_version` fields to `report.json` so downstream tooling can identify the report artifact contract.
- Added `verdict_boundary` to JSON findings and advanced the report artifact contract to `loglens.report.v2`.
- Expanded parser coverage for `Accepted publickey` and selected `pam_faillock` / `pam_sss` variants.
- Added a 150-line sanitized mixed auth corpus fixture covering Ubuntu / Debian-style `auth.log`, RHEL-family `secure`-style syslog, unknown lines, malformed source IPs, and blank-line handling.
- Added a reviewer-facing parser coverage JSON artifact for the mixed auth corpus.
- Added compact host-level summaries for multi-host reports.
- Added optional CSV export for findings and warnings when explicitly requested.
- None yet.

### Changed

@@ -25,10 +18,43 @@ All notable user-visible changes should be recorded here.

### Docs

- None yet.

## v0.5.0

### Added

- Stabilized the JSON report artifact contract as `loglens.report.v2` with
`schema_version` set to `2`.
- Added finding explainability fields to JSON findings, including `rule_id`,
`subject_kind`, `subject`, `grouping_key`, `window_start`, `window_end`,
`threshold`, `observed_count`, `evidence_event_ids`, and `verdict_boundary`.
- Added sanitized golden `report.md`, `report.json`, `findings.csv`, and
`warnings.csv` regression fixtures to lock report contracts.
- Added a 150-line sanitized mixed auth corpus fixture covering Ubuntu /
Debian-style `auth.log`, RHEL-family `secure`-style syslog, unknown lines,
malformed source IPs, and blank-line handling.
- Added a reviewer-facing parser coverage JSON artifact for the mixed auth
corpus.

### Changed

- Made parser coverage telemetry and finding explainability part of the
release-facing review path instead of an internal-only implementation detail.

### Fixed

- None.

### Docs

- Added release notes for the v0.5 Evidence Explainability Release.
- Added a one-page incident-style case that traces raw SSH evidence through
normalized events and finding fields to a bounded conclusion.
- Added a rule-by-rule false-positive taxonomy for NAT, bastion, internal scanner,
lab replay, scheduled admin task, and shared-account contexts.
- Added forensic-style case-study coverage for Linux auth brute-force evidence
interpretation.
- Expanded the parser conformance matrix with explicit Ubuntu / Debian
`auth.log`, RHEL-family `secure`, `journalctl --output=short-full`, `sshd`,
`sudo`, `pam_unix`, `pam_faillock`, and `pam_sss` style coverage.
8 changes: 5 additions & 3 deletions README.md
@@ -9,13 +9,15 @@ It parses `auth.log` / `secure`-style syslog input and `journalctl --output=shor

## Example Finding

A compact finding summary is a bounded triage signal, not attribution:
A compact finding summary is a bounded triage signal, not a compromise verdict
or attribution:

```json
{
"rule_id": "brute_force",
"subject_kind": "source_ip",
"subject": "203.0.113.10",
"grouping_key": "source_ip",
"window_start": "2026-03-10 08:11:22",
"window_end": "2026-03-10 08:18:05",
"threshold": 5,
@@ -29,7 +31,7 @@ A compact finding summary is a bounded triage signal, not attribution:

LogLens is an MVP / early release. The repository is stable enough for public review, local experimentation, and extension, but the parser and detection coverage are intentionally narrow.

Reviewing the project quickly? Start with [`docs/reviewer-path.md`](./docs/reviewer-path.md), [`docs/reviewer-brief.md`](./docs/reviewer-brief.md), and the [`quality gates map`](./docs/quality-gates.md). For detection reasoning, follow the [`one-page incident-style case`](./docs/incident-style-case.md), then use the full [`Linux auth brute-force case study`](./docs/case-study-linux-auth-bruteforce.md), [`rule catalog`](./docs/rule-catalog.md), and [`false-positive taxonomy`](./docs/false-positive-taxonomy.md) for depth. For local scale expectations, see the [`performance envelope`](./docs/performance-envelope.md).
Reviewing the project quickly? Start with [`docs/reviewer-path.md`](./docs/reviewer-path.md), [`docs/reviewer-brief.md`](./docs/reviewer-brief.md), and the [`v0.5 Evidence Explainability release note`](./docs/release-v0.5.0.md). The [`quality gates map`](./docs/quality-gates.md) links claims to tests and fixtures. For detection reasoning, follow the [`one-page incident-style case`](./docs/incident-style-case.md), then use the full [`Linux auth brute-force case study`](./docs/case-study-linux-auth-bruteforce.md), [`rule catalog`](./docs/rule-catalog.md), and [`false-positive taxonomy`](./docs/false-positive-taxonomy.md) for depth. For local scale expectations, see the [`performance envelope`](./docs/performance-envelope.md).

## Why This Project Exists

@@ -56,7 +58,7 @@ LogLens includes two minimal GitHub Actions workflows:
- `CI` builds and tests the project on `ubuntu-latest` and `windows-latest`
- `CodeQL` runs GitHub code scanning for C/C++ on pushes, pull requests, and a weekly schedule

Both workflows are intended to stay stable enough to require on pull requests to `main`. Regression coverage is backed by sanitized parser fixture matrices plus golden report-contract fixtures for `report.md`, `report.json`, and optional CSV outputs. Release-facing documentation is split across [`CHANGELOG.md`](./CHANGELOG.md), [`docs/release-process.md`](./docs/release-process.md), [`docs/release-v0.1.0.md`](./docs/release-v0.1.0.md), [`docs/release-v0.3.0.md`](./docs/release-v0.3.0.md), and the repository's GitHub release notes. The repository hardening note is in [`docs/repo-hardening.md`](./docs/repo-hardening.md), and vulnerability reporting guidance is in [`SECURITY.md`](./SECURITY.md).
Both workflows are intended to stay stable enough to require on pull requests to `main`. Regression coverage is backed by sanitized parser fixture matrices plus golden report-contract fixtures for `report.md`, `report.json`, and optional CSV outputs. Release-facing documentation is split across [`CHANGELOG.md`](./CHANGELOG.md), [`docs/release-process.md`](./docs/release-process.md), [`docs/release-v0.1.0.md`](./docs/release-v0.1.0.md), [`docs/release-v0.3.0.md`](./docs/release-v0.3.0.md), [`docs/release-v0.5.0.md`](./docs/release-v0.5.0.md), and the repository's GitHub release notes. The repository hardening note is in [`docs/repo-hardening.md`](./docs/repo-hardening.md), and vulnerability reporting guidance is in [`SECURITY.md`](./SECURITY.md).

## Threat Model

107 changes: 107 additions & 0 deletions docs/release-v0.5.0.md
@@ -0,0 +1,107 @@
# LogLens v0.5.0

LogLens v0.5.0 is the Evidence Explainability Release.

This release makes the path from raw authentication evidence to bounded triage
findings easier to review. The focus is not adding more rules; it is making
parser behavior, report contracts, evidence IDs, and non-claims visible enough
for reviewers to verify.

## Highlights

- Stabilized the JSON report contract as `loglens.report.v2` with
`schema_version` set to `2`.
- Added stable finding explainability fields so a finding can be traced back to
its rule context and source-line evidence.
- Added a sanitized 150-line mixed auth corpus and checked-in parser coverage
artifact for dirty syslog-style input.
- Added a false-positive taxonomy and forensic-style case-study documentation
for evidence interpretation.

## Stable JSON contract

`report.json` now identifies the report artifact contract with:

- `schema`: `loglens.report.v2`
- `schema_version`: `2`

Finding objects expose a stable explainability surface:

- `rule_id`
- `subject_kind`
- `subject`
- `grouping_key`
- `window_start`
- `window_end`
- `threshold`
- `observed_count`
- `evidence_event_ids`
- `verdict_boundary`

`evidence_event_ids` are deterministic local IDs such as `line:1`. They help a
reviewer trace the selected rule window back to source log lines without
claiming global event identity.
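
As a rough illustration of that tracing step, the sketch below maps `line:<number>` IDs back to lines of the original input. It assumes the number is a 1-based index into the source log file; the helper name `resolve_evidence_lines` and the sample log lines are hypothetical and not part of LogLens.

```python
def resolve_evidence_lines(evidence_event_ids, log_lines):
    """Map IDs such as "line:3" to (line_number, text) pairs.

    Assumption: the numeric part is a 1-based line number in the
    original input file. This helper is illustrative only.
    """
    resolved = []
    for event_id in evidence_event_ids:
        prefix, _, number = event_id.partition(":")
        if prefix != "line" or not (number.isascii() and number.isdigit()):
            raise ValueError(f"unexpected evidence ID format: {event_id!r}")
        line_number = int(number)
        if not 1 <= line_number <= len(log_lines):
            raise ValueError(f"evidence ID out of range: {event_id!r}")
        resolved.append((line_number, log_lines[line_number - 1]))
    return resolved


# Hypothetical input lines, invented for this example only.
sample_lines = [
    "sshd[101]: Failed password for root from 203.0.113.10 port 4242 ssh2",
    "cron[202]: unrelated line",
    "sshd[103]: Failed password for admin from 203.0.113.10 port 4243 ssh2",
]
for number, text in resolve_evidence_lines(["line:1", "line:3"], sample_lines):
    print(f"{number}: {text}")
```

Because the IDs are local to one input file, the lookup only makes sense against the same file that produced the report.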

`verdict_boundary` is a machine-readable non-claim token. It keeps report output
aligned with LogLens's triage scope instead of letting a finding read like an
incident conclusion.

The contract is backed by golden fixtures for `report.md`, `report.json`,
`findings.csv`, and `warnings.csv` in
[`tests/fixtures/report_contracts`](../tests/fixtures/report_contracts). Parser
or rule changes that alter those artifacts must update the snapshots
explicitly.

## Parser observability artifacts

This release adds two reviewer-facing mixed-input artifacts:

- [`assets/mixed_auth_corpus.log`](../assets/mixed_auth_corpus.log)
- [`assets/mixed_auth_parser_coverage.json`](../assets/mixed_auth_parser_coverage.json)

The corpus is sanitized and intentionally mixed: Ubuntu / Debian-style
`auth.log`, RHEL-family `secure`-style syslog, unsupported lines, malformed
source IPs, and blank-line handling are represented together.

The parser coverage artifact lets reviewers inspect parser observability without
running the tool first. It exposes fields such as `total_input_lines`,
`parsed_lines`, `unparsed_lines`, `failure_categories`, and
`top_unknown_patterns`.
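
A reviewer who wants a quick number out of that artifact could summarize it along these lines. This is a sketch under assumptions: it treats `total_input_lines`, `parsed_lines`, and `unparsed_lines` as integer fields at the top level of the JSON object, which the release note does not spell out, and the function names are hypothetical.

```python
import json


def load_coverage(path):
    """Read a parser coverage JSON artifact from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def summarize_coverage(coverage):
    """Return a small summary of parser coverage counters.

    Assumption: the three counters are top-level integers. Lines that are
    neither parsed nor unparsed (for example blank lines, if they are
    counted separately) show up as "unaccounted_lines" instead of being
    treated as an error.
    """
    total = coverage["total_input_lines"]
    parsed = coverage["parsed_lines"]
    unparsed = coverage["unparsed_lines"]
    return {
        "total_input_lines": total,
        "parsed_ratio": parsed / total if total else 0.0,
        "unaccounted_lines": total - parsed - unparsed,
    }
```

For example, `summarize_coverage(load_coverage("assets/mixed_auth_parser_coverage.json"))` would give one ratio to compare across parser changes, provided the artifact has the assumed shape.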

## Evidence interpretation docs

The release-facing review path now includes:

- [`docs/parser-contract.md`](./parser-contract.md) for supported inputs,
normalized event families, parser warning categories, and detection signal
boundaries.
- [`docs/report-artifacts.md`](./report-artifacts.md) for JSON, Markdown, and
CSV artifact contracts.
- [`docs/false-positive-taxonomy.md`](./false-positive-taxonomy.md) for benign
or ambiguous contexts such as NAT, bastion, internal scanner, lab replay,
scheduled admin task, and shared-account behavior.
- [`docs/incident-style-case.md`](./incident-style-case.md) for a compact trace
from raw log lines to normalized events, finding fields, and bounded
conclusion.
- [`docs/case-study-linux-auth-bruteforce.md`](./case-study-linux-auth-bruteforce.md)
for a longer forensic-style evidence explanation.

## Non-claims

LogLens findings remain bounded triage signals. This release keeps the following non-claims:

- no compromise verdict
- no attribution
- no blocking recommendation
- no cross-host correlation

In practical terms, a finding can show that supported evidence met a configured
rule threshold. It does not decide whether a host was compromised, who operated
the source, whether an address should be blocked, or whether activity across
hosts is related.

## Upgrade notes

No CLI migration is required for local users. Downstream consumers of
`report.json` should key off `schema` and `schema_version`, and should update
their snapshots if they depend on finding object shape.
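
A downstream consumer might gate on the contract roughly as follows. This is a minimal sketch, not LogLens code: it assumes `schema` and `schema_version` sit at the top level of `report.json`, and it tolerates `schema_version` being either the integer `2` or the string `"2"` because the release note does not state the JSON type. The function names are hypothetical.

```python
import json

EXPECTED_SCHEMA = "loglens.report.v2"
EXPECTED_SCHEMA_VERSION = "2"  # compared as text; the JSON type is an assumption


def check_report_contract(report):
    """Raise ValueError unless the report declares the expected contract."""
    schema = report.get("schema")
    version = report.get("schema_version")
    if schema != EXPECTED_SCHEMA or str(version) != EXPECTED_SCHEMA_VERSION:
        raise ValueError(
            f"unsupported report contract: schema={schema!r}, "
            f"schema_version={version!r}"
        )
    return report


def load_report(path):
    """Load report.json and verify its contract before any other use."""
    with open(path, encoding="utf-8") as handle:
        return check_report_contract(json.load(handle))
```

Failing fast on an unknown contract keeps a consumer from silently misreading a future report shape.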
19 changes: 18 additions & 1 deletion docs/report-artifacts.md
@@ -42,6 +42,23 @@ The JSON report keeps parser observability visible next to findings:

Finding objects contain `rule_id`, `rule`, `subject_kind`, `subject`, `grouping_key`, `threshold`, `observed_count`, `event_count`, `window_start`, `window_end`, `evidence_event_ids`, `verdict_boundary`, `usernames`, and `summary`.

The stable finding explainability surface for `loglens.report.v2` is:

- `rule_id`
- `subject_kind`
- `subject`
- `grouping_key`
- `window_start`
- `window_end`
- `threshold`
- `observed_count`
- `evidence_event_ids`
- `verdict_boundary`

These fields are release-facing contract fields. Parser or rule changes that
alter their names, meanings, values, or presence must update the golden report
fixtures explicitly.
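
One way a consumer could check for that surface is sketched below. It only tests field presence, not meanings or values, and the helper name is hypothetical. The sample finding reuses the fields visible in the README example and adds nothing beyond them, so it is deliberately incomplete.

```python
STABLE_FINDING_FIELDS = (
    "rule_id",
    "subject_kind",
    "subject",
    "grouping_key",
    "window_start",
    "window_end",
    "threshold",
    "observed_count",
    "evidence_event_ids",
    "verdict_boundary",
)


def missing_stable_fields(finding):
    """Return the stable explainability fields absent from a finding dict."""
    return [name for name in STABLE_FINDING_FIELDS if name not in finding]


# Partial finding: only the fields shown in the README excerpt.
partial_finding = {
    "rule_id": "brute_force",
    "subject_kind": "source_ip",
    "subject": "203.0.113.10",
    "grouping_key": "source_ip",
    "window_start": "2026-03-10 08:11:22",
    "window_end": "2026-03-10 08:18:05",
    "threshold": 5,
}
print(missing_stable_fields(partial_finding))
```

A presence check like this complements the golden fixtures; it does not replace them, since the fixtures also pin values and layout.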

`evidence_event_ids` are deterministic local event identifiers derived from the source line number, formatted as `line:<number>`. They let reviewers trace a finding back to the normalized input events that satisfied the rule window without implying global event identity.

`verdict_boundary` is a stable token that states what the finding must not be
@@ -90,7 +107,7 @@ The report contracts are backed by generated fixture artifacts:
| [`multi_host_syslog_legacy`](../tests/fixtures/report_contracts/multi_host_syslog_legacy) | `report.md`, `report.json`, `findings.csv`, `warnings.csv` |
| [`multi_host_journalctl_short_full`](../tests/fixtures/report_contracts/multi_host_journalctl_short_full) | `report.md`, `report.json`, `findings.csv`, `warnings.csv` |

The enforcement lives in [`tests/test_report_contracts.cpp`](../tests/test_report_contracts.cpp). Parser or rule changes that alter report artifacts must update these snapshots explicitly. The focused report writer tests live in [`tests/test_report.cpp`](../tests/test_report.cpp).
The enforcement lives in [`tests/test_report_contracts.cpp`](../tests/test_report_contracts.cpp). Parser or rule changes that alter report artifacts must update these snapshots explicitly. This includes changes to stable finding explainability fields, parser coverage fields, warning categories, CSV columns, or Markdown report layout. The focused report writer tests live in [`tests/test_report.cpp`](../tests/test_report.cpp).

## Boundaries

21 changes: 21 additions & 0 deletions docs/reviewer-path.md
@@ -7,6 +7,7 @@ This path is for reviewers who want to understand LogLens quickly without readin
| Review question | Start here | Good stopping point |
| --- | --- | --- |
| What is LogLens? | [`README.md`](../README.md) and [`docs/reviewer-brief.md`](./reviewer-brief.md) | Can state scope, supported inputs, outputs, and non-goals |
| What changed in v0.5? | [`docs/release-v0.5.0.md`](./release-v0.5.0.md) | Can explain the Evidence Explainability Release theme and its non-claims |
| What log formats are supported? | [`docs/parser-contract.md`](./parser-contract.md) | Can name `syslog_legacy` and `journalctl_short_full` behavior |
| What artifacts does it produce? | [`docs/report-artifacts.md`](./report-artifacts.md) and report-contract fixtures | Can inspect Markdown, JSON, and optional CSV outputs |
| How do rules use evidence? | [`docs/rule-catalog.md`](./rule-catalog.md) | Can explain grouping keys, windows, thresholds, and unsupported-evidence boundaries |
@@ -23,6 +24,7 @@ Read:

- [`README.md`](../README.md)
- [`docs/reviewer-brief.md`](./reviewer-brief.md)
- [`docs/release-v0.5.0.md`](./release-v0.5.0.md)

Confirm:

@@ -35,6 +37,23 @@ Core review lens:

> Parser observability > silent detection claims.

## v0.5 release-facing route

Start with [`docs/release-v0.5.0.md`](./release-v0.5.0.md), then inspect:

- [`docs/parser-contract.md`](./parser-contract.md)
- [`docs/report-artifacts.md`](./report-artifacts.md)
- [`tests/fixtures/report_contracts/syslog_legacy/report.json`](../tests/fixtures/report_contracts/syslog_legacy/report.json)
- [`assets/mixed_auth_corpus.log`](../assets/mixed_auth_corpus.log)
- [`assets/mixed_auth_parser_coverage.json`](../assets/mixed_auth_parser_coverage.json)
- [`docs/false-positive-taxonomy.md`](./false-positive-taxonomy.md)
- [`docs/case-study-linux-auth-bruteforce.md`](./case-study-linux-auth-bruteforce.md)

Good stopping point: the reviewer can name the stable finding explainability
fields, explain how parser coverage remains visible for unknown lines, and state
that findings are bounded triage signals with no compromise verdict,
attribution, blocking recommendation, or cross-host correlation claim.

## 5-minute artifact review

Inspect:
@@ -43,8 +62,10 @@ Inspect:
- [`assets/sample_journalctl_short_full.log`](../assets/sample_journalctl_short_full.log)
- [`tests/fixtures/report_contracts/syslog_legacy/report.md`](../tests/fixtures/report_contracts/syslog_legacy/report.md)
- [`tests/fixtures/report_contracts/syslog_legacy/report.json`](../tests/fixtures/report_contracts/syslog_legacy/report.json)
- [`docs/release-v0.5.0.md`](./release-v0.5.0.md)
- [`docs/report-artifacts.md`](./report-artifacts.md)
- [`docs/parser-contract.md`](./parser-contract.md)
- [`assets/mixed_auth_corpus.log`](../assets/mixed_auth_corpus.log)
- [`assets/mixed_auth_parser_coverage.json`](../assets/mixed_auth_parser_coverage.json)
- [`docs/quality-gates.md`](./quality-gates.md)
- [`docs/incident-style-case.md`](./incident-style-case.md)