diff --git a/docs/dev/adrs/suggestions/dataset-driven-fit-modes.md b/docs/dev/adrs/accepted/dataset-driven-fit-modes.md similarity index 62% rename from docs/dev/adrs/suggestions/dataset-driven-fit-modes.md rename to docs/dev/adrs/accepted/dataset-driven-fit-modes.md index 990ca85a5..561b93618 100644 --- a/docs/dev/adrs/suggestions/dataset-driven-fit-modes.md +++ b/docs/dev/adrs/accepted/dataset-driven-fit-modes.md @@ -2,7 +2,7 @@ ## Status -Proposed. +Accepted. ## Date @@ -16,9 +16,9 @@ Analysis and fitting. The analysis layer offers three fit modes through the `fitting_mode` switchable category established by -[`fit-mode-categories`](accepted/fit-mode-categories.md): `single`, -`joint`, and `sequential`. Two problems make the current surface -confusing and partly incorrect. +[`fit-mode-categories`](fit-mode-categories.md): `single`, `joint`, and +`sequential`. Two problems make the current surface confusing and partly +incorrect. **`single` is overloaded.** It is the friendly name for the one-dataset case, but it _also_ silently loops over multiple loaded experiments, @@ -53,7 +53,7 @@ one-dataset case — which removes the buggy multi-loop and closes issue 85 — (3) keeps `sequential` as the folder sweep it already is, and (4) tidies the `sequential` data-source configuration (sensible defaults, an optional copy-into-project flag, and room for a future remote source). -It extends [`fit-mode-categories`](accepted/fit-mode-categories.md). +It extends [`fit-mode-categories`](fit-mode-categories.md). An earlier draft of this ADR proposed redefining `sequential` to fit the loaded datasets in turn; that direction was dropped (see Alternatives @@ -64,26 +64,47 @@ behaviour and solving issue 85 by restricting `single`. ### 1. Mode availability is precondition-based, not a static list -Each fit mode declares a **precondition predicate** — "can I run on this -project right now?" — and `fitting_mode.show_supported()` lists exactly -the modes whose preconditions the current project satisfies. This is -wired through the existing switchable-category selector +Two distinct concepts are separated explicitly so that offering a mode +and being able to run it do not collapse into one rule: + +- **Applicability** — "could this mode apply to the project as loaded?" + This drives `fitting_mode.show_supported()`. +- **Readiness** — "is this mode fully configured to run right now?" This + is checked only at `fit()` time and produces the Decision 6 errors. + +`fitting_mode.show_supported()` lists exactly the modes whose +**applicability** predicate the current project satisfies. This is wired +through the existing switchable-category selector (`FittingMode._supported_types(filters)`, which today ignores its `filters`); it now consumes project state. No new owner-level setter is added — the category-owned-selector contract from -[`switchable-category-owned-selectors`](accepted/switchable-category-owned-selectors.md) +[`switchable-category-owned-selectors`](switchable-category-owned-selectors.md) is preserved. -Preconditions: - -- `single` → exactly one experiment with measured data. -- `joint` → two or more experiments with measured data. -- `sequential` → exactly one experiment with measured data (the - template) plus a resolvable data source (checked fully at fit time; - see Decision 4). - -The availability table is a **consequence** of these predicates, not a -hard-coded rule: +**Applicability** predicates (drive `show_supported()`) are by **total +loaded-experiment count**: + +- `single` → exactly one loaded experiment. +- `joint` → two or more loaded experiments. +- `sequential` → exactly one loaded experiment (the template). It + deliberately does **not** require a configured data source, so + `sequential` is offered as soon as one dataset is loaded — preserving + the intended workflow of switching to it and _then_ pointing it at a + folder. + +**Readiness** (checked at `fit()` time, see Decisions 4 and 6) covers +everything beyond the count: each scheduled experiment must have +measured data (enforced by the existing `Fitter._require_measured_data` +guard — a calculated-only experiment yields a clear fit-time error, not +a hidden mode), and `sequential` additionally needs a resolvable +`data_dir` that matches at least one file. An unconfigured or empty +source is a clear fit-time error, **not** a reason to hide the mode. +Counting _loaded_ (not _measured_) experiments for applicability keeps +`show_supported()` and `fit()` consistent for mixed measured/calculated +projects without any "schedule only the measured subset" filtering. + +The availability table is a **consequence** of the applicability +predicates, not a hard-coded rule: | Experiments loaded | Available modes | | ------------------ | ---------------------- | @@ -91,12 +112,14 @@ hard-coded rule: | 1 | `single`, `sequential` | | ≥ 2 | `joint` | -Predicate-based detection is preferred over a central `if count >= 2` -switch because it is **honest** (it can also reflect, e.g., an -experiment with no measured data, not just a count), **extensible** (a -future remote data source simply becomes another way `sequential`'s -"resolvable data source" precondition is met — see Deferred Work), and -keeps the selector contract clean. +Per-mode predicates are preferred over a central `if count >= 2` switch +because they keep each mode's rule next to the mode and are +**extensible** (a future remote data source becomes another way +`sequential`'s readiness is satisfied — see Deferred Work) without +touching a shared branch. Measured-data presence is deliberately **not** +an applicability input — it is a fit-time readiness check — so +`show_supported()` never hides a mode because an experiment lacks +measured data. ### 2. Restrict `single` to exactly one loaded experiment @@ -118,17 +141,53 @@ sweep itself. The `sequential_fit` category keeps `data_dir`, `file_pattern`, `max_workers`, `chunk_size`, and `reverse`, with these refinements: -- **`file_pattern` default derived from the template.** Default the glob - to the loaded template experiment's own data-file extension (load - `.xye` → default `*.xye`); fall back to `*` only when the extension is - unknown. Zero-config for the common case. +- **`file_pattern` keeps its `'*'` default (first step).** Deriving the + glob from the loaded template experiment's data-file extension is + **deferred** (see Deferred Work): the experiment model does not retain + its source data-file path today, so the extension is unavailable + without new source-path metadata. The shipped first-step default is + therefore `'*'` (the explicit fallback); the derived default is a + tracked follow-up. - **No smart default for `data_dir`.** It stays unset by default; a silent auto-pickup of files from a guessed folder would be surprising. + An unset `data_dir` is a clear fit-time error (Decision 6). - **`copy_data` (new boolean, default `False`).** When `False` (default), matched files are referenced in place; when `True`, the matched files are copied into the project so it is self-contained. Default-off avoids surprising large copies for thousand-file series, - while letting users opt into a portable, archived project. + while letting users opt into a portable, archived project. To avoid + shipping a decided field with an undefined contract, the **minimal + first-step behaviour is fully specified here**: + - **Timing.** The copy happens at `fit()` time, during `sequential` + readiness resolution, _before_ the sweep begins — not at config time + (so it always reflects the `data_dir`/`file_pattern` in effect for + that run). + - **Destination.** A fixed project-relative folder + (`/data/sequential/`); the run then reads its inputs from + there. The destination is derived, not separately configurable. + - **Conflict policy.** Idempotent overwrite: a destination file of the + same name is overwritten so the in-project copy always matches the + current source. (Users with very large series leave `copy_data` + off.) + - **Round-trip / portability contract.** Once a copy succeeds, the + persisted `data_dir` is **rewritten to the project-relative copy + destination** (`data/sequential/`). The saved project is therefore + self-contained: on reload — even moved or shared, with the original + external source gone — `sequential` readiness resolves against the + in-project copy. `file_pattern` persists as set (the copied files + keep their names, so it still matches). The copy is **idempotent**: + when the resolved source directory is already the copy destination + (the post-reload case, where `data_dir` already points at + `data/sequential/`), the copy is skipped and the run uses the + archived files. Re-running `fit()` against a fresh external + `data_dir` re-copies and refreshes the archive. + - **What is serialized.** The `sequential_fit` fields — including + `copy_data` and the (possibly rewritten) `data_dir` / `file_pattern` + — are written to CIF as for any category. When `copy_data=False`, + `data_dir` persists exactly as the user set it (reference in place); + when `copy_data=True`, it persists as the in-project destination per + the round-trip contract above. The copied data files themselves are + project artifacts, not CIF content. ### 5. Close issue 85 by removing `single`-with-N @@ -235,11 +294,6 @@ safer and more discoverable. ## Open Questions -- **`copy_data` mechanics.** When the copy happens (at config time vs at - fit time), what is stored after a copy (the in-project path vs the - original reference), and the overwrite/dedup policy. The field and its - default-off behaviour are decided here; the copy mechanism may be a - small follow-up. - **Resume.** Resume is currently "single mode only" (`_validate_fit_request`). Confirm `single` (one dataset) keeps resume as today; any per-point resume for `sequential` is out of scope. @@ -254,6 +308,14 @@ safer and more discoverable. satisfy the `sequential` "resolvable data source" precondition, reusing the same mode and `results.csv` evolution output without reopening the mode design. -- The `copy_data` copy mechanism, if not implemented in the first step. +- Advanced `copy_data` policies beyond the first-step contract in + Decision 4 (e.g. content-hash dedup, incremental sync, a configurable + destination) — the default-off flag and minimal overwrite contract + ship in the first step. +- **Template-derived `file_pattern` default.** Deferred from Decision 4: + retain the template experiment's source data-file path as new + experiment metadata (with persistence and tests), then default the + glob to its extension (`.xye` → `*.xye`). The first step ships the + `'*'` default. - Detailed result-file/export layout remains governed by - [`fit-output-files-and-data-exports`](suggestions/fit-output-files-and-data-exports.md). + [`fit-output-files-and-data-exports`](../suggestions/fit-output-files-and-data-exports.md). diff --git a/docs/dev/adrs/accepted/type-neutral-adp-parameters.md b/docs/dev/adrs/accepted/type-neutral-adp-parameters.md index 867be53aa..99b8f2a5c 100644 --- a/docs/dev/adrs/accepted/type-neutral-adp-parameters.md +++ b/docs/dev/adrs/accepted/type-neutral-adp-parameters.md @@ -83,5 +83,3 @@ from `beta`. Implications specific to β: (`U_ij = β_ij/(2π²·a*_i·a*_j)`) during structure-CIF generation, then β is restored. The round-trip is mathematically exact, so the net behaviour is β-in/β-out. - -Plan: [`adp-beta-tensor.md`](../../plans/adp-beta-tensor.md). diff --git a/docs/dev/adrs/index.md b/docs/dev/adrs/index.md index c150705c0..98c589831 100644 --- a/docs/dev/adrs/index.md +++ b/docs/dev/adrs/index.md @@ -20,7 +20,7 @@ folders. | Analysis and fitting | Accepted | Analysis CIF Fit State | Defines the persisted fit-state projection in `analysis/analysis.cif` and `analysis/mcmc.h5`. | [`analysis-cif-fit-state.md`](accepted/analysis-cif-fit-state.md) | | Analysis and fitting | Accepted | Parameter Correlation Persistence | Persists deterministic and posterior correlation summaries in `_fit_parameter_correlation` | [`parameter-correlation-persistence.md`](accepted/parameter-correlation-persistence.md) | | Analysis and fitting | Suggestion | Fit Output Files and Data Exports | Narrows remaining archive/export questions after adopting `results.csv` and `mcmc.h5`. | [`fit-output-files-and-data-exports.md`](suggestions/fit-output-files-and-data-exports.md) | -| Analysis and fitting | Suggestion | Dataset-Driven Fit Mode Availability | Offers fit modes by per-mode preconditions, restricts `single` to one dataset (closing issue 85), and keeps `sequential` as the folder sweep. | [`dataset-driven-fit-modes.md`](suggestions/dataset-driven-fit-modes.md) | +| Analysis and fitting | Accepted | Dataset-Driven Fit Mode Availability | Offers fit modes by per-mode preconditions, restricts `single` to one dataset (closing issue 85), and keeps `sequential` as the folder sweep. | [`dataset-driven-fit-modes.md`](accepted/dataset-driven-fit-modes.md) | | Analysis and fitting | Accepted | Minimizer Category Consolidation | Collapses the seven Bayesian categories into one owner-level switchable `minimizer` category with HDF5 sidecar. | [`minimizer-category-consolidation.md`](accepted/minimizer-category-consolidation.md) | | Analysis and fitting | Accepted | Minimizer Input/Output Split | Keeps `analysis.minimizer` input-only and moves scalar fit outputs to paired `analysis.fit_result` classes. | [`minimizer-input-output-split.md`](accepted/minimizer-input-output-split.md) | | Analysis and fitting | Superseded | Parameter-Level Posterior Projection | Superseded by minimizer-category consolidation; kept as historical context for `parameter.posterior`. | [`parameter-posterior-summary.md`](suggestions/parameter-posterior-summary.md) | @@ -55,6 +55,7 @@ folders. | Quality | Accepted | Lint Rule Scope and Test-File Exceptions | Records the standing tests/\*\* PLR/N812 ignores and CIF-aligned `id`/`type` builtin exception from the lint audit. | [`lint-rule-exceptions.md`](accepted/lint-rule-exceptions.md) | | Quality | Accepted | Test Strategy | Defines layered unit, functional, integration, script, and notebook testing. | [`test-strategy.md`](accepted/test-strategy.md) | | Quality | Accepted | Test Suite and Validation Strategy | Strict test layers, cost tiers, coverage/codecov policy, cross-engine verification docs, and a nightly validation harness. | [`test-suite-and-validation.md`](accepted/test-suite-and-validation.md) | +| Quality | Suggestion | Notebook-Owned Verification Regression Gating | Replaces the external `ci_skip.txt` list with a single in-notebook `regression=False` flag (plus a cell tag for pre-flag crashes). | [`verification-regression-flag.md`](suggestions/verification-regression-flag.md) | | Structure model | Accepted | Type-Neutral ADP Parameters | Keeps ADP parameter object identities stable across B/U and iso/ani switches. | [`type-neutral-adp-parameters.md`](accepted/type-neutral-adp-parameters.md) | | Structure model | Accepted | Automatic Wyckoff Position Detection | Detects Wyckoff letter, multiplicity, and site symmetry from space group and coordinates; calculators consume them. | [`wyckoff-letter-detection.md`](accepted/wyckoff-letter-detection.md) | | Structure model | Accepted | Complete Space-Group Reference Database | One-time build of a complete space_groups.json.gz (all 230 groups) from cctbx, verified against multiple sources. | [`space-group-database.md`](accepted/space-group-database.md) | diff --git a/docs/dev/adrs/suggestions/dataset-driven-fit-modes_reply-1.md b/docs/dev/adrs/suggestions/dataset-driven-fit-modes_reply-1.md deleted file mode 100644 index af5464cab..000000000 --- a/docs/dev/adrs/suggestions/dataset-driven-fit-modes_reply-1.md +++ /dev/null @@ -1,80 +0,0 @@ -# Reply 1: Dataset-Driven Fit Modes and Sequential Redefinition - -## P1 — Resolve the `sequential_fit` category before redefining `sequential` - -**Verdict:** Agree. - -The redefinition and the deferred folder sweep genuinely cannot share -the `sequential_fit` / `sequential_fit_extract` surface — and the two -input models cannot live under one count-gated mode, since the folder -sweep needs _exactly one_ template while loaded-dataset `sequential` -needs _≥2_ (opposite preconditions). The ADR now commits to a first-step -contract instead of leaving the surface ambiguous. - -**Action taken:** - -- Added **Decision 5a — "The folder-of-files sweep is parked for the - first step"**: loaded-dataset `sequential` uses neither folder - category; the folder-sweep path and its categories are retained in - code but parked (not selectable, hidden from display, omitted from CIF - so stale folder settings cannot survive under a mode that no longer - reads them). Restoration is the deferred input-source ADR. -- Flagged the parking as a **temporary removal of a tested workflow - requiring owner sign-off** (per §Change Discipline), with the - no-removal alternative spelled out. -- Added **Alternatives Considered — "Split the folder sweep into a - `scan` mode now"** as the documented fallback (keeps the feature - continuously available at the cost of adding `scan` to the one-dataset - availability row). -- Rewrote the **Open Questions** folder-sweep item to separate the now- - settled first-step contract from the still-open long-term home, and - updated the **Trade-offs** bullet accordingly. - -Affected sections: `## Decision` (5a), `## Alternatives Considered`, -`## Open Questions` (long-term home of the folder sweep), -`## Consequences` (Trade-offs). - -## P1 — Define replay semantics so plotting does not corrupt live state - -**Verdict:** Agree. - -Applying a stored per-point parameter set to the single shared structure -is itself a mutation, so the issue-85 fix needed an explicit rule to -avoid replacing one correctness bug with another (plotting point A -changing point B / `save` / `undo`). - -**Action taken:** - -- Added **Decision 4a — "Replay contract"**: the live model is - authoritative and never perturbed by viewing; per-point recomputation - is a scoped, self-restoring apply (captures live values, computes, - restores on exit including on error, internal-only); plotting prefers - reading persisted/cached per-point calculated arrays over recomputing; - and `save` / `undo` operate on the live model only, independent of - whichever point was last plotted. - -Affected section: `## Decision` (4a). - -## P1 — Specify what makes loaded experiments a valid sequential series - -**Verdict:** Agree. - -"Two loaded experiments" is not the same as "a fittable series," and the -preconditions belong in the decision rather than the plan. - -**Action taken:** - -- Extended **Decision 3** to make the carry-forward scope precise - (shared structure parameters carry forward; per-experiment parameters - are fit independently per point; deterministic project-order with an - opt-in reverse). -- Added **Decision 3a — "Preconditions that make loaded experiments a - valid series"**: `show_supported()` is gated by experiment **count - only** (predictable, explainable); the richer rules — measured data on - every experiment, a shared structure model across the series, each - experiment individually fittable (valid calculator + non-empty free - set), stable order — are enforced at **fit time** with errors that - name the offending experiment and rule. Experiment/calculator types - may differ across points. - -Affected sections: `## Decision` (3, 3a). diff --git a/docs/dev/adrs/suggestions/dataset-driven-fit-modes_reply-2.md b/docs/dev/adrs/suggestions/dataset-driven-fit-modes_reply-2.md deleted file mode 100644 index a8baeb66d..000000000 --- a/docs/dev/adrs/suggestions/dataset-driven-fit-modes_reply-2.md +++ /dev/null @@ -1,25 +0,0 @@ -# Reply 2: Dataset-Driven Fit Modes and Sequential Redefinition - -## P1 — Give loaded-dataset `reverse` a public home or defer it - -**Verdict:** Agree. - -The Reply 1 edits introduced an "opt-in reverse" for loaded-dataset -`sequential` while Decision 5a simultaneously parked `sequential_fit` -(where `reverse` lives) — hidden and omitted from CIF. That left the -reverse option with no public or persisted home, and the ADR adds no -owner-level setter, so the plan would have had to invent unapproved API -surface. Inventing a loaded-series config category just to host one -ordering flag is not worth it in the first step. - -**Action taken:** - -- Edited **Decision 3** to drop configurable reverse/custom ordering - from the first step. The series order is now simply the deterministic - project collection order; the ADR explicitly states that the existing - `reverse` flag lives on the parked `sequential_fit` category, that no - new owner-level setter or category is added for it, and that ordering - control is **deferred to the input-source follow-up**. Until then, - users order the series by the order in which they add experiments. - -Affected section: `## Decision` (3). diff --git a/docs/dev/adrs/suggestions/dataset-driven-fit-modes_review-1.md b/docs/dev/adrs/suggestions/dataset-driven-fit-modes_review-1.md deleted file mode 100644 index a1d1635e5..000000000 --- a/docs/dev/adrs/suggestions/dataset-driven-fit-modes_review-1.md +++ /dev/null @@ -1,78 +0,0 @@ -# Review 1: Dataset-Driven Fit Modes and Sequential Redefinition - -## Findings - -### P1. Resolve the `sequential_fit` category before redefining `sequential` - -The ADR redefines `sequential` to fit the already loaded experiments in -turn, but it also keeps `sequential_fit` and `sequential_fit_extract` as -mode-specific categories that remain visible and serializable when -sequential mode is relevant (`dataset-driven-fit-modes.md` lines -124-133). Those categories are not neutral today: the accepted fit-mode -ADR and persistence mapping define them around the folder sweep -(`data_dir`, `file_pattern`, `max_workers`, `chunk_size`, `reverse`, and -extraction rules), and the current code consumes those fields directly -when running sequential fits. At the same time, this ADR defers the -folder-sweep fate out of the first implementation -(`dataset-driven-fit-modes.md` lines 268-270). - -That leaves the first implementation with two incompatible meanings for -the same public and persisted surface. If `sequential_fit` remains -visible for loaded-dataset sequential fits, users will see folder-input -fields that are irrelevant to the loaded experiments. If it is omitted, -Decision 5 is wrong. If it is serialized, old folder settings can -survive under a mode that no longer reads them. Please make the ADR -choose the first-step contract explicitly: either retire or hide the -folder-specific categories until a later input-source design, split the -folder sweep into a separate mode/workflow now, or define -`sequential_fit` as an input-source category whose loaded-dataset case -has a clear no-folder representation. - -### P1. Define replay semantics so plotting does not corrupt live state - -Decision 4 says issue 85 is fixed by applying an experiment's stored -parameter set to the shared structure before recomputing or plotting -that experiment (`dataset-driven-fit-modes.md` lines 114-122). Because -the project has one live structure object shared by all loaded -experiments, applying an earlier point's fitted parameters is itself a -state mutation. The ADR does not say whether that mutation is temporary, -whether the previous live state is restored after plotting, whether the -"current" project model becomes whichever point was plotted last, or how -this interacts with save and undo. - -Without that rule, the proposed fix can replace issue 85 with a -different correctness bug: plotting experiment A can silently change the -structure values used for experiment B, subsequent calculations, and the -saved project. Please specify the replay contract in the ADR. For -example, require plot/calculation replay to apply stored per-dataset -parameters in a scoped temporary context that restores the pre-plot live -model, or choose a non-mutating storage strategy such as persisted -calculated arrays for each fitted dataset. - -### P1. Specify what makes loaded experiments a valid sequential series - -The availability table makes `sequential` available for any project with -at least two loaded experiments (`dataset-driven-fit-modes.md` lines -71-92), while fit-time validation only says "`sequential` with no -fittable series" should raise a clear error -(`dataset-driven-fit- modes.md` lines 143-152). The ADR never defines -the boundary between "two loaded experiments" and "a fittable series." -That matters because the scientific intent is "the same sample, fit per -point" (lines 29-30), but a project can contain multiple arbitrary -experiments, possibly with different experiment types, calculators, -measured-data state, structures, free-parameter sets, or intended -ordering. - -Please make the sequential preconditions part of the decision rather -than leaving them to the plan. At minimum, the ADR should say whether -loaded-dataset sequential requires exactly one structure, measured data -on every experiment, compatible experiment/calculator types, a stable -experiment order, and a consistent carried-forward free-parameter set. -Those rules also need to feed `show_supported()` or the fit-time error -message so scientists are not offered a workflow that cannot be -explained from the loaded project state. - -## Checks - -Static review only. Per `AGENTS.md`, I did not run tests, lint, -formatters, build commands, or any `pixi` command. diff --git a/docs/dev/adrs/suggestions/dataset-driven-fit-modes_review-2.md b/docs/dev/adrs/suggestions/dataset-driven-fit-modes_review-2.md deleted file mode 100644 index 4b30d09ca..000000000 --- a/docs/dev/adrs/suggestions/dataset-driven-fit-modes_review-2.md +++ /dev/null @@ -1,29 +0,0 @@ -# Review 2: Dataset-Driven Fit Modes and Sequential Redefinition - -## Findings - -### P1. Give loaded-dataset `reverse` a public home or defer it - -The revised ADR now says loaded-dataset `sequential` has deterministic -project collection order "with an opt-in reverse (mirroring the existing -`reverse` flag)" (`dataset-driven-fit-modes.md` lines 121-123). But -Decision 5a also says loaded-dataset `sequential` uses neither -`sequential_fit` nor `sequential_fit_extract`, and the folder sweep -categories are parked, hidden, and omitted from CIF -(`dataset-driven-fit-modes.md` lines 215-221). Today `reverse` lives on -`analysis.sequential_fit`, so after parking that category there is no -public or persisted surface left for this opt-in reverse setting. The -ADR also preserves the category-owned selector contract and does not add -new owner-level setters. - -Please either remove reverse ordering from the first-step loaded-dataset -contract, or define its new public/persisted home. For example, the ADR -could introduce a small loaded-series configuration category with only -settings that apply to loaded datasets, or explicitly defer reverse -until the folder/input-source follow-up. As written, the plan would have -to invent API surface that the ADR has not approved. - -## Checks - -Static review only. Per `AGENTS.md`, I did not run tests, lint, -formatters, build commands, or any `pixi` command. diff --git a/docs/dev/adrs/suggestions/dataset-driven-fit-modes_review-3.md b/docs/dev/adrs/suggestions/dataset-driven-fit-modes_review-3.md deleted file mode 100644 index cba70373d..000000000 --- a/docs/dev/adrs/suggestions/dataset-driven-fit-modes_review-3.md +++ /dev/null @@ -1,11 +0,0 @@ -# Review 3: Dataset-Driven Fit Modes and Sequential Redefinition - -**No findings. Ready to commit.** - -Review 2's remaining API-surface issue is addressed: loaded-dataset -`sequential` now uses deterministic project collection order only, and -reverse/custom ordering is explicitly deferred to the input-source -follow-up rather than relying on the parked `sequential_fit` category. - -Static review only. Per `AGENTS.md`, I did not run tests, lint, -formatters, build commands, or any `pixi` command. diff --git a/docs/dev/adrs/suggestions/verification-regression-flag.md b/docs/dev/adrs/suggestions/verification-regression-flag.md new file mode 100644 index 000000000..b20ea1ab8 --- /dev/null +++ b/docs/dev/adrs/suggestions/verification-regression-flag.md @@ -0,0 +1,262 @@ +# ADR: Notebook-Owned Verification Regression Gating + +## Status + +Proposed. + +## Date + +2026-06-17 + +## Group + +Quality. + +## Context + +The cross-engine **Verification** pages (established by +[`test-suite-and-validation`](accepted/test-suite-and-validation.md) §6, +and built as `.py` sources per +[`notebook-generation`](accepted/notebook-generation.md)) serve two +roles at once: each page is a **regression test** (its calculated +pattern is checked against a frozen FullProf reference) and a +**published HTML doc** (it renders a comparison and an agreement table +for scientists). + +Today that dual role is controlled by **two separate mechanisms**, and a +single page can need both: + +1. **`verify.assert_patterns_agree(..., raise_on_failure=…)`** — the + per-page agreement check. With the default `raise_on_failure=True` it + raises `AssertionError` when any metric is out of tolerance, which is + what makes the page a regression test. With `raise_on_failure=False` + it renders the table but does not raise. +2. **`docs/docs/verification/ci_skip.txt`** — an external list of + notebook stems, consumed by two runners: + - `docs/docs/conftest.py` marks listed notebooks **`xfail` + (`strict=False`)** for `notebook-tests`/nbmake — they still execute + and render, but a failure is tolerated; + - `tools/test_scripts.py` **`pytest.skip`s** them in `script-tests` + (the fast runner that executes each page's `.py` as a subprocess + and fails if it raises). + +This split is confusing: deciding "this page is known-bad, do not gate +regression on it, but still publish it" requires editing a flag inside +the notebook **and** adding the stem (with a `# reason`) to a separate +file. The reason a page is exempt lives in `ci_skip.txt`, invisible to +readers of the published page. The PbSO₄ Bérar–Baldinozzi page +(issue 166) is the current live example: it carries both +`raise_on_failure=False` **and** a `ci_skip.txt` entry. + +The project owner wants a **single source of truth, set at the end of +the notebook**, with this intent: + +> If a verification page is good, it is a regression test and a doc +> page. If I know its regression is bad because a feature is not +> implemented yet, I mark that page as "not regression-gated" but still +> keep it as an HTML doc. + +This ADR makes the notebook itself own that decision and removes +`ci_skip.txt`, within one boundary that is intrinsic to "decide from +inside the notebook". + +## Decision + +### 1. One in-notebook flag is the single source of truth + +The agreement check gains an intention-revealing, **self-documenting** +parameter that subsumes `raise_on_failure`: + +```python +verify.assert_patterns_agree( + [('cryspy vs FullProf', reference, candidate)], + regression=False, # known-bad: render, don't gate + reason='FCJ S_L/D_L not implemented in cryspy yet (issue 166)', +) +``` + +- `regression=True` (**default**) → assert on out-of-tolerance; the page + is a **regression test and a doc page**. +- `regression=False` → render the comparison table, **do not raise**, + and surface `reason` in the rendered output. `reason` is **required** + when `regression=False` so every exemption is explained on the page + itself. + +`raise_on_failure` is renamed to `regression` (its inverse). Beta, no +shims: existing pages are migrated, not aliased (see Compatibility). + +### 2. Delete `ci_skip.txt` and its two consumers + +For every page that **runs to completion** (the normal case — an +unimplemented feature means the engine ignores the unsupported parameter +and produces a different curve, it does not crash), the in-notebook flag +fully determines behaviour: + +- `notebook-tests`/nbmake: the page executes, the `regression=False` + check does not raise, the notebook **passes and renders** to HTML. +- `script-tests`: the `.py` subprocess completes without raising, so the + page **passes**. + +No external list is needed. `docs/docs/verification/ci_skip.txt`, the +`xfail` logic in `docs/docs/conftest.py`, and the `pytest.skip` block in +`tools/test_scripts.py` are **removed**. + +### 3. Boundary: pages that error _before_ the flag use a cell tag + +An end-of-notebook flag can only govern a notebook that **reaches the +end**. A page that raises mid-execution — e.g. one needing an +**unreleased** calculator API that errors rather than no-ops — never +reaches `assert_patterns_agree`, and `script-tests` fails on a raising +subprocess regardless of any flag. + +For those (temporary, "waiting on a dependency release") cases the +**failing cell is tagged `raises-exception`** in the source `.py` +(jupytext cell metadata, honoured by nbmake and reproduced into the +generated notebook). This keeps control **inside the notebook** — still +one place, still no external list — and is paired with a short markdown +note stating why. A page in this state is, by definition, not currently +runnable as a regression test; tagging documents that explicitly. + +### 4. `reason` replaces the `ci_skip.txt` comment as on-page provenance + +The `# reason` strings currently buried in `ci_skip.txt` move into the +`reason=` argument (and the `raises-exception` companion note), so the +explanation is **visible in the published page** and travels with the +code that sets it. + +## Consequences + +### Positive + +- **One place, in the notebook.** Whether a page gates regression is + decided where the page is authored, at the end, next to the data it + checks — matching the owner's mental model. +- **Provenance is public.** The exemption reason renders on the page + instead of hiding in a list file. +- **Approved pages are unchanged and still regression-checked** in both + runners (default `regression=True`); the dual "test + doc" role is + preserved. +- **All pages are still generated and rendered to HTML**, exactly as + today (this never depended on `ci_skip.txt`). +- Three coupled artifacts (`ci_skip.txt`, the conftest filter, the + script-test skip) collapse into one library parameter. + +### Trade-offs + +- **Known-bad pages now run in `script-tests`** instead of being + skipped, so the fast runner does the refinement fit for those pages + too and gets somewhat slower. Acceptable for the current page count; + if it ever matters, speed is a _separate_ concern from regression + intent and must not reintroduce a second semantic list (see Open + Questions). +- **Loss of the `xfail`/`xpass` signal.** Today a known-bad page is an + `xfail`; if it unexpectedly starts agreeing it shows as `xpass`. Under + this ADR a `regression=False` page simply passes, so "this known-bad + page now actually matches — time to re-enable the gate" is no longer + surfaced automatically. Mitigation in Open Questions (optional + `xpass`-style warning when a `regression=False` page is within + tolerance). +- The `raises-exception` boundary (Decision 3) is less automatic than a + list entry and must be applied per failing cell. + +### Compatibility + +- Beta, no shims. Every verification `.py` is migrated: + `raise_on_failure=False` → `regression=False, reason=…`; the implicit + default stays "regression-gated". `ci_skip.txt` entries become + `regression=False` flags (runnable pages) or `raises-exception` tags + (pre-flag crashes). +- `notebook-generation` is unaffected: sources remain the `.py` files; + the `raises-exception` tag is expressed in the `.py` and regenerated + into the notebook via `notebook-prepare`. +- Revises + [`test-suite-and-validation`](accepted/test-suite-and-validation.md) + §6 (the `script-tests` "skip via `ci_skip.txt`" detail) and the + `ci_skip.txt`/conftest wiring under + [`documentation-ci-build`](accepted/documentation-ci-build.md). Those + ADRs are updated when this is implemented. + +## Alternatives Considered + +### Keep both mechanisms (status quo) + +Two places to edit, reason invisible on the page. Rejected per the +owner's explicit request for a single in-notebook control. + +### Make `regression=False` (or `ci_skip`) auto-derived, keep the list + +Drive `xfail`/skip from the list only and drop the per-page flag. +Rejected: the decision then lives outside the notebook, the opposite of +what is wanted, and the reason stays hidden. + +### Notebook-level metadata instead of an end-of-notebook call + +Encode "not regression-gated" in notebook-level metadata read at pytest +**collection** time (so even a crashing page could be `xfail`ed without +a separate file). Rejected as the primary mechanism: it is not "at the +end of the notebook like `raise_on_failure`", it is invisible in the +rendered page, and it splits the control surface between a metadata key +and the agreement call. The narrower per-cell `raises-exception` tag +(Decision 3) is preferred for the rare crash case. + +### A `verify.show_patterns(...)` function distinct from + +`assert_patterns_agree(...)` + +Two functions — one that asserts, one that only renders — instead of one +function with a flag. Rejected: it still needs the `raises-exception` +boundary for crashes, doubles the surface, and a boolean on the existing +call is closer to the requested "like `raise_on_failure=False`" shape. + +## Open Questions + +- **Naming.** `regression=False` vs keeping `raise_on_failure` vs + `gate_regression=`/`expected=`. Proposed: `regression` with a required + `reason`. +- **Fast-runner speed.** Should `script-tests` still avoid the + refinement fit for `regression=False` pages? Any "skip for speed" must + be derivable from the in-notebook flag (e.g. the runner statically + detects `regression=False` in the `.py`) so it does **not** become a + second semantic list. Needs the page audit below. +- **Re-enable signal.** Should a `regression=False` page that is + actually within tolerance emit a visible warning (an `xpass` analogue) + so a fixed page gets re-gated? Proposed: yes, a non-failing warning. +- **Audit dependency.** How many current `ci_skip.txt` pages merely + disagree (Decision 2) versus crash before the flag (Decision 3)? The + implementation plan must classify each of the eight pages first. + +## Deferred Work + +- The page-by-page audit (disagree-cleanly vs crash) and the migration + of all verification `.py` files belong to the implementation plan, not + this ADR. +- Any static "skip slow `regression=False` pages in `script-tests`" + optimisation is deferred until the audit shows it is needed. + +### Adjacent ADR needed — show software/engine versions on verification pages + +Separate from regression gating, the verification pages should record +the **provenance** of every comparison: which versions of +EasyDiffraction, the calculator engine, and FullProf produced the +curves. This is only partly done today — some pages label the reference +via +`FULLPROF_LABEL = verify.fullprof_label(FULLPROF_PROJECT_DIR, FULLPROF_SUM_FILE)` +(the FullProf version parsed from the `.sum`), but **not all pages do**, +and the candidate is still a bare `edi-cryspy` / `edi-crysfml` with no +version. + +The intended labelling is, for example: + +- reference → `FullProf vX.YZ` (already via `fullprof_label`, applied + consistently); +- candidate → `edi vX.Y.Z (cryspy vX.Y.Z)`, and likewise + `edi vX.Y.Z (crysfml vX.Y.Z)`, instead of the current `edi-cryspy` / + `edi-crysfml`. + +This needs its **own ADR** (or an extension of the +verification-framework decision in +[`test-suite-and-validation`](accepted/test-suite-and-validation.md) +§6): a `verify` helper that builds the candidate label from the +EasyDiffraction package version and the **active calculator engine's** +version, applied on every page next to `fullprof_label`. Recorded here +only as a pointer; it is out of scope for this regression-gating ADR. diff --git a/docs/dev/issues/closed/retain-per-experiment-fitted-parameters-for-plotting.md b/docs/dev/issues/closed/retain-per-experiment-fitted-parameters-for-plotting.md new file mode 100644 index 000000000..d97dcdc32 --- /dev/null +++ b/docs/dev/issues/closed/retain-per-experiment-fitted-parameters-for-plotting.md @@ -0,0 +1,27 @@ +# 85. Retain Per-Experiment Fitted Parameters for Plotting + +**Type:** Correctness / UX + +**Status:** Closed. + +In `single` fit mode, only the last experiment's fit results were +retained because the shared structure parameters were overwritten on +each iteration of the multi-experiment loop (`single`-with-N), so +earlier experiments could not be plotted correctly after fitting. + +**Resolution:** the bug is removed by construction rather than patched +with per-experiment snapshots. Under the +[`dataset-driven-fit-modes`](../../adrs/accepted/dataset-driven-fit-modes.md) +ADR, `single` mode is restricted to **exactly one** loaded experiment +(fit-mode availability is now driven by the loaded-experiment count: +`single`/`sequential` for one experiment, `joint` for two or more). With +no multi-experiment `single` loop, there is no shared structure to +overwrite and no earlier experiment to mis-plot. + +The legacy in-memory `_parameter_snapshots` store and the +`plot_param_series_from_snapshots` fallback that existed only for the +`single`-with-N path were removed as dead code. Parameter-evolution +plotting across a series is served by `sequential` fitting via +`analysis/results.csv` (`plot_param_series` / `plot_all_param_series`). + +**Related:** issue 78 (resolved). diff --git a/docs/dev/issues/index.md b/docs/dev/issues/index.md index 764aeaf16..59e09d8ec 100644 --- a/docs/dev/issues/index.md +++ b/docs/dev/issues/index.md @@ -14,7 +14,6 @@ individual issue files** — not here. | # | Issue | Priority | Type | | --- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------------------- | ----------------------------------------------------------- | -| 85 | [Retain Per-Experiment Fitted Parameters for Plotting](open/highest_retain-per-experiment-fitted-parameters-for-plotting.md) | `[priority] highest` | Correctness / UX | | 119 | [Model Sample Absorption (Debye–Scherrer, μR)](open/highest_model-sample-absorption-debye-scherrer-r.md) | `[priority] highest` | Physics / Engine feature | | 130 | [cryspy Diverges on TOF Jorgensen–Von Dreele Lorentzian](open/highest_cryspy-diverges-on-tof-jorgensen-von-dreele-lorentzian.md) | `[priority] highest` | Correctness | | 134 | [Investigate ed-crysfml TOF Jorgensen Profile Discrepancy](open/highest_investigate-ed-crysfml-tof-jorgensen-profile-discrepancy.md) | `[priority] highest` | Correctness | @@ -168,6 +167,7 @@ individual issue files** — not here. | 77 | [Add Help Methods to Public Discovery Facades](closed/add-help-methods-to-public-discovery-facades.md) | | 78 | [Add `SEQUENTIAL` to `FitModeEnum` and Show Methods to Analysis](closed/add-sequential-to-fitmodeenum-and-show-methods-to-analysis.md) | | 84 | [Serialise `None` as `.` in CIF Output](closed/serialise-none-as-in-cif-output.md) | +| 85 | [Retain Per-Experiment Fitted Parameters for Plotting](closed/retain-per-experiment-fitted-parameters-for-plotting.md) | | 100 | [Collapse Duplicate Predictive-Cache-Key Helpers](closed/collapse-duplicate-predictive-cache-key-helpers.md) | | 101 | [Remove Dead Branch in `_fit_state_categories`](closed/remove-dead-branch-in-fit-state-categories.md) | | 103 | [Make `_sync_engine_from_minimizer_category` Skip-Keys Declarative](closed/make-sync-engine-from-minimizer-category-skip-keys-declarative.md) | diff --git a/docs/dev/issues/open/highest_retain-per-experiment-fitted-parameters-for-plotting.md b/docs/dev/issues/open/highest_retain-per-experiment-fitted-parameters-for-plotting.md deleted file mode 100644 index 79cbd9e27..000000000 --- a/docs/dev/issues/open/highest_retain-per-experiment-fitted-parameters-for-plotting.md +++ /dev/null @@ -1,21 +0,0 @@ -# 85. Retain Per-Experiment Fitted Parameters for Plotting - -**Priority:** `[priority] highest` - -**Type:** Correctness / UX - -In `single` fit mode, only the last experiment's fit results are -retained because structure parameters are overwritten on each iteration. -This means earlier experiments cannot be plotted correctly after -fitting. - -**Fix:** after fitting each experiment, store a snapshot of its fitted -parameters (both structure and experiment) so that any experiment can be -re-plotted or inspected later. Also clarify: does `fit_results` need to -keep the last mutable parameter set after adding a snapshot? - -**Depends on:** nothing (issue 78 resolved). - -**Recommended-priority note:** In `single` mode only the last -experiment's results survive, so earlier experiments plot incorrectly -after fitting. **Tier 1 (do first).** diff --git a/docs/dev/issues/open/low_draw-adp-ellipsoids-for-beta-tensor-atoms.md b/docs/dev/issues/open/low_draw-adp-ellipsoids-for-beta-tensor-atoms.md index 2c462a55a..66279856c 100644 --- a/docs/dev/issues/open/low_draw-adp-ellipsoids-for-beta-tensor-atoms.md +++ b/docs/dev/issues/open/low_draw-adp-ellipsoids-for-beta-tensor-atoms.md @@ -13,5 +13,4 @@ conversion in the renderer (using the reciprocal cell), analogous to the existing B→U step. The model-layer β↔U conversion already exists (`AtomSite._convert_adp_values_beta`) and could be reused. -**Depends on:** the β-tensor ADP support -([adp-beta-tensor plan](docs/dev/plans/adp-beta-tensor.md)). +**Depends on:** the β-tensor ADP support (shipped in #199). diff --git a/docs/dev/issues/open/medium_more-intuitive-adp-creation-api-type-aware-kwargs.md b/docs/dev/issues/open/medium_more-intuitive-adp-creation-api-type-aware-kwargs.md index 9c6e0a35a..801939901 100644 --- a/docs/dev/issues/open/medium_more-intuitive-adp-creation-api-type-aware-kwargs.md +++ b/docs/dev/issues/open/medium_more-intuitive-adp-creation-api-type-aware-kwargs.md @@ -17,5 +17,4 @@ ADR — which deliberately chose type-neutral storage to keep parameter identity stable across switches — so it needs its own ADR + plan and is independent of the β-tensor work that surfaced it. -**Depends on:** the β-tensor ADP support -([adp-beta-tensor plan](docs/dev/plans/adp-beta-tensor.md)). +**Depends on:** the β-tensor ADP support (shipped in #199). diff --git a/docs/dev/plans/adp-beta-tensor.md b/docs/dev/plans/adp-beta-tensor.md deleted file mode 100644 index 003763450..000000000 --- a/docs/dev/plans/adp-beta-tensor.md +++ /dev/null @@ -1,461 +0,0 @@ -# Plan: β-tensor (dimensionless) anisotropic ADP support - -Follows the conventions in [`AGENTS.md`](../../../AGENTS.md). No -deliberate exceptions to those instructions are taken by this plan. - -## Status checklist - -See [Implementation steps (Phase 1)](#implementation-steps-phase-1) for -the per-step checklist. High-level: - -- [x] Phase 1 — Implementation (code + docs + ADR) -- [x] Phase 1 review gate -- [x] Phase 2 — Verification (tests + `pixi` gate) - -## ADR - -This feature **extends an existing accepted ADR**, -[`type-neutral-adp-parameters.md`](../adrs/accepted/type-neutral-adp-parameters.md). -No new ADR is required. The ADR already mandates type-neutral parameter -names (`adp_iso`, `adp_11`…`adp_23`) interpreted through -`atom_site.adp_type`; β is a third anisotropic interpretation of the -same `adp_11`…`adp_23` objects. The ADR is updated by this work with an -**Extension** section recording the β decision, the cell-dependent -conversion, and the dimensionless-units caveat (Phase 1, step P1.1). - -Related accepted ADRs to keep consistent (read before touching their -surfaces): -[`enum-backed-closed-values.md`](../adrs/accepted/enum-backed-closed-values.md) -(the `AdpTypeEnum` extension), -[`iucr-cif-tag-alignment.md`](../adrs/accepted/iucr-cif-tag-alignment.md) -and -[`python-cif-category-correspondence.md`](../adrs/accepted/python-cif-category-correspondence.md) -(the new `_atom_site_aniso.beta_*` CIF tags), -[`guarded-public-properties.md`](../adrs/accepted/guarded-public-properties.md) -(no new public setters added). - -## Branch & PR - -- Branch: `adp-beta-tensor` (flat slug off `develop`, no `feature/` - prefix). PR targets `develop`. Do not push until asked. - -## Background — current state - -Atomic displacement parameters today support four `AdpTypeEnum` members: -`Biso`, `Uiso`, `Bani`, `Uani` -(`src/easydiffraction/datablocks/structure/categories/atom_sites/enums.py`). -Per the type-neutral ADR the public values live in `atom_site.adp_iso` -(isotropic) and the always-present sibling collection `atom_site_aniso` -(`adp_11`…`adp_23`), with `atom_site.adp_type` selecting B-vs-U and -iso-vs-ani. - -The **dimensionless β tensor** — the third standard anisotropic ADP -convention, used by FullProf, SHELX-era data, and cryspy internally — is -**not** accepted on input or emitted on output. Today: - -- `AdpTypeEnum` has no `beta` member. -- The B↔U type-switch conversion in `AtomSite._convert_adp_values` is a - pure scalar (`B = 8π²U`); it has no cell access and cannot do the β - transform. -- The aniso `adp_ij` parameters declare `units='angstrom_squared'` and - display `Ų`; β components are dimensionless. -- The aniso **off-diagonal** validators (`adp_12`/`adp_13`/`adp_23`) - already use an unrestricted `RangeValidator()` (negatives allowed); - only the **diagonal** components (`adp_11`/`adp_22`/`adp_33`) carry - `RangeValidator(ge=0.0, le=10.0)`. β diagonals are ~1e-3 (well within - `[0, 10]`) and β off-diagonals (small, routinely negative) are already - accepted, so **no validator change is needed** for β (resolved Q4). -- `io/cif/iucr_writer.py::_adp_family()` returns `'B'` or `'U'` from the - first letter of `adp_type`; `'beta'` starts with `'b'` and would be - mis-classified as `'B'`. The writer is otherwise already - family-parameterized (`_atom_site_aniso.{family}_11`), so a `beta` - family slots in naturally. -- The cryspy backend already converts B/U→β via - `CryspyCalculator._update_aniso_beta` using - `β_ij = 2π²·U_ij·a*_i·a*_j`; native β input can pass through with no - conversion. -- The crysfml backend (`_convert_structure_to_dict`) sends **only** - `_B_iso_or_equiv` and `_adp_type` — it does **not** currently send the - anisotropic tensor at all. Anisotropic support on the crysfml side is - therefore a pre-existing gap, not introduced here; β on crysfml is out - of scope for this plan (see Open questions Q3). - -Conversion math (reciprocal lengths `a* b* c*`): - -``` -β_11 = 2π²·a*²·U_11 U_11 = β_11 / (2π²·a*²) -β_22 = 2π²·b*²·U_22 U_22 = β_22 / (2π²·b*²) -β_33 = 2π²·c*²·U_33 U_33 = β_33 / (2π²·c*²) -β_12 = 2π²·a*·b*·U_12 U_12 = β_12 / (2π²·a*·b*) -β_13 = 2π²·a*·c*·U_13 U_13 = β_13 / (2π²·a*·c*) -β_23 = 2π²·b*·c*·U_23 U_23 = β_23 / (2π²·b*·c*) -``` - -with `B = 8π²U`. A reciprocal-length computation already exists -**privately** in `display/structure/builder.py` (`_reciprocal_lengths`, -used for ADP-ellipsoid rendering), but it is not reusable from the model -layer where the type-switch conversion runs. This work adds a shared -`crystallography.reciprocal_cell_lengths` (the model-layer source of -truth) and routes the existing builder helper through it (consolidation, -step P1.9), rather than leaving two copies of the same math. - -## Decisions - -1. **β is a first-class stored anisotropic ADP type**, symmetric with - `Bani`/`Uani`. Add `AdpTypeEnum.BETA = 'beta'`. When - `adp_type == 'beta'`, the `adp_11`…`adp_23` objects hold the - dimensionless β components directly. Rationale: round-trips β CIFs, - matches the existing first-class Bani/Uani pattern, and feeds cryspy - (which is β-native) without a conversion. (Alternative — β as an - I/O-only encoding that converts to Uani on read — is recorded under - Open questions Q1 for the user to choose; this plan proceeds on the - first-class decision unless overridden.) - -2. **β has no isotropic form.** `'beta'` always implies anisotropic. - Switching _to_ `beta` from an isotropic type seeds the tensor like - the existing iso→ani path, then converts U→β; switching _from_ `beta` - to an isotropic type converts β→U and collapses the diagonal, reusing - the existing collapse path. - -3. **Cell-dependent conversion via a new easydiffraction geometry - helper.** Add a reciprocal-length helper (`a* b* c*` from - `a b c α β γ`) to - `src/easydiffraction/crystallography/crystallography.py` — the - existing crystallographic-math module (Wyckoff positions, space-group - symmetry constraints), which already hosts this kind of domain - geometry. (`core/` is the wrong home: it must stay domain-free; - `crystallography/` is the established place for crystallographic - math.) β↔U/B conversion routes through the parent structure's `cell`; - when no parent cell is reachable (atom constructed in isolation) the - type switch raises a clear error rather than silently producing wrong - numbers — this is a boundary-input edge case per §Project Context. Do - **not** depend on cryspy's reciprocal helper for the ed-side - conversion, so the model-layer type switch stays independent of any - calculator backend. The same shared helper replaces the duplicate - private `_reciprocal_lengths` in `display/structure/builder.py` (step - P1.9). - -4. **Type-aware display units for aniso components.** When the owning - `adp_type` is `beta`, the `adp_ij` display shows no `Ų` unit (β is - dimensionless); B/U families keep `Ų`. Implement as a display-layer - lookup on `adp_type`, not by mutating the Parameter's stored unit - metadata (which stays a single declared unit per the value model). - -5. **CIF tags.** Register `_atom_site_aniso.beta_11`…`beta_23` on the - aniso `adp_ij` `CifHandler` name lists (alongside the existing - `B_ij`/`U_ij`). Extend `_adp_family()` to return `'beta'` for - `adp_type == 'beta'` (check the full value, not the first letter), so - the writer emits a `beta` loop and groups β atoms separately. On - read, a CIF carrying `_atom_site_aniso.beta_*` sets - `adp_type = 'beta'` and stores the β values verbatim. **Both** CIF - writers need the `beta` family: the report writer - (`iucr_writer.py::_adp_family`, step P1.7) **and** the project - serializer (`io/cif/serialize.py` — new `_ADP_FAMILY_BETA`, a `beta` - branch in `_adp_family_from_type`, and a `beta` entry in the - `_group_items_by_adp_family` grouping dict, step P1.8). The - serializer is what makes project save/load round-trip β; without it β - atoms would silently serialize as a B loop. - -6. **cryspy backend (two β paths).** cryspy receives β via **two** - complementary routes, both implemented: - - **Dict passthrough (refinement steps).** In `_update_aniso_beta`, - the `BETA` branch writes the stored β straight into `cryspy_beta` - (no U→β transform); `BETA` is in the `aniso_types` set so `b_iso` - is zeroed for β atoms. This drives every refinement step. cryspy's - β convention matches CIF/SHELX `β_ij = 2π²·U_ij·a*_i·a*_j` (the - `calc_beta_by_u` path; resolved Q2). - - **CIF-build relabel (initial parse).** cryspy's CIF parser and - `apply_space_group_constraint` only understand U/B aniso tags, so - during structure-CIF generation - `_temporarily_convert_to_u_notation` - /`_stash_beta_atom_as_u`/`_beta_reciprocal_pairs` transiently - relabel a β atom as `Uani` with `U_ij = β_ij/(2π²·a*_i·a*_j)`, then - restore β. cryspy re-derives the identical `atom_beta` via the - exact inverse, so the net behaviour is preserved. (Added in Phase 2 - to fix a `u_11`-not-defined parse failure that only the full CIF - round-trip exposed; ed's `reciprocal_cell_lengths` must stay - consistent with cryspy's - `calc_reciprocal_by_unit_cell_parameters`.) - -7. **Symmetry-constraint write path bypasses validation during a fit.** - `AtomSites._apply_adp_symmetry_constraints` takes a - `called_by_minimizer` flag (mirroring the coordinate-constraint - pass). When the minimizer is driving the refinement it writes the - symmetry-averaged tensor components back through - `Parameter._set_value_from_minimizer` (raw, no validation) rather - than the `param.value` setter. This matters for β/U/B anisotropic - atoms on special positions: the minimizer explores trial values that - may be transiently below the diagonal `RangeValidator(ge=0, le=10)` - lower bound, and the averaged write-back can land out of range; the - validating setter would raise mid-fit and abort the refinement. The - interactive (non-minimizer) path keeps full validation. Covered by - `TestAdpSymmetryConstraintMinimizerBypass`. (Landed on the branch via - `1e3db602`; recorded here so the bypass is documented and tested.) - -> **Dropped (was decision 5): off-diagonal validator relaxation.** -> Verification against current code shows the aniso off-diagonal -> validators are already unrestricted `RangeValidator()`; only the -> diagonals carry `RangeValidator(ge=0.0, le=10.0)`, and β values sit -> within those bounds. No validator change is needed, so the earlier -> planned relaxation is removed from scope (resolved Q4). - -## Additional in-scope changes (recorded during implementation) - -Two user-requested changes landed on this branch beyond the original -plan steps. They are in scope but were not in the initial checklist, so -they are recorded here for an accurate Phase 2 scope. - -1. **Inline `create(adp_type='beta')` ergonomics.** Creating an atom - with an anisotropic `adp_type` and no attached parent previously - raised because the cell-dependent conversion has no reachable cell. - The `adp_type` setter now defers the conversion when the atom is - detached, and `AtomSites.add` materialises the aniso row, so - `structure.atom_sites.create(..., adp_type='beta')` yields a β atom - with zero-filled components ready for assignment. The three - single-crystal verification examples were switched to this inline - form (`18582ead`/`01446213`; examples `c5f08431`). Lightly touches - the area issue #135 / Q5 deferred — see Open questions. -2. **Jupyter dark mode + shared figure loader (`f76ea254`).** The - live-notebook loader (`ed-figures.js`) could not detect JupyterLab's - theme and only re-coloured annotation fonts, so the reflection- - comparison plot's metrics box kept its baked light-grey background in - dark mode. The loader now detects the JupyterLab (and baked) theme - and re-themes the metrics box background/border, and it is made the - single source of figure theme-sync, resize, and legend behaviour: the - standalone/report HTML path delegates to `window.edFigures` instead - of carrying duplicated inline post-scripts. Unrelated to the β tensor - but bundled here at the user's request; fully verified in Phase 2. - -## Open questions - -- **Q1 (representation). RESOLVED — first-class stored β** (confirmed - 2026-06-10). β is a persisted `adp_type` holding β values directly in - `adp_11`…`adp_23`; the I/O-only-normalise-to-`Uani` alternative is - rejected. Decision 1 stands. -- **Q2 (cryspy β convention). RESOLVED — convention matches** (confirmed - 2026-06-10 by code read). cryspy's `_update_aniso_beta` already - documents `β_ij = 2π²·U_ij·a*_i·a*_j` and routes B/U through cryspy's - `calc_beta_by_u` with reciprocal lengths from - `calc_reciprocal_by_unit_cell_parameters`. Native β can pass through - unscaled (decision 6, step P1.6). -- **Q3 (crysfml scope). RESOLVED — out of scope** (confirmed - 2026-06-10). β works on cryspy only this PR; crysfml's missing - anisotropic-tensor wiring is a pre-existing gap tracked separately, - not addressed here and not given a stop-gap. -- **Q4 (validator loosening). RESOLVED — no change needed** (confirmed - 2026-06-10 by code read). The aniso off-diagonal validators are - already unrestricted `RangeValidator()`; only diagonals carry - `RangeValidator(ge=0.0, le=10.0)`, and β values sit within those - bounds. The earlier planned relaxation (former decision 5) is dropped - from scope. -- **Q5 (creation-API UX). RESOLVED — separate ADR + plan** (confirmed - 2026-06-10). A richer atom-creation API (e.g. `b_iso=`/`u_iso=`/ - `beta=` convenience kwargs that set `adp_type` automatically, or - CIF-style auto-attach of the sibling iso/aniso values) would improve - discoverability but **revisits the accepted type-neutral ADP ADR** and - is independent of the β math. It is **out of scope** for this PR; - tracked as a follow-up issue in `docs/dev/issues/open/` and to be - designed in its own ADR + plan. This β work stays strictly inside the - type-neutral model (β is a fourth `adp_type`). -- **Q6 (β ADP-ellipsoid display). RESOLVED — deferred** (confirmed - 2026-06-10). Implementation found that `display/structure/builder.py` - draws ADP ellipsoids only for `Bani`/`Uani` and carries its own - reciprocal-length math. Drawing β ellipsoids would need a β→U - conversion in the renderer. That is **out of scope**: β atoms render - as spheres for now (no crash, no wrong ellipsoid), tracked as a - follow-up issue in `docs/dev/issues/open/` (step P1.9). The duplicate - `_reciprocal_lengths` helper is still consolidated onto the shared - helper (step P1.9); only the ellipsoid math is deferred. - -## Concrete files likely to change - -Source: - -- `src/easydiffraction/datablocks/structure/categories/atom_sites/enums.py` - — add `AdpTypeEnum.BETA`, its `description()` entry, `default()` - unaffected. -- `src/easydiffraction/datablocks/structure/categories/atom_sites/default.py` - — `_convert_adp_values` β branches; cell access for the transform; - iso↔β seeding/collapse hooks. -- `src/easydiffraction/datablocks/structure/categories/atom_site_aniso/default.py` - — `_atom_site_aniso.beta_*` CIF names; type-aware display units (no - validator change — off-diagonals already accept negatives). -- `src/easydiffraction/crystallography/crystallography.py` — new - reciprocal-cell length helper (`a* b* c*` from `a b c α β γ`; pure - geometry, sits with the existing crystallographic math). -- `src/easydiffraction/datablocks/structure/item/base.py` — only if the - cell back-reference for conversion needs wiring through the structure. -- `src/easydiffraction/io/cif/iucr_writer.py` — `_adp_family()` β case; - confirm `_atom_site_aniso_tags('beta')` and the section header read - correctly. -- `src/easydiffraction/io/cif/serialize.py` — `beta` ADP family for - project CIF round-trip (`_ADP_FAMILY_BETA`, `_adp_family_from_type`, - `_group_items_by_adp_family`). -- `src/easydiffraction/display/structure/builder.py` — route the private - `_reciprocal_lengths` through - `crystallography.reciprocal_cell_lengths` (consolidation); β atoms - render as spheres (ellipsoid display deferred, Q6). -- `src/easydiffraction/analysis/calculators/cryspy.py` — - `_update_aniso_beta` β passthrough; `aniso_types` includes `BETA`. -- `src/easydiffraction/datablocks/structure/item/base.py` and the - `atom_sites`/`atom_site_aniso` categories — extend the `{Bani, Uani}` - "is-anisotropic" membership sets to include `beta` (step P1.3). - -Docs / ADR: - -- `docs/dev/adrs/accepted/type-neutral-adp-parameters.md` — Extension - section (step P1.1). -- `docs/dev/issues/open/` — add a follow-up row for the ADP creation-API - UX (resolved Q5; step P1.1). - -Tests (Phase 2): - -- `tests/unit/easydiffraction/datablocks/structure/categories/test_atom_sites.py` -- `tests/unit/easydiffraction/datablocks/structure/categories/test_atom_site_aniso.py` -- `tests/functional/test_adp_switching.py` -- `tests/unit/easydiffraction/analysis/calculators/test_cryspy.py` -- `tests/unit/easydiffraction/io/cif/test_iucr_writer.py`, - `test_serialize.py` -- reciprocal-cell helper tests in - `tests/unit/easydiffraction/crystallography/test_crystallography.py`. -- `tests/unit/easydiffraction/display/structure/test_builder.py` — the - reciprocal-helper consolidation regression. - -## Implementation steps (Phase 1) - -Each step is one atomic commit. Stage only the files the step touches -(explicit paths). Commit locally before starting the next step. - -- [x] **P1.1 — ADP ADR extension + follow-up note.** Ensure the - _Extension_ section in `type-neutral-adp-parameters.md` records - the β decision (decisions 1–6 above): first-class `beta` type, - cell-dependent conversion, dimensionless-units display handling, - and the new CIF tags. Verify it states that off-diagonal negatives - are _already_ permitted (a statement of existing behaviour), not a - newly introduced relaxation. Also add the ADP creation-API UX - follow-up row to `docs/dev/issues/open/` (resolved Q5). Keep the - original Decision/Consequences intact. Stage the ADR and - `open.md`. Commit: `Extend type-neutral ADP ADR with beta tensor` -- [x] **P1.2 — Reciprocal-cell helper.** Add a pure-geometry helper - returning `(a*, b*, c*)` from `(a, b, c, α, β, γ)` to - `crystallography/crystallography.py` (the crystallographic-math - module). Commit: `Add reciprocal-cell length helper` -- [x] **P1.3 — `AdpTypeEnum.BETA`.** Add the member and its - `description()`. Extend the "is-anisotropic" `{Bani, Uani}` - membership sets to include `beta` so β atoms get an aniso row and - are treated as anisotropic: `item/base.py` (aniso-row sync), - `atom_sites/default.py` (symmetry-constrained flag and - collapse-from-aniso), and `atom_site_aniso/default.py` (iso-only - check). **Leave the B-vs-U sets** (`{Biso, Bani}` / - `{Uiso, Uani}`) unchanged — β is neither B nor U. **Leave - `display/structure/builder.py`'s display-shape sets unchanged** so - β atoms fall through to the sphere branch (ellipsoids deferred, - Q6). The cryspy `aniso_types` set is handled in P1.6. Search - `git grep -n "AdpTypeEnum\."`. Commit: - `Add beta member to AdpTypeEnum` -- [x] **P1.4 — aniso category: CIF names + display units.** Add - `_atom_site_aniso.beta_*` to the `adp_ij` CIF handlers (decision - 5); type-aware display units that suppress `Ų` when - `adp_type == 'beta'` (decision 4). No validator change — the - off-diagonals already accept negatives. Commit: - `Support beta tensor in atom_site_aniso category` -- [x] **P1.5 — type-switch conversion.** Extend `_convert_adp_values` - with β↔U/B branches using the reciprocal-cell helper and the - parent `cell`; raise a clear error when no cell is reachable; wire - iso↔β seeding/collapse. Commit: - `Convert ADP values to and from the beta tensor` -- [x] **P1.6 — cryspy passthrough.** Add the `BETA` branch in - `_update_aniso_beta` (store β directly, no U→β transform) and - include `BETA` in `aniso_types`. cryspy convention already - confirmed (resolved Q2). Commit: - `Pass beta tensor straight through to cryspy` -- [x] **P1.7 — CIF report writer family.** In `io/cif/iucr_writer.py`, - extend `_adp_family()` to return `'beta'` for `adp_type == 'beta'` - (full-value check, not first letter); verify the aniso loop tags - and section header for the β family. Commit: - `Emit beta tensor in the atom_site_aniso CIF loop` -- [x] **P1.8 — Project CIF serializer family.** In `io/cif/serialize.py` - add `_ADP_FAMILY_BETA`, return it from `_adp_family_from_type()` - for `adp_type == 'beta'`, and add a `beta` group to - `_group_items_by_adp_family()`. This is what makes project - save/load round-trip β. Commit: - `Round-trip beta tensor through the project CIF serializer` -- [x] **P1.9 — Consolidate reciprocal helper; defer β ellipsoids.** - Refactor `display/structure/builder.py::_reciprocal_lengths` to - call `crystallography.reciprocal_cell_lengths` (remove the - duplicate math; keep the `np.ndarray` return shape its callers - expect). β atoms keep rendering as spheres; add a follow-up issue - to `docs/dev/issues/open/` for β→U ADP-ellipsoid display (resolved - Q6). Commit: `Reuse shared reciprocal helper in structure builder` -- [x] **P1.10 — Phase 1 review gate.** No-code step. Mark complete, - commit the checklist update alone. Commit: - `Reach Phase 1 review gate` - -## Verification (Phase 2) - -Add/extend the tests listed under _Concrete files_, then run the gate. -Capture logs with the zsh-safe pattern when output is needed: - -``` -pixi run fix -pixi run check > /tmp/easydiffraction-check.log 2>&1; check_exit_code=$?; tail -n 200 /tmp/easydiffraction-check.log; exit $check_exit_code -pixi run unit-tests > /tmp/easydiffraction-unit.log 2>&1; unit_tests_exit_code=$?; tail -n 200 /tmp/easydiffraction-unit.log; exit $unit_tests_exit_code -pixi run integration-tests > /tmp/easydiffraction-integration.log 2>&1; integration_tests_exit_code=$?; tail -n 200 /tmp/easydiffraction-integration.log; exit $integration_tests_exit_code -pixi run script-tests > /tmp/easydiffraction-script.log 2>&1; script_tests_exit_code=$?; tail -n 200 /tmp/easydiffraction-script.log; exit $script_tests_exit_code -``` - -`pixi run test-structure-check` to confirm any new test module mirrors -its source path. `pixi run fix` regenerates the package-structure docs — -do not hand-edit those. - -Test coverage to add: - -- reciprocal-cell helper numerics (orthorhombic + triclinic case). -- β↔U and β↔B round-trip within tolerance, using a known cell. -- type switches: `Uani↔beta`, `Bani↔beta`, `Uiso→beta`, `beta→Biso`, - with parameter-object identity preserved (per the ADR) and a clear - error when no parent cell is present. -- regression guard: negative off-diagonal accepted (already-existing - behaviour that β relies on); diagonal still rejects negatives. -- CIF round-trip: read a `_atom_site_aniso.beta_*` loop → - `adp_type == 'beta'` → write emits the β loop unchanged (both - `iucr_writer` and the project `serialize` paths). -- project save/load round-trip: a β atom saved via `serialize.py` and - reloaded keeps `adp_type == 'beta'` and its β values (verifies the new - serializer `beta` family). -- builder helper consolidation: `display/structure/builder.py` - reciprocal lengths match the pre-refactor values for a known cell - (regression guard around the P1.9 change). -- cryspy: β atom zeroes `b_iso` and populates `atom_beta` with the - stored values. -- equivalent-iso (F1 regression): after a `Uani → beta` **and** a - `Bani → beta` switch, `adp_iso_as_b` and the serialized - `_atom_site.B_iso_or_equiv` column give the correct `B_eq` computed - from the β tensor (not a stale or double-scaled value); `_adp_iso` - collapses to a real `U_eq` after an update. -- report units (F2 regression): `report.data_context._descriptor_units` - returns unchanged units for representative **non-β** parameters — one - with a `display_handler`, one relying on the `_units` fallback — - across `latex`/`html`/`gui` contexts. -- report β tags (F4): the `iucr_writer` β round-trip emits - `_atom_site_aniso.beta_11`…`beta_23` (not a B/U-defaulted tag) under - an `Anisotropic ADP (beta)` section header. - -## Suggested Pull Request - -**Title:** Support the β-tensor anisotropic displacement convention - -**Description:** EasyDiffraction can now read, store, refine, and write -atomic displacement parameters in the dimensionless **β-tensor** -convention, alongside the existing B and U (isotropic and anisotropic) -forms. This lets you load structures and FullProf/SHELX-style data that -report anisotropic displacements as β values without hand-converting -them, switch an atom between β, U, and B with the values converted -automatically using the unit cell, and export β tensors back to CIF. -Off-diagonal displacement components may now be negative, as the physics -requires. You can also set `adp_type='beta'` directly when creating an -atom and fill in the components afterwards. - -This PR also fixes the interactive plots: figures now respect -JupyterLab's dark theme, so the comparison plot's metrics box no longer -stays light-grey in dark mode. diff --git a/docs/dev/plans/dataset-driven-fit-modes.md b/docs/dev/plans/dataset-driven-fit-modes.md new file mode 100644 index 000000000..be0fbb952 --- /dev/null +++ b/docs/dev/plans/dataset-driven-fit-modes.md @@ -0,0 +1,292 @@ +# Plan: Dataset-Driven Fit Mode Availability + +Governed by [`AGENTS.md`](../../../AGENTS.md). No deliberate exceptions +to those instructions are taken in this plan. + +Implements ADR +[`dataset-driven-fit-modes`](../adrs/accepted/dataset-driven-fit-modes.md) +(promoted from a suggestion to `accepted/` in P1.8, per §Change +Discipline). Closes issue **85 — Retain Per-Experiment Fitted Parameters +for Plotting** +([`closed/retain-per-experiment-fitted-parameters-for-plotting.md`](../issues/closed/retain-per-experiment-fitted-parameters-for-plotting.md)) +by removing the `single`-with-N path that caused it. + +## ADR + +This plan **owns** the ADR `dataset-driven-fit-modes`. Phase 1 promotes +it to `accepted/` (P1.8). The change also extends the accepted +[`fit-mode-categories`](../adrs/accepted/fit-mode-categories.md) ADR but +does not modify it. + +## Summary of the change + +- Fit-mode availability becomes **applicability-based**: + `fitting_mode.show_supported()` lists only the modes whose + applicability predicate the loaded project satisfies — `single` + (exactly one loaded experiment), `joint` (≥2), `sequential` (exactly + one loaded experiment). Readiness — each scheduled experiment has + measured data, and `sequential` has a resolvable `data_dir` matching + files — is checked at `fit()`, not for visibility. +- `single` is restricted to exactly one experiment; the multi-experiment + loop (`single`-with-N) is removed, which **closes issue 85** by + construction. +- The now-orphaned `_parameter_snapshots` / + `plot_param_series_from_snapshots` fallback is deleted; + parameter-evolution plotting stays on the `sequential` `results.csv` + path. +- `sequential` keeps its folder-sweep behaviour; its data-source config + gains a clear "no files" fit-time error and a `copy_data` flag + (default off) with a defined, idempotent round-trip contract. (The + template-derived `file_pattern` default from the ADR is deferred — see + Decision 4.) + +## Decisions (from the ADR — see it for rationale) + +1. Applicability drives `show_supported()`; readiness is enforced only + at `fit()`. The `.type` setter stays permissive (any mode), so CIF + restore keeps a stored mode and rejection happens at fit time. +2. **Applicability is by total loaded-experiment count; measured-data is + readiness.** `single`/`sequential` are applicable when the project + holds exactly one loaded experiment, `joint` when it holds ≥2 — + regardless of measured-vs-calculated. The requirement that each + scheduled experiment actually has measured data is a **readiness** + check enforced at `fit()` by the existing + `Fitter._require_measured_data` guard (a calculated-only experiment + yields a clear fit-time error, not a hidden mode). This keeps + `show_supported()` and `fit()` consistent for mixed + measured/calculated projects, and avoids any "schedule only the + measured subset" filtering. **This refines the ADR's "experiment with + measured data" applicability wording into the applicability/readiness + split; the ADR text is aligned at promotion (P1.8).** +3. `copy_data` (default `False`): at `fit()`, before the sweep, copy + matched files into `/data/sequential/`, rewrite persisted + `data_dir` to that relative destination (portability), overwrite on + name conflict; only the `sequential_fit` fields are serialized. The + copy is **idempotent**: when the resolved source directory is already + the copy destination (the post-reload case), the copy is skipped and + the run uses the archived files. +4. `data_dir` has no smart default. **`file_pattern` keeps its current + `'*'` default for the first step**; the ADR's "derive the glob from + the template experiment's data-file extension" is deferred because + the experiment model does not retain its source data-file path today + (deriving it requires new experiment source-path metadata + its + persistence and tests). The ADR's stated `'*'` fallback is therefore + the shipped first-step behaviour; the derived default is a tracked + follow-up. The ADR text is aligned at promotion (P1.8). + +## Open questions + +- **`fitting-exercise-si-lbco.py` pedagogy (P1.7) — RESOLVED.** The + audit in P1.7 found this tutorial uses two separate one-experiment + projects (not multiple experiments in one project), so it does not + rely on `single`-with-N and needs no change. No pedagogy decision is + required. +- **Resume** stays `single`-only as today; per-point `sequential` resume + is out of scope (ADR Open Questions). +- **Template-derived `file_pattern` (deferred).** Needs new experiment + source-data-path metadata (plus persistence and tests). The first step + ships the `'*'` default; a follow-up adds the source-path metadata and + derives `*.ext` from it. Tracked as Deferred Work in the ADR at + promotion (P1.8). + +## Concrete files likely to change + +Phase 1 (implementation): + +- `src/easydiffraction/analysis/categories/fitting_mode/default.py` — + `_supported_types(filters)` returns applicable modes. +- `src/easydiffraction/analysis/analysis.py` — `_supported_filters_for` + context for `fitting_mode` (loaded-experiment count); fit-time + applicability validation in `_validate_fit_request`; collapse + `_fit_single_experiments` to one experiment; remove + `_parameter_snapshots` (line ~602) and `_snapshot_params` (~3009); + sequential data-source resolution + `copy_data` handling; check + `_help_filter` / `_serializable_categories`. +- `src/easydiffraction/analysis/categories/sequential_fit/default.py` — + add `copy_data` `BoolDescriptor` + property (mirror `reverse`). +- `src/easydiffraction/analysis/sequential.py` — no-files error; + idempotent copy (skip self-copy) + `data_dir` rewrite. +- `src/easydiffraction/display/plotting.py` — remove + `plot_param_series_from_snapshots` and the snapshot fallback branches + in `plot_param_series`, `plot_all_param_series`, + `_collect_fitted_parameter_unique_names`; no-CSV path warns and + returns. +- `docs/docs/tutorials/fitting-exercise-si-lbco.py` (+ regenerated + `.ipynb`) — stop relying on `single`-with-N. +- `docs/dev/issues/...` — move issue 85 to `closed/`, update index. +- `docs/dev/adrs/...` — promote ADR to `accepted/`, update index, fix + status/links. + +Phase 2 (verification — tests): + +- `tests/unit/easydiffraction/analysis/categories/test_fitting_mode*` / + `test_fitting*` — applicability predicates given experiment counts. +- `tests/unit/easydiffraction/analysis/test_analysis*` — fit-time + validation errors; single-exactly-one. +- `tests/.../sequential_fit` + + `tests/integration/fitting/test_sequential.py` — `copy_data` + round-trip, idempotent self-copy (source == destination skips), + no-files error. +- `tests/unit/easydiffraction/display/test_plotting_coverage.py`, + `tests/.../test_analysis_coverage.py`, + `tests/integration/fitting/test_analysis_and_fit_category_support.py` + — drop snapshot-based expectations; assert no-CSV warns. + +## Branch and PR notes + +- Branch: `dataset-driven-fit-modes` (already checked out). +- PR targets `develop`. Do not push unless asked. + +## Implementation steps (Phase 1) + +Per §Planning and §Commits: when an AI agent follows this plan, every +completed Phase 1 step is staged with **explicit paths** and committed +locally (atomic, single-purpose) before moving on. Mark each `- [ ]` as +`- [x]` in the same commit that completes it. + +Applicability/readiness contract used across P1.1–P1.3 (Decision 2): +applicability counts **total loaded experiments**; the measured-data +requirement is a **readiness** check left to the existing +`Fitter._require_measured_data` guard at `fit()`. No "schedule only the +measured subset" filtering is introduced. + +- [x] **P1.1 — Applicability-based `show_supported()`.** Add an Analysis + helper that returns the loaded-experiment count + (`len(project.experiments)`); have `_supported_filters_for` pass + that count to the `fitting_mode` category; implement + `FittingMode._supported_types(filters)` to return `single` (count + == 1), `joint` (count ≥ 2), `sequential` (count == 1). Files: + `analysis.py`, `fitting_mode/default.py`. Commit: + `Offer fit modes by project applicability` + +- [x] **P1.2 — Enforce mode preconditions at fit time.** In + `_validate_fit_request`, reject a selected mode whose + applicability predicate the project does not meet — + `single`/`sequential` unless exactly one experiment is loaded, + `joint` unless ≥2 — with a clear `ValueError` naming the valid + modes. Keep the `.type` setter permissive so CIF restore is not + rejected. Measured-data readiness stays with + `Fitter._require_measured_data` (no change needed there). Files: + `analysis.py`. Commit: + `Validate fit mode against loaded experiments` + +- [x] **P1.3 — Restrict `single` to exactly one experiment.** With P1.2 + guaranteeing exactly one loaded experiment for `single`, collapse + `_fit_single_experiments` to fit that one experiment (no loop); + drop the per-experiment `_snapshot_params` call. Files: + `analysis.py`. Commit: `Fit a single experiment in single mode` + +- [x] **P1.4 — Remove the `single`-with-N snapshot machinery.** Delete + `_parameter_snapshots` and `_snapshot_params`; remove + `plot_param_series_from_snapshots` and the snapshot fallback + branches in `plot_param_series`, `plot_all_param_series`, and + `_collect_fitted_parameter_unique_names`; when no `results.csv` + exists, log a clear warning and return. Files: `analysis.py`, + `display/plotting.py`. Commit: + `Remove single-mode parameter snapshot fallback` + +- [x] **P1.5 — Sequential data source: no-files error, `copy_data`.** + Add `copy_data` `BoolDescriptor` (+ property, default `False`) to + `SequentialFit` (mirror `reverse`). In sequential resolution: + raise a clear `ValueError` when `data_dir` is unset/unresolvable + or matches no files (keeping the current `'*'` `file_pattern` + default — the template-extension derivation is deferred, Decision + 4). When `copy_data` is set, apply the **idempotent** copy: if the + resolved source directory is already the copy destination + (`/data/sequential/`, the post-reload case), skip the + copy and run from the archived files; otherwise copy matched files + there (overwrite on name conflict) and rewrite the persisted + `data_dir` to that relative destination. Files: + `sequential_fit/default.py`, `sequential.py`, `analysis.py`. + Commit: `Add copy_data and clearer sequential data resolution` + +- [x] **P1.6 — Reconcile display/serialization filters.** Verify + `_help_filter` and `_serializable_categories` reflect the active + mode under the new model (sequential config hidden in `single`, + etc.); adjust only if needed. Files: `analysis.py`. Commit: + `Align analysis category visibility with fit modes` + +- [x] **P1.7 — Audit tutorials for `single`-with-N (no change needed).** + An audit of every `docs/docs/tutorials/*.py` found **no** tutorial + fits ≥2 experiments in `single`/default mode: + `fitting-exercise-si-lbco.py` uses two _separate_ projects with + one experiment each (`sim_si`, `sim_lbco`) — its repeated `fit()` + calls are progressive single-experiment refinements, not + `single`-with-N; the multi-experiment tutorials + (`calibrate-beer-ess`, `joint-si-bragg-pdf`, `refine-ncaf-wish`) + all set `joint`/`sequential` before fitting. So the earlier + premise that the Si/LBCO exercise relied on `single`-with-N was + incorrect, and no tutorial edit (or `notebook-prepare`) is + required. Commit: `Confirm no tutorial relies on single-with-N` + +- [x] **P1.8 — Close issue 85 and promote the ADR.** `git mv` issue 85 + to + `closed/retain-per-experiment-fitted-parameters-for-plotting.md`, + rewrite its body to describe the resolution (single restricted to + one dataset; snapshot fallback removed), and update + `docs/dev/issues/index.md`. Promote the ADR: `git mv` + `docs/dev/adrs/suggestions/dataset-driven-fit-modes.md` → + `accepted/`, set its `## Status` to `Accepted`, flip its + `docs/dev/adrs/index.md` row to `Accepted` with the `accepted/...` + link, fix any links that pointed at `suggestions/...` + (`git grep -n`), and remove the `_review-*`/`_reply-*` siblings if + still present. **Align the ADR text to the refinements made during + plan review:** Decision 1/applicability wording → + loaded-experiment count with measured-data as readiness (plan + Decision 2); Decision 4 → move the template-derived `file_pattern` + default to Deferred Work and state `'*'` as the shipped default + (plan Decision 4); Decision 3 → note the idempotent self-copy + skip. Files: `docs/dev/issues/...`, `docs/dev/adrs/...`. Commit: + `Close issue 85 and accept dataset-driven fit modes ADR` + +- [x] **P1.9 — Phase 1 review gate.** No-code step. Mark `[x]`, commit + the checklist update alone with message + `Reach Phase 1 review gate`, then stop for Phase 1 review. + +## Verification (Phase 2) + +Add/update tests first (see "Concrete files" Phase 2 list), then run the +full suite. Use the zsh-safe log-capture pattern from §Workflow. + +```bash +pixi run fix +pixi run check > /tmp/easydiffraction-check.log 2>&1; check_exit_code=$?; tail -n 200 /tmp/easydiffraction-check.log; exit $check_exit_code +pixi run unit-tests > /tmp/easydiffraction-unit.log 2>&1; unit_tests_exit_code=$?; tail -n 200 /tmp/easydiffraction-unit.log; exit $unit_tests_exit_code +pixi run integration-tests > /tmp/easydiffraction-integration.log 2>&1; integration_tests_exit_code=$?; tail -n 200 /tmp/easydiffraction-integration.log; exit $integration_tests_exit_code +pixi run script-tests > /tmp/easydiffraction-script.log 2>&1; script_tests_exit_code=$?; tail -n 200 /tmp/easydiffraction-script.log; exit $script_tests_exit_code +pixi run test-structure-check +``` + +Notes: + +- `pixi run fix` regenerates `docs/dev/package-structure/full.md` and + `short.md`; include them only if changed. +- Tutorial project-path collisions, benchmark CSVs, and sandbox-only + multiprocessing failures: handle per §Workflow. + +## Status checklist + +- [x] P1.1 — Applicability-based `show_supported()` +- [x] P1.2 — Enforce mode preconditions at fit time +- [x] P1.3 — Restrict `single` to exactly one experiment +- [x] P1.4 — Remove the `single`-with-N snapshot machinery +- [x] P1.5 — Sequential data source: no-files error, `copy_data` +- [x] P1.6 — Reconcile display/serialization filters +- [x] P1.7 — Update tutorials relying on `single`-with-N +- [x] P1.8 — Close issue 85 and promote the ADR +- [x] P1.9 — Phase 1 review gate +- [x] Phase 2 — tests added and full verification suite green + +## Suggested Pull Request + +**Title:** Clearer fit modes that match your loaded data + +**Description:** EasyDiffraction now offers only the fitting modes that +make sense for what you have loaded: with one dataset you get a normal +single fit (and can switch to a sequential scan over a folder of files); +with several datasets you get a combined (joint) fit. This removes a +confusing case where fitting several datasets one-by-one could make +earlier datasets plot incorrectly. Sequential scans are also clearer to +set up: you get a clear message when no data files are found, and a new +option can copy the scanned files into your project so it stays +self-contained and portable. diff --git a/docs/dev/plans/rename-asym-empir-to-beba.md b/docs/dev/plans/rename-asym-empir-to-beba.md deleted file mode 100644 index 3a6e0dbd6..000000000 --- a/docs/dev/plans/rename-asym-empir-to-beba.md +++ /dev/null @@ -1,287 +0,0 @@ -# Plan: Rename empirical asymmetry to Bérar–Baldinozzi (`asym_beba_*`) - -Follows [`AGENTS.md`](../../../AGENTS.md). No deliberate exceptions to -those instructions. - -## ADR - -No new ADR is required. This renames parameters and identifiers within -the **existing** switchable `peak` profile category (one peak-profile -type); it does not change the switchable-category mechanism, the factory -contract, or persistence design. Touches but does not alter the -decisions in -[`switchable-category-owned-selectors`](../adrs/accepted/switchable-category-owned-selectors.md) -and -[`edstar-project-persistence`](../adrs/accepted/edstar-project-persistence.md) -(the persistence handler inventory lists the renamed tags and must be -kept in sync). - -## Why - -The four `asym_empir_1…4` parameters are the **Bérar–Baldinozzi (1993)** -empirical asymmetry correction. The generic `empir` tag hides the model -and the meaningless `_1…_4` suffixes hide each coefficient's role. This -plan renames them to a model-named, coefficient-named scheme consistent -with the sibling `asym_fcj_*` set, and adds a short, neutral note to the -affected Verification pages about the cryspy/FullProf implementation -difference. Implements the **rename half** of issue 133 and surfaces the -finding documented in issue 166 (issue 133 stays open for the separate -"add the physical FCJ model" item). - -## Branch + PR - -- Planned branch: `rename-asym-empir-to-beba` (flat slug off `develop`). -- **Deliberate exception (AGENTS.md §Planning).** Phase 1 was - implemented on the **current `sequential-fix` branch** at the user's - explicit instruction ("draft-impl-1 … in the current branch") instead - of creating the planned `rename-asym-empir-to-beba` branch. The beba - Phase 1 commits therefore sit on top of pre-existing, unrelated - `sequential-fix` commits (e.g. - `d18d36b8 Use id instead of label in lab6 nosldl atom sites`), which - are **not** part of this rename and were not produced by this work. - When the PR is opened, scope the base / commit range to exclude those - unrelated commits, or land the rename via this branch as the user - directs. -- PR targets `develop`, not `master`. Do not push unless asked. - -## Naming scheme - -`asym_empir` → `asym_beba` (BÉrar–BAldinozzi), with the numeric suffixes -replaced by the paper's coefficient names. Confirmed mapping (cryspy -`asymmetry_parameters[0..3]` and the CIF maps in both calculators): - -| current | new | paper coeff | basis term | FullProf | cryspy slot | -| -------------- | -------------- | ----------- | ------------- | ----------- | ----------- | -| `asym_empir_1` | `asym_beba_a0` | `A₀` | `Fa / tan θ` | `Asy1`/`P1` | `[0]` | -| `asym_empir_2` | `asym_beba_b0` | `B₀` | `Fb / tan θ` | `Asy2`/`P2` | `[1]` | -| `asym_empir_3` | `asym_beba_a1` | `A₁` | `Fa / tan 2θ` | `Asy3`/`P3` | `[2]` | -| `asym_empir_4` | `asym_beba_b1` | `B₁` | `Fb / tan 2θ` | `Asy4`/`P4` | `[3]` | - -Companion identifiers (parallel to the `Fcj…` set): - -| kind | current | new | -| ---------------- | -------------------------------------- | ----------------------------------------------- | -| mixin | `EmpiricalAsymmetryMixin` | `BerarBaldinozziAsymmetryMixin` | -| peak class | `CwlPseudoVoigtEmpiricalAsymmetry` | `CwlPseudoVoigtBerarBaldinozziAsymmetry` | -| enum member | `CWL_PSEUDO_VOIGT_EMPIRICAL_ASYMMETRY` | `CWL_PSEUDO_VOIGT_BERAR_BALDINOZZI_ASYMMETRY` | -| enum value | `cwl-pseudo-voigt-empirical-asymmetry` | `cwl-pseudo-voigt-berar-baldinozzi-asymmetry` | -| user type string | `pseudo-voigt + empirical asymmetry` | `pseudo-voigt + berar-baldinozzi asymmetry` | -| edi tag | `_peak.asym_empir_N` | `_peak.asym_beba_{a0,b0,a1,b1}` | -| CIF tag | `_easydiffraction_peak.asym_empir_N` | `_easydiffraction_peak.asym_beba_{a0,b0,a1,b1}` | - -Each `Parameter` gains a `DisplayHandler` (matching the `broad_*` -style): display names `A₀, B₀, A₁, B₁`, LaTeX -`$A_0$, $B_0$, $A_1$, $B_1$`, and a description that states the model -and term, e.g. _"Bérar–Baldinozzi asymmetry coefficient B₀ (Fb/tan θ -term)"_. - -## Neutral mismatch note (verbatim text to add to Verification pages) - -Phrased to **not** assign correctness to either program: - -> **Note — cryspy vs FullProf.** cryspy and FullProf implement the -> Bérar–Baldinozzi empirical asymmetry with different conventions. As a -> result the four `asym_beba_*` parameters do **not** transfer -> one-to-one between cryspy and FullProf — the same numbers give -> different profiles. The refinement below frees the asymmetry so it can -> absorb this convention difference; the structural results are -> unaffected. See development issue 166 for the detailed comparison. - -## Decisions - -- Use `beba` (not `bb` or full `berar_baldinozzi`): matches the short - model-tag style of `fcj`, while the coefficient suffixes (`a0`, `b0`, - `a1`, `b1`) carry the physical meaning; full "Bérar–Baldinozzi" lives - in every `description`/docstring for discoverability. -- Beta software, **no legacy shims** (AGENTS.md): old CIF/edi tags and - the old type string are removed, not aliased. Saved projects using the - old tags will not round-trip; this is acceptable for beta. -- The Verification note assigns no correctness verdict (per request); - the detailed who-matches-the-paper analysis stays in issue 166 only. -- Rename the dedicated empirical page file - `pd-neut-cwl_pv-asym_empir_pbso4` → `pd-neut-cwl_pv-beba_pbso4` (keeps - the page slug consistent with the new parameter tag). - -## Open questions - -1. Page-file rename — confirm `pd-neut-cwl_pv-beba_pbso4` as the new - slug (vs keeping the old filename). Plan assumes the rename. -2. Display glyphs: use Unicode subscripts (`A₀`) in `display_name`, or - ASCII (`A0`)? Plan assumes Unicode to match the `deg²` precedent; - fall back to ASCII if any renderer/test objects. -3. Should issue 133 be edited to mark the rename done (leaving only the - FCJ-model item)? Plan does this in P1.8; revert if you'd rather keep - 133 untouched until the FCJ work lands. - -## Concrete files likely to change - -Source: - -- `src/easydiffraction/datablocks/experiment/categories/peak/cwl_mixins.py` -- `src/easydiffraction/datablocks/experiment/categories/peak/cwl.py` -- `src/easydiffraction/datablocks/experiment/categories/peak/__init__.py` -- `src/easydiffraction/datablocks/experiment/categories/peak/factory.py` -- `src/easydiffraction/datablocks/experiment/item/enums.py` -- `src/easydiffraction/analysis/calculators/cryspy.py` -- `src/easydiffraction/analysis/calculators/crysfml.py` - -Verification / docs / tutorials: - -- `docs/docs/verification/pd-neut-cwl_pv-asym_empir_pbso4.py` → renamed - `…/pd-neut-cwl_pv-beba_pbso4.py` (+ regenerated `.ipynb`) -- `docs/docs/verification/pd-neut-cwl_pv-beta_y2o3.py` (+ `.ipynb`) -- `docs/docs/verification/index.md`, - `docs/docs/verification/ci_skip.txt` -- `docs/mkdocs.yml` -- `docs/docs/tutorials/refine-cosio-d20.py`, - `docs/docs/tutorials/refine-hs-hrpt.py` (+ regenerated `.ipynb`) -- `docs/docs/user-guide/parameters/experiment/peak.md` -- `docs/dev/adrs/accepted/edstar-project-persistence.md`, - `docs/dev/adrs/accepted/edstar-project-persistence/handler-inventory.json`, - `docs/dev/adrs/accepted/python-cif-category-correspondence.md` -- `docs/dev/issues/open/high_rename-asym-empir-and-add-the-physical-fcj-asymmetry-model.md` - (mark rename done) -- `docs/dev/issues/open/medium_cryspy-fullprof-berar-baldinozzi-empirical-asymmetry-convention-mismatch.md` - (issue 166 — rewrite current-name references to the new tags/slug; see - P1.8) -- `docs/dev/package-structure/full.md` (regenerated by `pixi run fix` in - Phase 2 — do not hand-edit) - -Tests (existing references only — update, do not add new tests in P1): - -- `tests/unit/easydiffraction/datablocks/experiment/categories/peak/test_cwl.py` -- `tests/.../peak/test_cwl_mixins.py`, `…/peak/test_factory.py` -- `tests/.../experiment/item/test_base.py`, `test_base_coverage.py`, - `test_factory.py` -- `tests/unit/easydiffraction/io/cif/test_parse.py` - -Use -`git grep -n "asym_empir\|empirical asymmetry\|EmpiricalAsymmetry\|empirical-asymmetry"` -to catch every occurrence before each commit. - -## Implementation steps (Phase 1) - -Each `- [ ]` is one atomic commit. Stage only the listed paths. -AGENTS.md applies: commit each completed step locally before starting -the next. - -- [x] **P1.1 — Rename model parameters and mixin.** In `cwl_mixins.py` - rename `EmpiricalAsymmetryMixin` → - `BerarBaldinozziAsymmetryMixin`; rename the four `_asym_empir_*` - parameters, properties, and setters to `asym_beba_{a0,b0,a1,b1}`; - update `name=`, `description=`, `DisplayHandler` (add `A₀…B₁`), - and `edi_names`/`cif_names` (`_peak.asym_beba_*`). Commit: - `Rename empirical asymmetry params to beba in peak mixin` - -- [x] **P1.2 — Rename peak class, enum, factory, package init.** Update - `cwl.py` (`CwlPseudoVoigtBerarBaldinozziAsymmetry`, mixin import, - docstrings), `enums.py` (member, value - `cwl-pseudo-voigt-berar-baldinozzi-asymmetry`, `description()` - text), `factory.py` (type string - `pseudo-voigt + berar-baldinozzi asymmetry`), `peak/__init__.py` - (import the renamed class). Commit: - `Rename pseudo-Voigt empirical-asymmetry peak type to beba` - -- [x] **P1.3 — Update calculators.** In `cryspy.py` update the four - `experiment.peak.asym_empir_* → asym_beba_*` reads and the CIF map - (`asym_beba_a0 → _pd_instr_reflex_asymmetry_p1`, etc., preserving - the `p1…p4` order). Same CIF map in `crysfml.py`. Commit: - `Map beba asymmetry params in cryspy and crysfml calculators` - -- [x] **P1.4 — Update existing test references.** Mechanically rename - `asym_empir_*`, the mixin/class/enum names, and the type string - across the test files listed above. No new tests (those belong to - Phase 2). Commit: `Update tests for beba asymmetry rename` - -- [x] **P1.5 — Rename and update the PbSO₄ empirical Verification - page.** `git mv` `pd-neut-cwl_pv-asym_empir_pbso4.py` → - `pd-neut-cwl_pv-beba_pbso4.py`; update parameter names; add the - neutral mismatch note (markdown cell); update its FullProf - reference folder reference if the slug is embedded. Update - `verification/index.md`, `ci_skip.txt`, and `mkdocs.yml` to the - new slug. Run `pixi run notebook-prepare`; stage the regenerated - `.ipynb`. Commit: - `Rename pbso4 empirical page to beba and add mismatch note` - -- [x] **P1.6 — Update the Y₂O₃ Verification page.** In - `pd-neut-cwl_pv-beta_y2o3.py` rename - `asym_empir_1/2 → asym_beba_a0/b0` (and the `FULLPROF_ASY_*` - constants/comments as needed); replace the existing "refines to - opposite sign (cryspy #50)" wording with the neutral mismatch - note. Run `pixi run notebook-prepare`; stage the regenerated - `.ipynb`. Commit: - `Use beba names and neutral mismatch note on y2o3 page` - -- [x] **P1.7 — Update tutorials.** Rename the parameter references in - `refine-cosio-d20.py` and `refine-hs-hrpt.py`. Run - `pixi run notebook-prepare`; stage the regenerated `.ipynb` - siblings. Commit: `Use beba asymmetry names in tutorials` - -- [x] **P1.8 — Update remaining docs, issue 133, and issue 166.** Update - `user-guide/parameters/experiment/peak.md`, the persistence ADR + - `handler-inventory.json`, and - `python-cif-category-correspondence.md` to the new tags. In issue - 133, mark the rename item done (leave the "add physical FCJ model" - item open). In **issue 166** rewrite every current-name reference - to the new terminology — `asym_empir_1…4` → - `asym_beba_a0/b0/a1/b1`, the page slug - `pd-neut-cwl_pv-asym_empir_pbso4` → `pd-neut-cwl_pv-beba_pbso4`, - and the type string / class / enum names — so the open mismatch - issue does not point readers at removed names; preserve any - deliberate historical "formerly `asym_empir_*`" context that stays - useful (e.g. the FullProf-side `P1…P4`/`Asy1…4` labels and any - "previously named" aside). Add a cross-reference to issue 166 from - the affected pages where natural. Commit: - `Update docs, issue 133, and issue 166 for beba asymmetry rename` - -- [x] **P1.9 — Phase 1 review gate.** No-code step; mark `[x]` and - commit the checklist update alone. Commit: - `Reach Phase 1 review gate` - -## Phase 2 — Verification - -Run from the repo root; capture logs with the zsh-safe pattern when -output is needed for analysis. - -- `pixi run fix` — regenerates - `docs/dev/package-structure/{full,short}.md`; commit those with the - auto-fixes. - `pixi run fix > /tmp/edi-fix.log 2>&1; fix_exit_code=$?; tail -n 80 /tmp/edi-fix.log; exit $fix_exit_code` -- `pixi run check > /tmp/edi-check.log 2>&1; check_exit_code=$?; tail -n 200 /tmp/edi-check.log; exit $check_exit_code` -- `pixi run unit-tests > /tmp/edi-unit.log 2>&1; unit_tests_exit_code=$?; tail -n 120 /tmp/edi-unit.log; exit $unit_tests_exit_code` -- `pixi run integration-tests > /tmp/edi-int.log 2>&1; integration_tests_exit_code=$?; tail -n 120 /tmp/edi-int.log; exit $integration_tests_exit_code` -- `pixi run script-tests > /tmp/edi-script.log 2>&1; script_tests_exit_code=$?; tail -n 120 /tmp/edi-script.log; exit $script_tests_exit_code` -- Confirm `pixi run test-structure-check` still passes (any renamed test - file must keep mirroring its renamed source). - -Add/adjust **new** unit tests in Phase 2 only: a parse/round-trip test -for the new `_peak.asym_beba_*` CIF tags, and a peak-factory test for -the new type string and class. - -## Status checklist - -- [x] P1.1 model params + mixin -- [x] P1.2 peak class + enum + factory + init -- [x] P1.3 calculators -- [x] P1.4 existing test references -- [x] P1.5 PbSO₄ page rename + note -- [x] P1.6 Y₂O₃ page note -- [x] P1.7 tutorials -- [x] P1.8 docs + issues 133 & 166 -- [x] P1.9 Phase 1 review gate -- [ ] Phase 2 verification green - -## Suggested Pull Request - -**Title:** Clearer names for the Bérar–Baldinozzi peak-asymmetry -settings - -**Description:** The four "empirical asymmetry" settings on a -constant-wavelength pseudo-Voigt peak are now named after the -Bérar–Baldinozzi model they implement — `asym_beba_a0`, `asym_beba_b0`, -`asym_beba_a1`, `asym_beba_b1` — so each value's role is clear instead -of an anonymous `1…4`. The relevant verification pages gain a short note -explaining that cryspy and FullProf apply this asymmetry with different -conventions, so the same parameter values are not directly -interchangeable between the two programs. No change to fitted structures -or to any other peak setting. diff --git a/src/easydiffraction/analysis/analysis.py b/src/easydiffraction/analysis/analysis.py index 4ab0bf7e3..e48d5420d 100644 --- a/src/easydiffraction/analysis/analysis.py +++ b/src/easydiffraction/analysis/analysis.py @@ -599,7 +599,6 @@ def __init__(self, project: object) -> None: self._persisted_fit_state_sidecar: dict[str, object] = {} self._fitter = Fitter(self.minimizer.type) self._fit_results = None - self._parameter_snapshots: dict[str, dict[str, dict]] = {} self._display = AnalysisDisplay(self) self._attach_category_parents() @@ -617,17 +616,21 @@ def _attach_category_parents(self) -> None: self._fit_parameter_correlations._parent = self self._software._parent = self - @staticmethod - def _supported_filters_for(category: object) -> dict[str, object]: + def _loaded_experiment_count(self) -> int: + """Return the number of experiments loaded in the project.""" + return len(self.project.experiments.names) + + def _supported_filters_for(self, category: object) -> dict[str, object]: """ Return owner context filters for a switchable category. - Analysis-level switchables (minimizer, fitting_mode) have no - owner-supplied context today; their supported-types lookups read - only the registered factory entries. The empty dict is therefore - intentional and applies uniformly across both categories. + The ``fitting_mode`` selector is applicability-driven: it needs + the loaded-experiment count to decide which modes apply. Other + analysis-level switchables (minimizer) have no owner-supplied + context and receive an empty dict. """ - del category + if category is self._fitting_mode: + return {'experiment_count': self._loaded_experiment_count()} return {} @staticmethod @@ -1490,6 +1493,36 @@ def _resolved_resume_request( return True, self._resolved_resume_extra_steps(extra_steps) + def _require_mode_applicable(self, mode: FitModeEnum) -> None: + """ + Reject a fit mode that does not apply to the loaded project. + + Applicability is by loaded-experiment count and uses the same + predicate as ``fitting_mode.show_supported()``. Whether each + scheduled experiment has measured data is a separate readiness + check enforced later by the fitter. + + Parameters + ---------- + mode : FitModeEnum + The selected fitting mode. + + Raises + ------ + ValueError + If the mode does not apply to the current experiment count. + """ + count = self._loaded_experiment_count() + valid = [tag for tag, _ in FittingMode._supported_types({'experiment_count': count})] + if mode.value in valid: + return + valid_text = ', '.join(repr(tag) for tag in valid) if valid else 'none' + msg = ( + f'Fit mode {mode.value!r} does not apply to a project with ' + f'{count} loaded experiment(s). Applicable mode(s): {valid_text}.' + ) + raise ValueError(msg) + def _validate_fit_request( self, *, @@ -1531,6 +1564,7 @@ def _validate_fit_request( raise ValueError(msg) if resume and extra_steps is not None: self._validate_resume_extra_steps(extra_steps) + self._require_mode_applicable(mode) @staticmethod def _validate_resume_extra_steps(extra_steps: object) -> int: @@ -2692,6 +2726,99 @@ def _resolve_sequential_data_dir(self) -> Path: return project_path / data_dir + def _resolve_sequential_source(self) -> str: + """ + Resolve the sequential data directory, applying ``copy_data``. + + Raises a clear error when no data directory is configured. When + ``copy_data`` is set, the matched files are copied into the + project's ``data/sequential/`` folder and the persisted + ``data_dir`` is rewritten to that project-relative destination + so the saved project stays self-contained. The copy is + idempotent: when the resolved source is already the copy + destination (the post-reload case), the copy is skipped. + + Returns + ------- + str + The directory to read sequential data files from. + + Raises + ------ + ValueError + If ``data_dir`` is unset, does not resolve to a directory + with matching files, or (with ``copy_data``) the project is + unsaved or ``data_dir`` overlaps the managed archive folder. + """ + from easydiffraction.io.ascii import extract_data_paths_from_dir # noqa: PLC0415 + + if not str(self._sequential_fit.data_dir.value).strip(): + msg = ( + 'Sequential fitting needs a data folder. Set ' + 'analysis.sequential_fit.data_dir to the directory containing ' + 'your sequential data files (and analysis.sequential_fit.file_pattern ' + 'to match them).' + ) + raise ValueError(msg) + + source = self._resolve_sequential_data_dir() + + file_pattern = self._sequential_fit.file_pattern.value + try: + matched = extract_data_paths_from_dir(source, file_pattern=file_pattern) + except (FileNotFoundError, ValueError) as error: + msg = ( + 'No sequential data files found. Check that ' + f'analysis.sequential_fit.data_dir ({source}) exists and that ' + f'analysis.sequential_fit.file_pattern ({file_pattern!r}) matches ' + 'your data files.' + ) + raise ValueError(msg) from error + + if not self._sequential_fit.copy_data.value: + return str(source) + + project_path = self.project.metadata.path + if project_path is None: + msg = ( + 'Sequential fitting with copy_data requires a saved project; call save_as() first.' + ) + raise ValueError(msg) + + destination = project_path / 'data' / 'sequential' + source_resolved = source.resolve() + destination_resolved = destination.resolve() + if source_resolved == destination_resolved: + return str(source) + + # Refusing overlapping source/destination keeps the refresh + # rmtree below from ever deleting the user's source files: a + # data_dir nested in (or containing) the managed archive would + # otherwise be wiped before it is copied. + nested = source_resolved.is_relative_to(destination_resolved) + contains = destination_resolved.is_relative_to(source_resolved) + if nested or contains: + msg = ( + 'With copy_data=True, analysis.sequential_fit.data_dir must be a ' + f'folder separate from the managed archive at {destination}; got a ' + f'nested or containing path ({source}).' + ) + raise ValueError(msg) + + import shutil # noqa: PLC0415 + + # Refresh the archive to hold exactly the current matched set. + # The self-copy case returned above and overlapping paths were + # rejected, so this never deletes the source while reading it. + if destination.exists(): + shutil.rmtree(destination) + destination.mkdir(parents=True, exist_ok=True) + for path in matched: + shutil.copy2(path, destination / Path(path).name) + + self._sequential_fit.data_dir = str(Path('data') / 'sequential') + return str(destination) + def _prepare_fit_run( self, *, @@ -2795,7 +2922,7 @@ def _run_sequential(self) -> None: try: _fit_seq( analysis=self, - data_dir=str(self._resolve_sequential_data_dir()), + data_dir=self._resolve_sequential_source(), max_workers=max_workers, chunk_size=chunk_size, file_pattern=self._sequential_fit.file_pattern.value, @@ -2881,7 +3008,7 @@ def _fit_single( fit_options: FitterFitOptions, ) -> None: """ - Run single-mode fitting for each experiment independently. + Run single-mode fitting for the one loaded experiment. Parameters ---------- @@ -2890,28 +3017,20 @@ def _fit_single( structures : object Project structures collection. experiments : object - Project experiments collection. + Project experiments collection (exactly one experiment in + single mode). fit_options : FitterFitOptions Execution options controlling limits, randomness and resume. - - Raises - ------ - ValueError - If resume is requested for more than one single-fit - experiment. """ mode = FitModeEnum.SINGLE expt_names = experiments.names - if fit_options.resume and len(expt_names) != 1: - msg = 'Resume is supported for one single-fit experiment at a time.' - raise ValueError(msg) short_display_handle = self._fit_single_print_header(verb, expt_names, mode) short_rows: list[list[str]] = [] self.fitter.minimizer.tracker._set_shared_display_handle(short_display_handle) try: - self._fit_single_experiments( + self._fit_single_experiment( verb, structures, experiments, @@ -2926,7 +3045,7 @@ def _fit_single( with suppress(Exception): short_display_handle.close() - def _fit_single_experiments( + def _fit_single_experiment( self, verb: VerbosityEnum, structures: object, @@ -2935,40 +3054,38 @@ def _fit_single_experiments( fit_options: FitterFitOptions, short_state: tuple[list[list[str]], object], ) -> None: - """Run the per-experiment loop for single-fit mode.""" + """Fit the single loaded experiment in single-fit mode.""" short_rows, short_display_handle = short_state - for expt_name in experiments.names: - if verb is VerbosityEnum.FULL: - console.print( - f"📋 Using experiment 🔬 '{expt_name}' for " - f"'{FitModeEnum.SINGLE.value}' fitting" - ) - - experiment = experiments[expt_name] - self.fitter.fit( - structures, - [experiment], - analysis=self, - verbosity=verb, - options=FitterFitOptions( - use_physical_limits=fit_options.use_physical_limits, - random_seed=self._resolved_fit_random_seed(fit_options.random_seed), - resume=fit_options.resume, - extra_steps=fit_options.extra_steps, - ), + expt_name = next(iter(experiments.names)) + if verb is VerbosityEnum.FULL: + console.print( + f"📋 Using experiment 🔬 '{expt_name}' for '{FitModeEnum.SINGLE.value}' fitting" ) - results = self.fitter.results - self._snapshot_params(expt_name, results) - self.fit_results = results + experiment = experiments[expt_name] + self.fitter.fit( + structures, + [experiment], + analysis=self, + verbosity=verb, + options=FitterFitOptions( + use_physical_limits=fit_options.use_physical_limits, + random_seed=self._resolved_fit_random_seed(fit_options.random_seed), + resume=fit_options.resume, + extra_steps=fit_options.extra_steps, + ), + ) + + results = self.fitter.results + self.fit_results = results - if verb is VerbosityEnum.SHORT: - self._fit_single_update_short_table( - short_rows, - expt_name, - results, - short_display_handle, - ) + if verb is VerbosityEnum.SHORT: + self._fit_single_update_short_table( + short_rows, + expt_name, + results, + short_display_handle, + ) @staticmethod def _fit_single_print_header( @@ -3006,26 +3123,6 @@ def _fit_single_print_header( console.print('📈 Goodness-of-fit (reduced χ²) per experiment:') return make_display_handle() - def _snapshot_params(self, expt_name: str, results: object) -> None: - """ - Snapshot parameter values for a single experiment. - - Parameters - ---------- - expt_name : str - Experiment name key for the snapshot dict. - results : object - Fit results with ``.parameters`` list. - """ - snapshot: dict[str, dict] = {} - for param in results.parameters: - snapshot[param.unique_name] = { - 'value': param.value, - 'uncertainty': param.uncertainty, - 'units': _parameter_display_units(param), - } - self._parameter_snapshots[expt_name] = snapshot - def _fit_single_update_short_table( self, short_rows: list[list[str]], diff --git a/src/easydiffraction/analysis/categories/fitting_mode/default.py b/src/easydiffraction/analysis/categories/fitting_mode/default.py index 5fa5bc1c7..3ae5004cd 100644 --- a/src/easydiffraction/analysis/categories/fitting_mode/default.py +++ b/src/easydiffraction/analysis/categories/fitting_mode/default.py @@ -15,6 +15,9 @@ from easydiffraction.core.variable import StringDescriptor from easydiffraction.io.cif.handler import TagSpec +# Minimum loaded experiments for which joint fitting applies. +_MINIMUM_JOINT_EXPERIMENTS = 2 + @FittingModeFactory.register class FittingMode(CategoryItem, SwitchableCategoryBase): @@ -52,6 +55,13 @@ def __init__(self) -> None: def _supported_types( filters: dict[str, object], ) -> list[tuple[str, str]]: - """Return supported fitting modes.""" - del filters - return [(mode.value, mode.description()) for mode in FitModeEnum] + """Return fitting modes applicable to the loaded project.""" + count = filters.get('experiment_count') + if count is None: + return [(mode.value, mode.description()) for mode in FitModeEnum] + applicable = [] + for mode in FitModeEnum: + ok = count >= _MINIMUM_JOINT_EXPERIMENTS if mode is FitModeEnum.JOINT else count == 1 + if ok: + applicable.append((mode.value, mode.description())) + return applicable diff --git a/src/easydiffraction/analysis/categories/sequential_fit/default.py b/src/easydiffraction/analysis/categories/sequential_fit/default.py index d74d03ef1..6f1e868b3 100644 --- a/src/easydiffraction/analysis/categories/sequential_fit/default.py +++ b/src/easydiffraction/analysis/categories/sequential_fit/default.py @@ -84,6 +84,15 @@ def __init__(self) -> None: cif_names=['_easydiffraction_sequential_fit.reverse'], ), ) + self._copy_data = BoolDescriptor( + name='copy_data', + description='Whether to copy matched data files into the project.', + value_spec=AttributeSpec(default=False), + tags=TagSpec( + edi_names=['_sequential_fit.copy_data'], + cif_names=['_easydiffraction_sequential_fit.copy_data'], + ), + ) @property def data_dir(self) -> StringDescriptor: @@ -135,6 +144,16 @@ def reverse(self, value: bool) -> None: """Set whether to process sequential-fit files in reverse.""" self._reverse.value = value + @property + def copy_data(self) -> BoolDescriptor: + """Whether to copy matched data files into the project.""" + return self._copy_data + + @copy_data.setter + def copy_data(self, value: bool) -> None: + """Set whether to copy matched data files into the project.""" + self._copy_data.value = value + @property def as_cif(self) -> str: """Return CIF representation of this sequential-fit category.""" diff --git a/src/easydiffraction/display/plotting.py b/src/easydiffraction/display/plotting.py index bc642e1db..cdea4b35e 100644 --- a/src/easydiffraction/display/plotting.py +++ b/src/easydiffraction/display/plotting.py @@ -919,10 +919,10 @@ def plot_param_series( """ Plot a parameter's value across sequential fit results. - When a ``results.csv`` file exists in the project's - ``analysis/`` directory, data is read from CSV. Otherwise, - falls back to in-memory parameter snapshots (produced by - ``fit()`` in single mode). + Data is read from the ``results.csv`` written by a sequential + fit in the project's ``analysis/`` directory. When no such file + exists, there is no parameter series to plot and a warning is + emitted. Parameters ---------- @@ -940,28 +940,25 @@ def plot_param_series( log.warning('Series plot target does not expose a CSV column name.') return - # Try CSV first (produced by fit_sequential or future fit) csv_path = None if self._project.metadata.path is not None: candidate = pathlib.Path(self._project.metadata.path) / 'analysis' / 'results.csv' if candidate.is_file(): csv_path = str(candidate) - if csv_path is not None: - self._plot_param_series_from_csv( - csv_path=csv_path, - column_names=column_names, - param_descriptor=param, - versus_path=versus, - ) - else: - # Fallback: in-memory snapshots from fit() single mode - self.plot_param_series_from_snapshots( - column_names[0], - versus, - self._project.experiments, - self._project.analysis._parameter_snapshots, + if csv_path is None: + log.warning( + 'No sequential results found to plot; run a sequential fit ' + 'to produce analysis/results.csv first.' ) + return + + self._plot_param_series_from_csv( + csv_path=csv_path, + column_names=column_names, + param_descriptor=param, + versus_path=versus, + ) @staticmethod def _series_column_names(param: object) -> list[str]: @@ -1032,9 +1029,8 @@ def plot_all_param_series( """ Plot every fitted parameter across sequential fit results. - Iterates the fitted parameters recorded in ``results.csv`` (or, - when absent, in the in-memory parameter snapshots) and emits one - ``plot_param_series`` plot per parameter. + Iterates the fitted parameters recorded in ``results.csv`` and + emits one ``plot_param_series`` plot per parameter. Parameters ---------- @@ -1060,7 +1056,7 @@ def plot_all_param_series( def _collect_fitted_parameter_unique_names(self) -> list[str]: """ - Return fitted parameter unique names from CSV or snapshots. + Return fitted parameter unique names from ``results.csv``. """ from easydiffraction.analysis.sequential import _META_COLUMNS # noqa: PLC0415 @@ -1072,21 +1068,17 @@ def _collect_fitted_parameter_unique_names(self) -> list[str]: if candidate.is_file(): csv_path = str(candidate) - if csv_path is not None: - df = pd.read_csv(csv_path) - return [ - column - for column in df.columns - if column not in meta - and not column.startswith('diffrn.') - and not column.endswith('.uncertainty') - ] - - snapshots = self._project.analysis._parameter_snapshots - if not snapshots: + if csv_path is None: return [] - first_snapshot = next(iter(snapshots.values())) - return list(first_snapshot.keys()) + + df = pd.read_csv(csv_path) + return [ + column + for column in df.columns + if column not in meta + and not column.startswith('diffrn.') + and not column.endswith('.uncertainty') + ] def _fitted_param_descriptors_by_unique_name(self) -> dict[str, object]: """Return descriptor map keyed by ``unique_name``.""" @@ -6197,80 +6189,6 @@ def _plot_param_series_from_csv( height=self.height, ) - def plot_param_series_from_snapshots( - self, - unique_name: str, - versus_path: str | None, - experiments: object, - parameter_snapshots: dict[str, dict[str, dict]], - ) -> None: - """ - Plot a parameter's value from in-memory snapshots. - - This is a backward-compatibility method used when no CSV file is - available (e.g. after ``fit()`` in single mode, before PR 13 - adds CSV output to the existing fit loop). - - Parameters - ---------- - unique_name : str - Unique name of the parameter to plot. - versus_path : str | None - Persisted diffrn path for the x-axis. - experiments : object - Experiments collection for accessing diffrn conditions. - parameter_snapshots : dict[str, dict[str, dict]] - Per-experiment parameter value snapshots. - """ - x = [] - y = [] - sy = [] - axes_labels = [] - title = '' - - for idx, expt_name in enumerate(parameter_snapshots, start=1): - experiment = experiments[expt_name] - diffrn = experiment.diffrn - - x_axis_param = self._resolve_diffrn_descriptor( - diffrn, - self._versus_field_name(versus_path), - ) - - if x_axis_param is not None and x_axis_param.value is not None: - value = x_axis_param.value - else: - value = idx - x.append(value) - - param_data = parameter_snapshots[expt_name][unique_name] - y.append(param_data['value']) - sy.append(param_data['uncertainty']) - - if x_axis_param is not None: - axes_labels = [ - self._versus_axis_label(versus_path, x_axis_param), - f'Parameter value ({param_data["units"]})', - ] - else: - axes_labels = [ - 'Experiment No.', - f'Parameter value ({param_data["units"]})', - ] - - title = f"Parameter '{unique_name}' across fit results" - - x, y, sy = self._order_series_by_x(x, y, sy) - - self._backend.plot_scatter( - x=x, - y=y, - sy=sy, - axes_labels=axes_labels, - title=title, - height=self.height, - ) - @staticmethod def _resolve_diffrn_descriptor( diffrn: object, diff --git a/tests/integration/fitting/test_analysis_and_fit_category_support.py b/tests/integration/fitting/test_analysis_and_fit_category_support.py index 1e4ce55a6..ea2f92fbc 100644 --- a/tests/integration/fitting/test_analysis_and_fit_category_support.py +++ b/tests/integration/fitting/test_analysis_and_fit_category_support.py @@ -93,13 +93,15 @@ def test_fitting_mode_show_supported_for_single_and_multiple_experiments(capsys) out_single = capsys.readouterr().out assert 'Fitting Mode types' in out_single assert 'single' in out_single - assert 'joint' in out_single + assert 'sequential' in out_single + assert 'joint' not in out_single multi = Analysis(project=_make_project_with_names(['e1', 'e2'])) multi.fitting_mode.show_supported() out_multi = capsys.readouterr().out assert 'joint' in out_multi - assert 'sequential' in out_multi + assert 'single' not in out_multi + assert 'sequential' not in out_multi def test_minimizer_show_supported_prints(capsys): @@ -218,8 +220,7 @@ class FakeConstraint: assert captured['columns_data'][0] == ['constraint_1', 'x = y + 1'] -def test_discover_helpers_and_snapshot_params(): - from easydiffraction.analysis.analysis import Analysis +def test_discover_helpers(): from easydiffraction.analysis.analysis import _discover_method_rows from easydiffraction.analysis.analysis import _discover_property_rows @@ -252,18 +253,3 @@ def _private(self): assert next(row for row in property_rows if row[0] == 'beta')[1] == '✓' assert 'do_thing()' in [row[0] for row in method_rows] assert '_private()' not in [row[0] for row in method_rows] - - analysis = Analysis(project=_make_project()) - - class FakeParam: - unique_name = 'p1' - value = 1.23 - uncertainty = 0.01 - units = 'A' - - class FakeResults: - parameters = [FakeParam()] - - analysis._snapshot_params('expt1', FakeResults()) - assert analysis._parameter_snapshots['expt1']['p1']['value'] == 1.23 - assert analysis._parameter_snapshots['expt1']['p1']['uncertainty'] == 0.01 diff --git a/tests/integration/fitting/test_powder-diffraction_joint-fit.py b/tests/integration/fitting/test_powder-diffraction_joint-fit.py index 084bae298..bcc9f5121 100644 --- a/tests/integration/fitting/test_powder-diffraction_joint-fit.py +++ b/tests/integration/fitting/test_powder-diffraction_joint-fit.py @@ -262,21 +262,10 @@ def test_joint_fit_neutron_xray_pd_cwl_pbso4() -> None: expt1.linked_structures['pbso4'].scale.free = True expt2.linked_structures['pbso4'].scale.free = True - # ------------ 1st fitting ------------ + # ------------ 1st fitting (joint, default weights) ------------ - # Perform fit - project.analysis.fit() - - # Compare fit quality - assert_almost_equal( - project.analysis.fit_results.reduced_chi_square, - desired=26.05, - decimal=1, - ) - - # ------------ 2nd fitting ------------ - - # Perform fit + # Perform fit. With two loaded experiments only joint fitting + # applies (single is restricted to one experiment). project.analysis.fitting_mode.type = 'joint' project.analysis.fit() @@ -287,7 +276,7 @@ def test_joint_fit_neutron_xray_pd_cwl_pbso4() -> None: decimal=1, ) - # ------------ 3rd fitting ------------ + # ------------ 2nd fitting (joint, explicit equal weights) ------------ # Perform fit project.analysis.joint_fit['xrd'].weight = 0.5 # Default @@ -301,7 +290,7 @@ def test_joint_fit_neutron_xray_pd_cwl_pbso4() -> None: decimal=1, ) - # ------------ 4th fitting ------------ + # ------------ 3rd fitting (joint, asymmetric weights) ------------ # Perform fit project.analysis.joint_fit['xrd'].weight = 0.3 diff --git a/tests/integration/fitting/test_sequential.py b/tests/integration/fitting/test_sequential.py index 08648c2aa..7ee9681f0 100644 --- a/tests/integration/fitting/test_sequential.py +++ b/tests/integration/fitting/test_sequential.py @@ -280,27 +280,39 @@ def test_fit_sequential_requires_saved_project(tmp_path) -> None: project.structures.add(model) project.experiments.add(expt) + # A matching data file so source resolution passes and the + # unsaved-project precondition is reached. + (tmp_path / 'scan.xye').write_text('1 2 3\n') + with pytest.raises(ValueError, match='must be saved'): _run_sequential_fit(project, str(tmp_path)) def test_fit_sequential_requires_at_least_one_structure(tmp_path) -> None: """fit_sequential raises if no structures exist.""" + data_path = download_data('meas-lbco-hrpt', destination=TEMP_DIR) + expt = ExperimentFactory.from_data_path(name='e', data_path=data_path) project = Project(name='no_struct') + project.experiments.add(expt) project.save_as(str(tmp_path / 'proj')) + # One loaded experiment (so sequential applies) and a matching data + # file (so source resolution passes) make the missing structure the + # first precondition to fail. + (tmp_path / 'scan.xye').write_text('1 2 3\n') + with pytest.raises(ValueError, match='at least 1 structure'): _run_sequential_fit(project, str(tmp_path)) def test_fit_sequential_requires_one_experiment(tmp_path) -> None: - """fit_sequential raises if no experiments exist.""" + """Sequential mode does not apply with zero loaded experiments.""" model = StructureFactory.from_scratch(name='s') project = Project(name='no_expt') project.structures.add(model) project.save_as(str(tmp_path / 'proj')) - with pytest.raises(ValueError, match='exactly 1 experiment'): + with pytest.raises(ValueError, match="Fit mode 'sequential' does not apply"): _run_sequential_fit(project, str(tmp_path)) diff --git a/tests/unit/easydiffraction/analysis/categories/fitting_mode/test_default.py b/tests/unit/easydiffraction/analysis/categories/fitting_mode/test_default.py index f6461774b..950fb9d39 100644 --- a/tests/unit/easydiffraction/analysis/categories/fitting_mode/test_default.py +++ b/tests/unit/easydiffraction/analysis/categories/fitting_mode/test_default.py @@ -50,6 +50,19 @@ def test_fitting_mode_supported_types_include_all_modes(): assert tags == ['single', 'joint', 'sequential'] +def test_fitting_mode_supported_types_by_experiment_count(): + from easydiffraction.analysis.categories.fitting_mode.default import FittingMode + + def tags_for(count): + supported = FittingMode._supported_types({'experiment_count': count}) + return [tag for tag, _description in supported] + + assert tags_for(0) == [] + assert tags_for(1) == ['single', 'sequential'] + assert tags_for(2) == ['joint'] + assert tags_for(5) == ['joint'] + + def test_fitting_mode_from_cif_restores_value_without_parent(): from easydiffraction.analysis.categories.fitting_mode.default import FittingMode diff --git a/tests/unit/easydiffraction/analysis/categories/sequential_fit/test_default.py b/tests/unit/easydiffraction/analysis/categories/sequential_fit/test_default.py index 79c1c1fea..3847b941e 100644 --- a/tests/unit/easydiffraction/analysis/categories/sequential_fit/test_default.py +++ b/tests/unit/easydiffraction/analysis/categories/sequential_fit/test_default.py @@ -13,6 +13,7 @@ def test_sequential_fit_defaults(): assert sequential_fit.max_workers.value == '1' assert sequential_fit.chunk_size.value == '.' assert sequential_fit.reverse.value is False + assert sequential_fit.copy_data.value is False assert sequential_fit._identity.category_code == 'sequential_fit' @@ -25,6 +26,7 @@ def test_sequential_fit_as_cif_serializes_all_fields(): sequential_fit.max_workers = 'auto' sequential_fit.chunk_size = '4' sequential_fit.reverse = True + sequential_fit.copy_data = True as_cif = sequential_fit.as_cif @@ -33,3 +35,4 @@ def test_sequential_fit_as_cif_serializes_all_fields(): assert '_sequential_fit.max_workers auto' in as_cif assert '_sequential_fit.chunk_size 4' in as_cif assert '_sequential_fit.reverse true' in as_cif.lower() + assert '_sequential_fit.copy_data true' in as_cif.lower() diff --git a/tests/unit/easydiffraction/analysis/test_analysis.py b/tests/unit/easydiffraction/analysis/test_analysis.py index 91c406d5c..5aeb8f6a9 100644 --- a/tests/unit/easydiffraction/analysis/test_analysis.py +++ b/tests/unit/easydiffraction/analysis/test_analysis.py @@ -471,7 +471,7 @@ def __exit__(self, exc_type, exc_value, traceback) -> None: del traceback events.append(exc_type) - analysis = Analysis(project=_make_project_with_names([])) + analysis = Analysis(project=_make_project_with_names(['e1'])) analysis.project.verbosity = SimpleNamespace(fit=SimpleNamespace(value='full')) analysis.fit_results = object() analysis.fitter.results = object() @@ -837,7 +837,6 @@ def fake_update_short_table( monkeypatch.setattr( 'easydiffraction.analysis.analysis.make_display_handle', fake_make_display_handle ) - monkeypatch.setattr(analysis, '_snapshot_params', lambda expt_name, results: None) monkeypatch.setattr(analysis.fitter, 'fit', fake_fit) monkeypatch.setattr(Analysis, '_fit_single_update_short_table', fake_update_short_table) @@ -898,7 +897,7 @@ def fake_fit_sequential( analysis, '_update_categories', lambda: calls.append(('update_categories', None)) ) monkeypatch.setattr( - analysis, '_resolve_sequential_data_dir', lambda: tmp_path / 'resolved-scans' + analysis, '_resolve_sequential_source', lambda: str(tmp_path / 'resolved-scans') ) analysis.fit_results = object() analysis.fitter.results = object() diff --git a/tests/unit/easydiffraction/analysis/test_analysis_coverage.py b/tests/unit/easydiffraction/analysis/test_analysis_coverage.py index 0ef1e5313..6cef62060 100644 --- a/tests/unit/easydiffraction/analysis/test_analysis_coverage.py +++ b/tests/unit/easydiffraction/analysis/test_analysis_coverage.py @@ -219,32 +219,6 @@ def test_setter_changes_minimizer(self, capsys): # ------------------------------------------------------------------ -class TestSnapshotParams: - def test_snapshot_stores_values(self): - from easydiffraction.analysis.analysis import Analysis - - a = Analysis(project=_make_project()) - - class FakeParam: - unique_name = 'p1' - value = 1.23 - uncertainty = 0.01 - units = 'angstroms' - - def resolve_display_units(self, context): - assert context == 'gui' - return 'Å' - - class FakeResults: - parameters = [FakeParam()] - - a._snapshot_params('expt1', FakeResults()) - assert 'expt1' in a._parameter_snapshots - assert a._parameter_snapshots['expt1']['p1']['value'] == 1.23 - assert a._parameter_snapshots['expt1']['p1']['uncertainty'] == 0.01 - assert a._parameter_snapshots['expt1']['p1']['units'] == 'Å' - - class TestBayesianProjection: def test_single_parameter_projection_persists_distribution_and_predictive_caches(self): from easydiffraction.analysis.analysis import Analysis @@ -2735,3 +2709,152 @@ def test_update_categories_applies_enabled_constraints(self, monkeypatch): a._update_categories() assert applied == [True] + + +# ------------------------------------------------------------------ +# Fit-mode applicability (loaded-experiment count) +# ------------------------------------------------------------------ + + +def _analysis_with_experiment_names(names, *, path=None): + from easydiffraction.analysis.analysis import Analysis + + project = SimpleNamespace( + experiments=SimpleNamespace(names=list(names)), + structures=object(), + metadata=SimpleNamespace(path=path), + _varname='proj', + ) + return Analysis(project=project) + + +class TestModeApplicability: + def test_supported_filters_passes_experiment_count(self): + a = _analysis_with_experiment_names(['e1', 'e2']) + assert a._supported_filters_for(a.fitting_mode) == {'experiment_count': 2} + + def test_supported_filters_empty_for_other_categories(self): + a = _analysis_with_experiment_names(['e1']) + assert a._supported_filters_for(a.minimizer) == {} + + def test_single_requires_exactly_one_experiment(self): + import pytest + + from easydiffraction.analysis.enums import FitModeEnum + + a = _analysis_with_experiment_names(['e1', 'e2']) + with pytest.raises(ValueError, match="Fit mode 'single' does not apply"): + a._require_mode_applicable(FitModeEnum.SINGLE) + + def test_joint_requires_two_or_more_experiments(self): + import pytest + + from easydiffraction.analysis.enums import FitModeEnum + + a = _analysis_with_experiment_names(['e1']) + with pytest.raises(ValueError, match="Fit mode 'joint' does not apply"): + a._require_mode_applicable(FitModeEnum.JOINT) + + def test_applicable_modes_do_not_raise(self): + from easydiffraction.analysis.enums import FitModeEnum + + one = _analysis_with_experiment_names(['e1']) + one._require_mode_applicable(FitModeEnum.SINGLE) + one._require_mode_applicable(FitModeEnum.SEQUENTIAL) + + two = _analysis_with_experiment_names(['e1', 'e2']) + two._require_mode_applicable(FitModeEnum.JOINT) + + +# ------------------------------------------------------------------ +# Sequential source resolution + copy_data +# ------------------------------------------------------------------ + + +class TestResolveSequentialSource: + def _saved_analysis(self, tmp_path): + return _analysis_with_experiment_names(['e1'], path=tmp_path) + + def test_unset_data_dir_raises(self, tmp_path): + import pytest + + a = self._saved_analysis(tmp_path) + with pytest.raises(ValueError, match='needs a data folder'): + a._resolve_sequential_source() + + def test_no_matching_files_raises(self, tmp_path): + import pytest + + a = self._saved_analysis(tmp_path) + src = tmp_path / 'scans' + src.mkdir() + a.sequential_fit.data_dir.value = str(src) + a.sequential_fit.file_pattern.value = '*.xye' + with pytest.raises(ValueError, match='No sequential data files found'): + a._resolve_sequential_source() + + def test_without_copy_returns_source(self, tmp_path): + a = self._saved_analysis(tmp_path) + src = tmp_path / 'scans' + src.mkdir() + (src / 'a.xye').write_text('1\n') + a.sequential_fit.data_dir.value = str(src) + a.sequential_fit.file_pattern.value = '*.xye' + assert a._resolve_sequential_source() == str(src) + + def test_copy_data_archives_and_rewrites_data_dir(self, tmp_path): + a = self._saved_analysis(tmp_path) + src = tmp_path / 'ext' + src.mkdir() + (src / 'a.xye').write_text('1\n') + (src / 'b.xye').write_text('2\n') + a.sequential_fit.data_dir.value = str(src) + a.sequential_fit.file_pattern.value = '*.xye' + a.sequential_fit.copy_data.value = True + + dest = tmp_path / 'data' / 'sequential' + assert a._resolve_sequential_source() == str(dest) + assert sorted(p.name for p in dest.iterdir()) == ['a.xye', 'b.xye'] + assert a.sequential_fit.data_dir.value == 'data/sequential' + + def test_copy_data_refresh_drops_stale_files(self, tmp_path): + a = self._saved_analysis(tmp_path) + dest = tmp_path / 'data' / 'sequential' + dest.mkdir(parents=True) + (dest / 'stale.xye').write_text('old\n') + src = tmp_path / 'ext' + src.mkdir() + (src / 'a.xye').write_text('1\n') + a.sequential_fit.data_dir.value = str(src) + a.sequential_fit.file_pattern.value = '*.xye' + a.sequential_fit.copy_data.value = True + + a._resolve_sequential_source() + assert sorted(p.name for p in dest.iterdir()) == ['a.xye'] + + def test_copy_data_idempotent_self_copy(self, tmp_path): + a = self._saved_analysis(tmp_path) + dest = tmp_path / 'data' / 'sequential' + dest.mkdir(parents=True) + (dest / 'a.xye').write_text('1\n') + a.sequential_fit.data_dir.value = str(dest) + a.sequential_fit.file_pattern.value = '*.xye' + a.sequential_fit.copy_data.value = True + + assert a._resolve_sequential_source() == str(dest) + assert (dest / 'a.xye').exists() + + def test_copy_data_rejects_nested_source(self, tmp_path): + import pytest + + a = self._saved_analysis(tmp_path) + nested = tmp_path / 'data' / 'sequential' / 'raw' + nested.mkdir(parents=True) + (nested / 'a.xye').write_text('1\n') + a.sequential_fit.data_dir.value = str(nested) + a.sequential_fit.file_pattern.value = '*.xye' + a.sequential_fit.copy_data.value = True + + with pytest.raises(ValueError, match='separate from the managed archive'): + a._resolve_sequential_source() + assert (nested / 'a.xye').exists() diff --git a/tests/unit/easydiffraction/display/test_plotting_coverage.py b/tests/unit/easydiffraction/display/test_plotting_coverage.py index cec75b42c..8bdb8244e 100644 --- a/tests/unit/easydiffraction/display/test_plotting_coverage.py +++ b/tests/unit/easydiffraction/display/test_plotting_coverage.py @@ -418,73 +418,6 @@ class ParamDesc: assert 'Experiment No.' in plot_calls[0]['axes_labels'] -# ------------------------------------------------------------------ -# Plotter.plot_param_series_from_snapshots (public method) -# ------------------------------------------------------------------ - - -class TestPlotParamSeriesFromSnapshots: - def test_snapshot_plot(self): - from easydiffraction.display.plotting import Plotter - - plot_calls = [] - - class FakeBackend: - def plot_scatter(self, **kwargs): - plot_calls.append(kwargs) - - class Diffrn: - ambient_temperature = type( - 'T', (), {'value': 300, 'description': 'Temp', 'name': 'ambient_temperature'} - )() - - class Expt: - diffrn = Diffrn() - - p = Plotter() - p._backend = FakeBackend() - experiments = {'expt1': Expt()} - snapshots = { - 'expt1': { - 'param_a': {'value': 1.23, 'uncertainty': 0.01, 'units': 'Å'}, - }, - } - p.plot_param_series_from_snapshots( - 'param_a', 'diffrn.ambient_temperature', experiments, snapshots - ) - assert len(plot_calls) == 1 - assert plot_calls[0]['y'] == [1.23] - assert plot_calls[0]['x'] == [300] - - def test_snapshot_plot_no_versus(self): - from easydiffraction.display.plotting import Plotter - - plot_calls = [] - - class FakeBackend: - def plot_scatter(self, **kwargs): - plot_calls.append(kwargs) - - class Diffrn: - pass - - class Expt: - diffrn = Diffrn() - - p = Plotter() - p._backend = FakeBackend() - experiments = {'expt1': Expt()} - snapshots = { - 'expt1': { - 'param_a': {'value': 2.0, 'uncertainty': 0.05, 'units': 'Å'}, - }, - } - p.plot_param_series_from_snapshots('param_a', None, experiments, snapshots) - assert len(plot_calls) == 1 - assert plot_calls[0]['x'] == [1] # fallback to index - assert 'Experiment No.' in plot_calls[0]['axes_labels'] - - # ------------------------------------------------------------------ # Plotter public methods (plot_meas, plot_calc, plot_meas_vs_calc) # ------------------------------------------------------------------ @@ -680,35 +613,25 @@ def test_warns_when_no_column_name(self, monkeypatch, capsys): out = capsys.readouterr().out assert 'does not expose a CSV column name' in out - def test_falls_back_to_snapshots_without_csv(self, monkeypatch): + def test_warns_when_no_csv(self, monkeypatch, capsys): from types import SimpleNamespace from easydiffraction.display.plotting import Plotter + from easydiffraction.utils.logging import Logger - captured = {} - - class FakeAnalysis: - _parameter_snapshots = {'expt1': {'param_a': {}}} + monkeypatch.setattr(Logger, '_reaction', Logger.Reaction.WARN, raising=True) class FakeProject: metadata = SimpleNamespace(path=None) experiments = {'expt1': object()} - analysis = FakeAnalysis() + analysis = SimpleNamespace() p = Plotter() p._set_project(FakeProject()) - - def fake_snapshots(unique_name, versus, experiments, snapshots): - captured['unique_name'] = unique_name - captured['versus'] = versus - captured['snapshots'] = snapshots - - p.plot_param_series_from_snapshots = fake_snapshots p.plot_param_series(SimpleNamespace(unique_name='param_a', name='param_a'), versus='v') - assert captured['unique_name'] == 'param_a' - assert captured['versus'] == 'v' - assert captured['snapshots'] == {'expt1': {'param_a': {}}} + out = capsys.readouterr().out + assert 'No sequential results found' in out def test_uses_csv_when_results_file_present(self, monkeypatch, tmp_path): from types import SimpleNamespace @@ -822,19 +745,6 @@ class FakeProject: p._set_project(FakeProject()) assert p._collect_fitted_parameter_unique_names() == ['param_a'] - def test_from_snapshots_when_no_csv(self): - from types import SimpleNamespace - - from easydiffraction.display.plotting import Plotter - - class FakeProject: - metadata = SimpleNamespace(path=None) - analysis = SimpleNamespace(_parameter_snapshots={'e1': {'param_a': {}, 'param_b': {}}}) - - p = Plotter() - p._set_project(FakeProject()) - assert p._collect_fitted_parameter_unique_names() == ['param_a', 'param_b'] - def test_empty_when_no_csv_and_no_snapshots(self): from types import SimpleNamespace