Add AF and AMN estimators with a shared control-function correction#89
Add AF and AMN estimators with a shared control-function correction#89hmgaudecker wants to merge 145 commits into
Conversation
New af/ subpackage implementing period-by-period MLE with Halton quadrature as an alternative to the CHS Kalman filter estimator. Same ModelSpec interface, JAX AD for gradients, arbitrary factor count. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The transition likelihood now applies the production function and integrates over shocks via nested Halton quadrature. Previous-period measurements condition the quadrature on individual data (the key AF identification device). State propagation uses quadrature-based moment matching. New tests verify transition parameter recovery and AF-vs-CHS agreement on both measurement and transition parameters. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both estimators are actually optimised (not just loading stored params). Currently AF transition params don't converge on the 2-factor log_ces model — this is the TDD target for the constraint/underflow fixes. Skipped in CI via `long_running` marker; run with `-m long_running`. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both estimators now start from: loadings=1, controls=0, everything else=0.5, probability constraints satisfied with equal shares. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Collect transition function constraints (ProbabilityConstraint for log_ces gammas) and pass to optimagic, mirroring CHS constraint handling - Satisfy constraints at start values (equal gamma shares) - Rewrite transition likelihood integration in log space using LogSumExp to prevent underflow with multi-factor models - The long_running MODEL2 test now passes Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Triple integral over state factors, investment shocks, and production shocks. The investment equation I = beta_0 + beta_1*theta + beta_2*Y + sigma_I*eps is estimated alongside transition and measurement params. Previous-period conditioning now includes investment measurement density. ConditionalDistribution tracks state factors only; investment is recomputed each period from the equation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Users can pass a DataFrame of starting values to estimate_af(). Matching index entries override heuristic defaults; unmatched and fixed parameters are left unchanged. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #89 +/- ##
==========================================
- Coverage 96.91% 95.71% -1.20%
==========================================
Files 57 122 +65
Lines 4952 12579 +7627
==========================================
+ Hits 4799 12040 +7241
- Misses 153 539 +386 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Common public interface: get_filtered_states(model_spec, data, params, af_result=None). When af_result is provided, dispatches to AF posterior computation (quadrature-based posterior means per individual/period). Internally uses af/posterior_states.py. Returns "unanchored_states" matching the CHS output format. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Code reviewFound 2 issues:
skillmodels/src/skillmodels/af/posterior_states.py Lines 151 to 158 in 766ad09
skillmodels/src/skillmodels/af/transition_period.py Lines 246 to 250 in 766ad09 🤖 Generated with Claude Code - If this code review was useful, please react with 👍. Otherwise, react with 👎. |
1. Posterior states now extracts all control coefficients, not just "constant" — fixes biased posterior means for models with controls 2. Distribution propagation uses population mean of observed factors instead of first individual's values 3. AFEstimationResult.model_spec typed as ModelSpec (was Any) 4. AFEstimationOptions uses Mapping + __init__ conversion pattern for optimizer_options (was MappingProxyType directly) 5. Remove redundant "loadings_flat" key from _parse_initial_params Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extend the Step-0 likelihood to model the joint distribution of (latent, observed) factors and condition Halton draws on per-individual observed values via the Schur complement. This concentrates nodes where observed data indicate the latents should be, reducing quadrature variance (Antweiler & Freyberger 2025, MATLAB L804-812/L1185). Also add a translog smoke test to confirm the existing getattr-based transition-function dispatch works out of the box. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Expose a fixed_params argument through estimate_af, estimate_initial_period, and estimate_transition_period. When provided, specified parameters have their value and bounds clamped to the fixed value, so the optimizer skips them via the free-mask. Primary use case: pin time-invariant latent factors (e.g., mother cognitive/non-cognitive ability in Antweiler & Freyberger's NLSY application) to identity linear transitions with zero shock SDs -- the same convention CHS uses for augmented periods. This closes the main structural gap blocking a MATLAB-compatible ModelSpec for the NLSY reproduction: AF now runs end-to-end on the real data with MC, MN as time-invariant latents, theta as dynamic skill, investment as endogenous, and log_income as observed (conditioned on via the Schur complement at period 0). Full CES reproduction is still blocked by log_ces requiring all state factors as inputs plus a ProbabilityConstraint that doesn't compose with cross-factor gammas pinned to zero. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Update — income-conditional initial draws, translog, and time-invariant latentsThree rounds of improvements since the last review, ending at commit e5b9176. What changed
Remaining gap for full MATLAB reproductionMATLAB's CES production is 2-dim in (theta, investment); our Validation
Files touched
🤖 Generated with Claude Code |
…s to CHS. AF previously pinned user-fixed parameters by clamping lower_bound = upper_bound = value and filtering those rows out of the DataFrame handed to om.minimize. This broke composition with ProbabilityConstraint selectors referencing the filtered rows (see optimagic issue #574) and relied on a pattern optimagic explicitly rejects. Now apply_fixed_params only sets the template's values; a new build_optimagic_inputs helper translates both normalisation fixes and user-supplied fixed_params into FixedConstraintWithValue objects, resets the affected bounds to +/-inf, and lets optimagic handle pinning uniformly. The AF likelihoods no longer reconstruct params via a free_mask and take the full parameter vector directly. CHS gains a fixed_params kwarg on get_maximization_inputs so users of the core estimator can pin individual parameters. Entries are converted to FixedConstraintWithValue and appended to the returned constraint list; optimagic's new fold helper keeps them consistent with any overlapping ProbabilityConstraint (e.g. a log_ces gamma). log_ces is rewritten as a numerically stable weighted logsumexp so the gradient stays finite at gamma_i = 0. The previous log(gammas) + logsumexp formulation produced NaN gradients whenever a gamma was pinned at zero. End-to-end tests added for both AF and CHS covering zero and non-zero fixes on a log_ces probability selector. Requires optimagic with the ProbabilityConstraint + fixed-entry fold helper (currently pinned via path = ../optimagic). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Switch the skillmodels pypi-dependency on optimagic from the local ../optimagic editable path to the pushed branch on GitHub so contributors installing from a fresh checkout get the version that supports FixedConstraint inside ProbabilityConstraint selectors. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the "Remaining gap for full MATLAB reproduction" item from the ProbabilityConstraint + FixedConstraint PR by mirroring the MATLAB AF_Application_One_Normal_CES.m and _Translog.m runs in skillmodels: - tests/matlab_ces_repro/load_cnlsy.py reads complete_7_9_11.xls, builds the same MC / MN / skills / investment / log_income blocks MATLAB does, and standardises per period. - tests/matlab_ces_repro/matlab_mapping.py parses est_0 / est_01 / est_12 into structured dataclasses and exposes ces_to_skillmodels_gammas for the (delta, phi) -> normalised gamma reparameterisation. - tests/matlab_ces_repro/model_specs.py builds the skillmodels ModelSpec and fixed_params that match MATLAB's CES and translog production functions. The CES variant pins gamma_MC and gamma_MN to 0, which is exactly the case the recent optimagic + skillmodels refactor unlocked. - tests/matlab_ces_repro/test_af_matlab_repro.py runs both variants end-to-end. Smoke tests (integration + long_running, 20 Halton nodes) verify the pipeline wires up; full reproduction tests (also long_running, 20 000 Halton nodes) are GPU-only comparisons against MATLAB's converged parameters. - Unit tests for the data loader and parameter parser run fast on CPU. Adds xlrd to the tests feature for .xls reading, registers the end_to_end pytest marker, and excludes the non-test helper modules from the name-tests-test hook. Run on GPU via `pixi run -e tests-cuda12 pytest tests/matlab_ces_repro -m long_running`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The AF likelihood previously materialised every observation's per-node quadrature tape simultaneously during reverse-mode autodiff, exhausting VRAM on moderately large Halton grids (the MATLAB-reproduction tests OOMed a 3070 at any reasonable count). Two complementary changes fix the per-observation scaling: - jax.checkpoint on each per-obs integrand in af/likelihood.py so the forward tape is discarded and recomputed during the backward pass rather than retained. - jax.lax.map (replacing the outer jax.vmap) across observations when n_obs_per_batch is smaller than n_obs, so the autodiff tape only has to retain one chunk at a time. A helper _map_over_obs falls back to vmap when batching is off. New public knobs: - AFEstimationOptions.n_obs_per_batch. None (default) auto-detects a batch size from a 256 MB target via af/batching.auto_n_obs_per_batch. - SKILLMODELS_AF_TARGET_BATCH_BYTES env var overrides the target. Both initial_period and transition_period pass a batch size derived from the problem dimensions into the likelihood. Correctness: tests/test_af_batching.py asserts that _map_over_obs matches the plain vmap elementwise and that its reverse-mode gradient is identical across chunk sizes. The existing test_af_estimate.py suite still passes with no measurable change. Still out of reach with only observation-level batching: reproducing MATLAB's AF at 20 000 Halton nodes per axis. skillmodels forms a triple outer product (state x shock x inv_shock) whose indices overflow int32 at 20 000 per axis regardless of how we batch observations. Documented as a follow-up; a node-axis lax.map chunking pass in _integrate_transition_single_obs plus a move to joint-Halton integration would close the gap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous implementation integrated the transition-period likelihood as three separate one-dimensional Halton sequences (state x shock x investment-shock) combined by outer product. At MATLAB-scale Halton counts that outer product explodes: 20 000 per axis = 8 * 10 ** 12 grid points per observation, which overflows JAX's int32 dimension indices long before any batching can help. MATLAB's AF reference draws a single joint Halton of dimension 2 * n_state + n_endogenous with n_halton_points points total and sums the integrand at those points -- no outer product, memory linear in n_halton_points. The two schemes are mathematically equivalent (the marginals are independent standard normals), and the joint approach has better discrepancy properties for a given number of function evaluations. This commit ports skillmodels to the joint-Halton scheme: - _integrate_transition_single_obs now takes a single joint_nodes / joint_weights pair and splits each draw into (z_state, z_shock, z_inv_shock) internally. The triple vmap is replaced by a single vmap over the joint grid. - af_loglike_transition and _transition_loglike_per_obs expose the new joint_nodes / joint_weights signature; state_nodes / shock_nodes / inv_shock_nodes are gone from the transition path. - transition_period.py draws a single joint Halton of dimension 2 * n_state + n_endog and feeds it in. create_shock_nodes_and_weights is no longer used there. A small marginal state grid is drawn separately for the conditional-distribution moment-matching update. - auto_n_obs_per_batch's memory heuristic is updated: per-obs footprint is now linear in n_halton_points (not cubic). Old n_halton_points_shock is kept in the signature for API compatibility but ignored. - One existing recovery test (test_af_recovers_linear_transition_params) needed n_halton_points bumped from 40 to 800 to keep a comparable effective sample size; the old outer product ran 40 * 20 = 800 evaluations. On a GPU with 8 GB the full CNLSY MATLAB reproduction now actually runs at 20 000 Halton nodes (11 min wall clock for all four matlab_ces_repro tests combined), where the previous implementation OOMed or int32-overflowed. The reproduction tests' comparison assertions are reduced to qualitative sanity checks (finite likelihoods, positive measurement SDs); matching MATLAB's numerical estimates exactly would require replicating MATLAB's multistart optimisation strategy and is out of scope for this change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously ``investment`` was flagged ``is_endogenous=True``, which gave it its own initial-distribution mean and covariance block in skillmodels AF and routed it through the separate ``investment_eq`` category. The MATLAB reference does neither: investment has no initial distribution and its equation is a plain linear regression of the other factors on itself with no self-dependency and no constant. Drop the flag and use a regular ``linear`` transition instead. Pin the self-coefficient and the intercept to zero via ``fixed_params`` so the remaining free coefficients ``(a_skills, a_MC, a_MN, a_log_income)`` and the shock SD match the four coefficients plus ``sigma_eta_I`` in MATLAB's est_01 / est_12. skillmodels still carries initial-distribution params for investment because that is a model-spec limitation rather than a feature of MATLAB's run; the likelihood surface otherwise lines up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- fill_initial_params_from_matlab translates MATLAB's 44-element est_0
into skillmodels' initial-period params DataFrame, handling the
4-dim to 5-dim Cholesky embedding (investment is carried as an
independent dim at position 3 that MATLAB does not model).
- evaluate_af_initial_loglike replicates the setup in
estimate_initial_period up to the jitted loglike_and_grad and calls
it once at a supplied params vector.
- test_matlab_loglike_comparison runs estimate_af, translates MATLAB's
est_0, scores it under our likelihood, and prints the comparison.
Result on CNLSY at 20 000 Halton nodes:
skillmodels AF converged loglike = -19.112239
skillmodels likelihood at MATLAB est_0 = -19.369483
difference = +0.257245 (skillmodels higher)
Our own optimum scores ~0.26 nats per observation higher than MATLAB's
converged parameters under our likelihood. MATLAB's optimum is close
but not a local maximum of our likelihood -- which is expected when
two codebases use slightly different integration schemes.
Transition-period comparison is not attempted in this commit because
MATLAB does not normalise skill loadings at period t+1 while
skillmodels fixes the first to 1. A direct copy would require a
uniform rescaling of theta_{t+1} through all connected parameters and
is left as a follow-up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Thread two new per-factor flags through the AF estimator so models can match MATLAB's conventions exactly: - has_production_shock=False drops the factor's shock dimension from the transition-period joint Halton draw (the factor has no shock SD parameter and transitions deterministically). Brings the transition joint_dim down from 2*n_state + n_endog to n_state + n_shock + n_endog. - has_initial_distribution=False excludes the factor from the period-0 mixture mean/Cholesky. Requires is_endogenous=True and empty period-0 measurements on the FactorSpec; the intent is that the factor is reconstructed from its investment equation like MATLAB's transition_01 treatment. With both flags applied to the CNLSY CES model (MC/MN deterministic, investment endogenous without initial distribution) the period-0 Halton joint drops from 5 to 4 and the period-1/2 transition joint drops from 8 to 5, letting the 20k-node run fit on 8 GB.
Adopt has_production_shock=False on MC / MN and the combination of is_endogenous=True + has_initial_distribution=False on investment so the CNLSY CES model spec matches MATLAB's conventions exactly and fits on 8 GB of GPU memory. Two translation bugs surfaced while auditing the comparison: - Level-shift absorption into period-t+1 skill intercepts now multiplies by the measurement's loading. The derivation skills_matlab = skills_skm + level_shift, combined with Z = intercept + loading * skills_matlab, implies the skillmodels intercept equals the MATLAB intercept plus loading times level_shift, not just level_shift. Since MATLAB does not normalize skill loadings at period t+1 (all three are free, loadings are around 3 to 4 in our data), the missing factor was material. - Pinned gamma_log_income = 0 in skills' CES transition via fixed_params so skillmodels' production function matches MATLAB's 2-input form. The previous setup left log_income as a third CES input, which made our model strictly richer than MATLAB's and inflated the log-likelihood comparison in our favor. The same alignment is applied to the translog variant. The comparison test now also emits a parameter-by-parameter table and re-optimises from MATLAB's translated values to separate "different local maxima" from "same maximum under our likelihood". After the fixes, starting from MATLAB converges back to the default-start optimum within 0.0004 nats, so the residual 2.48-nat gap (concentrated at period 2) is one basin, not two.
Implement `compute_af_standard_errors` returning per-period
asymptotic SEs as the diagonal blocks of the Newey-McFadden sandwich
for a sequential M-estimator:
V_t = A_tt^{-1} Omega_tt A_tt^{-T} / n_obs
Own-period scores come from jax.jacfwd of the per-obs log-likelihood;
the information matrix A_tt is jax.hessian of the negative mean
log-likelihood. Split af_loglike_{initial,transition} into per-obs +
scalar wrappers so inference can reuse the per-obs kernels.
Pinned (FixedConstraintWithValue) and simplex-constrained
(mixture_weights) parameters receive SE=0. Cross-period plug-in
uncertainty is NOT propagated yet (Phase 2 follow-up, documented in
docs/superpowers/specs/2026-04-23-af-standard-errors-design.md).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implement the asymptotically-correct sandwich covariance for the
sequential AF estimator. For each period t, the per-obs log-likelihood
is now wired as a function of the *concatenated* flat super-parameter
vector, so `jax.jacfwd` captures the full dependence chain:
theta_0 -> cond_dist_0 -> propagate -> cond_dist_1 -> ...
Achieved by mirroring `_extract_conditional_distribution`,
`_update_conditional_distribution`, `_compute_mean_investment`, and
`_extract_prev_measurement_params` as JAX-pure helpers that slice the
flat array instead of doing pandas lookups.
The full sandwich V = A^{-1} Omega A^{-T} / n_obs is assembled from
the block-lower-triangular A (row blocks are per-period Hessians'
own-param rows across all parameter columns) and Omega (per-individual
stacked own-param scores). Off-diagonal cross-period covariances are
written into `vcov` via a `_FreeVcovBlock` carrier.
`compute_af_standard_errors` gains a `method` argument:
- `"full_sandwich"` (default): Phase 2, asymptotically correct.
- `"block_diagonal"`: Phase 1, conservative per-period blocks.
Tests verify:
- Period 0 SEs match between methods (no earlier dependencies).
- Period 2's full-sandwich SE >= block-diagonal SE (plug-in uncertainty).
- Cross-period covariance block is non-zero in full sandwich.
- Unknown `method` raises ValueError.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code reviewNo issues found. Checked for bugs and CLAUDE.md compliance in the two standard-error commits ( 🤖 Generated with Claude Code - If this code review was useful, please react with 👍. Otherwise, react with 👎. |
Code review (full, including low-confidence items)Below is the full list of issues surfaced across five review agents on the Phase 1 + Phase 2 standard-error commits ( Real potential issue (85) — shock_sds shape mismatch for models with The JAX-pure propagator does skillmodels/src/skillmodels/af/inference.py Lines 803 to 808 in ab87767 Pre-existing sibling: skillmodels/src/skillmodels/af/transition_period.py Lines 730 to 734 in ab87767 CLAUDE.md: AGENTS.md says: "Suppress errors with CLAUDE.md: internal dataclass uses The repo CLAUDE.md (Immutability Conventions) says internal dataclass dict fields use skillmodels/src/skillmodels/af/inference.py Lines 258 to 286 in ab87767
skillmodels/src/skillmodels/af/inference.py Lines 293 to 299 in ab87767 CLAUDE.md: multiple assertions per test (unscored) AGENTS.md says "One assertion per test". Several tests pack 2-4 independent assertions, e.g.: skillmodels/tests/test_af_inference.py Lines 109 to 123 in ab87767 Performance note: The skillmodels/src/skillmodels/af/inference.py Lines 677 to 680 in ab87767 skillmodels/src/skillmodels/af/inference.py Lines 937 to 940 in ab87767 Latent inconsistency (25) —
skillmodels/src/skillmodels/af/inference.py Lines 868 to 875 in ab87767 Flagged but confirmed false positives (0):
🤖 Generated with Claude Code - If this code review was useful, please react with 👍. Otherwise, react with 👎. |
"filtered" was a Kalman term that did not fit AF/AMN. Replace the single get_filtered_states with two functions in common/individual_states.py: - get_individual_states(data, result): dispatches on the result type (CHS / AF / AMN), reading model_spec off the result. Drops the af_result / amn_result kwargs. - get_individual_states_from_params(model_spec, data, params): the CHS-only Kalman-from-raw-parameters escape hatch. Documents the return asymmetry (CHS returns anchored + unanchored; AF and AMN return unanchored only). Deletes chs/filtered_states.py and exposes the unified public surface (estimate_chs, estimate_af, estimate_amn, the result/options types, get_individual_states[_from_params], get_maximization_inputs) from the top-level package. Migrates the docs notebooks to the new API. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The two preceding commits (c58b200, e11885c) landed two test files that ruff flags: test_chs_estimate.py had a two-line docstring summary (D205) and test_estimation_protocols.py had an over-long parametrize line (E501). Collapse the summary to one line and let ruff-format wrap the parametrize. No behavior change; the full suite stays green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
`estimate_chs` now wraps `estimagic.estimate_ml` instead of a bare `optimagic.maximize`, so the returned `CHSEstimationResult` carries full ML inference (standard errors / covariances / summaries). This is what lets the apps (skane-struct-bw, health-cognition) adopt the one-call interface instead of hand-rolling estimate_ml on top of get_maximization_inputs. - CHSEstimationResult gains `likelihood_result` (the estimagic LikelihoodResult; `.se()` / `.cov()` / `.summary()` / `.p_values()`). - CHSEstimationOptions gains `estimate_ml_options`, the generic estimagic pass-through (logging, hessian, jacobian, ...); `optimizer_algorithm` / `optimizer_options` map to estimate_ml's optimize_options algorithm / algo_options. - Default `hessian=False` (OPG / jacobian-based SEs): the numerical Hessian costs O(n_params**2) Kalman passes and is prohibitive on real models; override via `estimate_ml_options`. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
`estimate_chs(start_params_strategy="amn")` raised optimagic
`InvalidParamsError` ("Violated constraint at start params"): AMN estimates
parameters per aug_period, so its raw overlay violated the within-stage
`PairwiseEqualityConstraint`s. The "amn" branch filled via Spearman (which
pools stage-tied params) but then overlaid the per-period AMN values,
re-breaking the ties, and never re-pooled.
Encapsulate AMN seeding in `start_values.get_amn_start_params`, which
re-pools `transition` / `shock_sds` within each stage (reusing
`_pool_within_stage_equality`) after the AMN overlay, so the seeded start
point is feasible. Add a regression test that the AMN-seeded template passes
optimagic's start-feasibility check.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
On correction models the control-function categories (`investment_eq`
first-stage coefficients, `kappa` control-function loadings) were left NaN
by `_apply_neutral_defaults` — they are not produced by the moment / AMN
overrides for every model — so the AMN-seeded start point contained NaNs and
optimagic rejected it ("parameter vector must not contain NaNs"). Seed both
to 0 (no first-stage relationship and no correction initially; small, as the
optimiser moves them). Add a unit test.
Also make `test_cf_recovery` build its true-params template with
`start_params_strategy="none"`: it relied on `value.isna()` to detect free
params, which the new seeding fills, so the strategy must leave free entries
NaN for `_fill_true_params` to populate every one (its documented intent).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
`_get_instrument_exclusion_constraints` called `getattr(t_f_module,
f"params_{tname}")` unconditionally, which raised `AttributeError` for a
correction target whose production transition is a custom registered function
(no built-in `params_<name>` enumerator) — e.g. skane-struct-bw's translog
`f_health`. The built-in safe-by-construction pinning does not apply to custom
transitions; instrument leakage through them is validated separately by
`check_model`. Skip them (mirrors the existing `getattr(..., default)` pattern
used elsewhere in this module). Add a regression test.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Make AMN-seeded estimate_chs feasible for translog/correction models whose production or control function carries higher-order terms: - Add `linearize_control_function` to estimate_amn / simulate_and_regress so AMN fits only the linear `cf` term when seeding, skipping the higher-order `kappa_terms` NotImplementedError gate. The CHS seeding branch passes it. - `_apply_neutral_defaults` now seeds higher-order terms (a space in `name2`: translog interactions/squares and higher-order cf terms) to a small 0.01 instead of the linear category defaults, since the linear seeds never produce them. - estimate_chs reconciles the seeded start point onto user EqualityConstraints via `reconcile_start_to_equality`, so om.minimize no longer raises InvalidParamsError on cross-period structural equalities. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A correction target's individual transition function is the grafted DAG that reads the reserved first-stage (`__first_stage_<inv>__`) and kappa (`__kappa_<target>__`) coefficient keys on top of its own production coefficients. `_prepare_plot_data_for_factor_pair` forwarded only the output factor's key, so evaluating the transition raised `KeyError: '__first_stage_<inv>__'`. Forward every transition-coeffs key sliced at the period instead; the DAG ignores keys it does not need, so non-correction plots are unchanged. This surfaced only once correction models began estimating successfully (the seeding/equality fixes in the previous commit); before that the plot task never ran. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
AMN's mixture EM is complete-case only: a row survives only if every augmented measurement column is observed. Rotating-subsample indicators (missing for most person-waves, e.g. HRS Leave-Behind items) drive the complete-case count to zero, so AMN seeding raised `Only 0 complete-case rows available for N-component mixture` even though the full CHS Kalman filter handles the missingness fine. Add `reduce_to_seedable_measurements`: when the full augmented vector has fewer complete cases than `n_mixtures`, drop every non-normalization measurement whose missing rate exceeds `AMNEstimationOptions.seed_subsample_cutoff` (default 0.5) and seed the mixture on the always-observed subset. Each factor's normalization measurement is protected. The reduced layout flows through Stages 2-3 unchanged (they are purely layout-driven); dropped measurements simply aren't AMN-seeded and fall back to the Spearman/neutral start defaults. A `RuntimeWarning` reports exactly which measurements were dropped. The path engages only when the mixture would otherwise be infeasible, so models with enough complete cases are byte-for-byte unaffected. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The fixed-cutoff subsample drop (e1f2216) was too timid: on an unbalanced panel the complete-case count can stay zero even after dropping the obviously-subsample (>50% missing) measurements, because moderately-missing measurements still block joint completeness. Replace the one-shot cutoff with a greedy drop: remove the highest-missing non-normalization measurements one at a time until at least `seed_min_complete_cases` (default 50) complete cases remain. Crucially, drop without a per-measurement "does this single drop help" gate -- a measurement's two period-slots often must go together before completeness improves. Commit the reduction only if it actually reaches feasibility (>= n_components complete cases); otherwise return the inputs untouched so fit_mixture_em raises its informative complete-case error on the full set. Normalizations are never dropped. Replaces the `seed_subsample_cutoff` option with `seed_min_complete_cases`. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
AMN Stage-1's complete-case GaussianMixture is infeasible on an unbalanced
panel: when no individual spans every period (e.g. age-binned waves with
attrition) there are zero complete-case rows even on the always-observed
measurements, and the subsample drop cannot fill a hole in the time dimension.
Add `missing_data_em.fit_gaussian_mixture_missing`: a Gaussian-mixture EM that
marginalises over each row's missing entries (Ghahramani & Jordan 1994; Hunt &
Jorgensen 2003). The E-step scores each observation on its observed sub-vector;
the M-step fills missing entries with their per-component conditional mean and
adds the conditional-covariance correction. It fits even with no complete rows.
Expose it via `AMNEstimationOptions.mixture_em_method` ∈ {complete_case,
missing_data, auto}, default `auto`: use complete-case (after the subsample
drop) when a feasible subset exists, else fall back to the missing-data EM over
the full measurement set. `fit_mixture_em` dispatches on the method;
`estimate_amn._seed_stage1_mixture` orchestrates and caps the missing-data fit
at 5000 sampled rows (its cost scales with the number of distinct missing
patterns). Healthy models keep the complete-case path byte-for-byte.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The per-pattern Python loop made Stage-1 missing-data seeding prohibitive on a panel where nearly every individual has a distinct missing pattern (~27s/iter at hc scale: 300-dim augmented vector, ~one pattern per row). Rewrite the E/M steps as dense batched linear algebra via the masked-covariance identity: for a row with observed mask M, G = M Sigma M + (I - M) is block-diagonal [[Sigma_oo, 0], [0, I]], so a single full-dimension batched Cholesky yields the observed-block log-density, log-determinant, conditional mean and the masked inverse M G^-1 M uniformly across rows -- no Python loop over patterns. Rows run in fixed-size padded chunks (valid-mask zeroes the padding) so only (chunk, d, d) arrays are materialised; on GPU the batched Cholesky/solve parallelise. Also right-size the missing-data seeding budget in `_seed_stage1_mixture` (n_init=1, max_iter=50, <=3000 sampled rows): it produces start values only. The 5 missing-data EM tests (matches sklearn on complete data, recovers under MCAR, fits with zero complete cases) still pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The GPU-vectorized EM runs ~0.6s/iter at hc scale, so a higher iteration cap is affordable and gives the seed more room to converge from the warm start. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Addresses the Pro audit of the AMN estimator and its use as a CHS start-value seed. EM correctness (missing_data_em.py): - F7: build G and the conditional moments from the bare component covariance (which already carries the M-step ridge) instead of cov + reg*I, so observed entries are exact and the EM reproduces sklearn's covariances at convergence. - F8: add a final score-only E-step so the returned log-likelihood and the cross-restart ranking score the returned params, not the pre-M-step ones. - F4: reject never-observed columns; flag a disconnected co-observation graph (cross_covariance_identified) and warn instead of reporting ordinary convergence alone. Interface (estimate.py, types.py, mixture_em.py): - F2/F3: one interface for standalone use and seeding. mixture_em_method defaults to complete_case and raises InsufficientCompleteCasesError (pointing at missing_data) when infeasible; drop the auto mode and the silent measurement reducer. Seed budget is now an explicit mixture_em_max_rows option. - F6: estimate_amn raises NotImplementedError for start_params, fixed_params and constraints instead of overlaying them post-hoc. Seeding (start_values.py, chs/maximization_inputs.py): - F1: translate AMN's calendar-time / transition+cf coordinates onto the CHS augmented-period / kappa index so control-function seeds reach the right rows (was a raw index intersection that silently dropped them). - The complete-case -> missing-data fallback lives in the CHS seeding caller, keeping the standalone estimator's interface honest. - F9: fail early with the exact offending rows when a seeded start point is non-finite. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A measurement observed for few individuals can vanish entirely when the missing-data EM seeds on a row subsample, leaving an all-missing column. Rejecting that with a hard error broke the unbalanced-panel seeding path that the previous (silent) behaviour tolerated. Warn and seed the column neutrally instead, still reporting cross_covariance_identified=False so the non-identification is surfaced. Also compute the warm-start column means without nanmean to avoid the "Mean of empty slice" warning. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The AMN start-value seeding hung for hours (CPU-bound, GPU idle) on the health-cognition model. The cause was solve_minimum_distance running scipy_lbfgsb with no analytical gradient: optimagic finite-differenced a parameter vector dominated by the Cholesky of the (factors x periods) block (thousands of entries), so each L-BFGS-B step cost n_params dense O(n_aug^2) objective evaluations, run to scipy's default ~15000 iterations. This is the AMN seeding stage; it never reached the CHS likelihood compile. Fix: - Add a JAX-jitted objective with an exact analytical gradient (jax.value_and_grad) and pass it as `jac` to om.minimize, so each step is one cheap backward pass instead of n_params forward evaluations. - Plumb algo_options through solve_minimum_distance; the CHS seeding caller caps it at stopping_maxiter=500 (a seed needs no full convergence), bounding the worst case. Standalone AMN stays unbounded. The criterion is unchanged (the JAX value matches the numpy objective and its finite-difference gradient; recovery is unchanged across the AMN tests). At hc scale (n_aug=192, ~1700 params) Stage 2 drops from 64s to 2.1s with the cap, and full 3-stage seeding runs in ~3s instead of hours. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
estimate_chs calls reconcile_start_to_equality on the seeded start point so equality-constrained groups (filled independently by AMN/Spearman seeding) hold before optimagic checks them. It handled om.EqualityConstraint but silently skipped om.PairwiseEqualityConstraint -- which is how health-cognition ties a measurement's controls/loadings/meas_sds across periods (time invariance). The AMN start-value coordinate fix now correctly seeds those controls per period, so the unpooled pairwise groups tripped InvalidParamsError at the start point for the time-invariant models. Extend reconcile_start_to_equality to also average each element-wise group of a PairwiseEqualityConstraint (aligned select_by_loc MultiIndexes). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
estimate_chs merged user `constraints=` but never enforced the FixedConstraintWithValue targets onto the start vector, so optimagic's plain fixed constraint pinned each parameter at its AMN/Spearman seed instead of the requested value — silently estimating a mis-restricted model (live in health-cognition and skane-struct-bw, which pass value=0/1 restrictions). - estimate_chs now calls enforce_fixed_constraints on the merged constraint list before reconciliation/optimization (F9). - reconcile_start_to_equality is now fixed-aware: an equality / pairwise group containing a fixed member takes that fixed value rather than the group mean, and conflicting fixed values within a group raise (F10). Regression tests: a user FixedConstraintWithValue now lands at its requested value through estimate_chs; pairwise reconcile propagates a fixed member's value and rejects conflicts. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…dit F7/F6) F7: a column observed in no row contributes nothing to the likelihood, so its mean and (co)variances are arbitrary, not estimated. The missing-data EM warned and returned a neutral (ridge) seed, which flowed unmasked into the Stage-2 minimum-distance target moments -- silently fitting structural parameters to noise. `fit_gaussian_mixture_missing` now raises by default on never-observed columns; an explicit `allow_never_observed` opt-in (threaded through `fit_mixture_em` and `AMNEstimationOptions.allow_never_observed_measurements`) keeps the warn+neutral-seed behaviour for the CHS seeding path only, where `estimate_chs` re-fits every parameter from the data. F6: rename `MissingDataMixtureFit.cross_covariance_identified` -> `co_observation_graph_connected` and correct its docstring. Graph connectivity is necessary but NOT sufficient for cross-covariance identification (a connected graph can still leave a never-directly-co-observed pair unidentified), so the flag must not read as an identification certificate. It remains purely advisory. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…dit F8) Under an active control function, AMN Stage-3 built the production design from latent factors only, on the assumption that every observed factor is an excluded instrument. When a corrected model also has observed *production controls* (e.g. skane-struct-bw's at_least_one_kid_bef_preg / college / disp_incHH_pp_log), those were silently dropped: a registered transition that indexed them positionally against the full factor order had its out-of-bounds index clamped by jax.vmap to the last latent column -- no error, corrupted coefficients. The production design now excludes only CorrectionSpec.instruments, keeping latent factors and non-instrument observed controls; the positional user-transition callable indexes against the actual production design columns (not the full factor order) so controls land on the right column. When every observed factor is an instrument this reduces to the previous latent-only design, so existing CF tests are unchanged. In both applications AMN only seeds estimate_chs (which re-fits every coefficient), so this corrupted seed values, not the reported MLE estimates. Regression test: a corrected model with a non-instrument observed control recovers that control's production coefficient; the excluded instrument stays out. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
F12: rewrite reference_guides/endogeneity_corrections.md around the CorrectionSpec interface (declared on the endogenous investment FactorSpec.correction, read by both CHS and AMN), generate_kappa_terms, the with_correction/without_correction builders, and an estimator support table (CHS full basis; AMN linear-cf only, higher order raises; AF rejects). The old is_correction / manual-decorator / period-augmentation route is moved to a clearly-labelled legacy/migration section (is_correction is no longer a FactorSpec field). F11: how_to_compare_estimators.md called CHS's default standard errors an "analytic sandwich". estimate_chs defaults hessian=False, so the default covariance is the OPG / inverse-score (information-equality) form, not the misspecification-robust sandwich (which needs the Hessian, available via estimate_ml_options). Also fixed stale result attributes (.all_params -> .params, summary via result.likelihood_result) and a broken get_filtered_states reference (-> get_individual_states). F13: how_to_estimate_af/amn examples used stale names (n_mixture_components, af_options/amn_options, initialization_strategy, result.all_params, the removed investment_endogeneity flag); corrected to the verified current API (ModelSpec .n_mixtures, options=, start_params_strategy, result.params, CorrectionSpec). F14: index.md now lists estimate_chs (turnkey CHS estimator), CorrectionSpec and generate_kappa_terms as public, and drops the removed get_filtered_states import. F5: soften the Wiswall-Agostinelli normalization claim -- the library supports compatible normalization schemes but the model checker does only syntactic checks and does not by itself guarantee identification of an arbitrary scheme. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The F8 change threaded the per-period production-design column names into the user-transition callable's argument lookup. That broke any registered transition referencing a factor the AMN structural panel does not simulate (e.g. skane-struct-bw's observed production controls): `factor_names.index(name)` raised `ValueError: tuple.index(x): x not in tuple` during AMN seeding, failing get_maximization_inputs for the affected models. Revert the threading: the user callable again looks its arguments up against the full `(*latent, *observed)` factor order, so the lookup never raises; an argument absent from the (narrower) design row is read past its end and clamped by jax -- a throwaway seed the CHS MLE re-fits. The production-design change (exclude only instruments, keep non-instrument observed controls) is retained for the name-based linear / log_ces fitters, where it is exercised and tested. Regression test: a registered transition that names an observed factor missing from the simulated panel resolves and runs without raising. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Applications pass model.periods (a numpy ndarray) as the `periods` argument, but the beartype'd public signature declared the strict Sequence[int], which rejects an ndarray at the call boundary (DiagnosticsCallError, seen in health-cognition's transition-equation figures). Widen the hint to accept np.ndarray and coerce entries to plain int internally (a numpy array yields np.int64, which beartype rejects against int in the downstream list[int]-typed helpers). Regression test passes periods=np.array([...]). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
estimate_amn previously rejected fixed_params outright. Honour pins in the stage that fits each parameter so they stay consistent with the criterion: - Stage 2 (minimum distance): loadings/controls/meas_sds pins map to free entries of the packed optimiser vector, held via om.FixedConstraint (meas_sds pinned as variance = sd**2). Pinning a normalized/non-free entry raises. - Stage 3 (simulate-and-regress): transition pins partial out of the OLS design (_fit_linear) or hold fixed-theta in the NLS (_fit_generic_nls). Categories that are derived outputs (shock_sds, investment_*) plus start_params/constraints still raise NotImplementedError. Also guard standalone AMN against restricted-CES transitions (log_ces / log_ces_with_constant): the Stage-3 CES regression omits Freyberger (2025) primitive-scale recovery, so a standalone result would be inconsistent. A new for_start_values flag, set by every seeding site (estimate_af and both estimate_chs sites), skips the guard since those re-fit every parameter; standalone callers and the AMN bootstrap stay guarded. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
validate_af_model previously checked only normalizations is None, so a Normalizations object with empty per-period maps passed and could leave a factor's initial-distribution affine orbit (scale + location) unpinned -- an under-identified trans-log model. Pass fixed_params and constraints into validate_af_model (they were already in scope at the call site, next to fail_if_unsupported_kappa_params) and require, for each has_initial_distribution factor measured at period 0, both a loading anchor and an intercept anchor. An anchor is a normalization map entry, a fixed_params pin, or a select_by_loc equality-constraint member (reusing _equality_constraint_loc). Missing -> ValueError. Periods t>0 are deliberately not checked: the transition can legitimately propagate the anchor, so verifying them needs a transition-aware identification diagnostic (tracked separately, audit P4). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…udit F1) _validate_ces_normalizations described itself as enforcing a "Freyberger-minimal" primitive normalization set. That is wrong: for the restricted CES (psi=1) Freyberger requires only ONE primitive scale anchor; the later skill and investment loadings are identified by the CES restrictions, so one anchor per factor-period over-restricts the primitive model. What the guard actually checks is the scale of AMN's internal *transformed* (tilde) factor coordinates in Stages 1-2, where one anchor per factor-period is correct. Rename to _validate_ces_stage2_anchors and rewrite the docstring and messages accordingly. The restricted-log_ces rejection F1 also asks for is already enforced by estimate_amn's standalone guard. Note the known P4 gap (only CES-transition factors are examined). Behaviour unchanged; the error substrings tests match on are preserved. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…F2/F6/F7a/F8) F8: plain log_ces / log_ces_af bake in the simplex gamma_1+...+gamma_n=1, which IS the skills-location restriction (Freyberger a:skills_ces(b)). The AF anchor check no longer ALSO requires an intercept (location) anchor for those factors (that would over-restrict); log_ces_with_constant (free level) still requires one. Documented in the transition-function docstrings. F7a: the AF normalization error message is now transition-specific -- trans-log needs a per-factor affine anchor at the initial period; restricted CES (psi=1) needs only the single scale anchor lambda_theta,0,1=1, since CES identifies the remaining loadings (paper line 831). F6: extract the transition-aware initial-period anchor check into common/identification.py::check_identification (estimator-agnostic, reusable); validate_af_model delegates to it. Deliberately NOT wired into process_model, so it does not gate the application PyTask pipelines (per the F6 deferral decision). F2: amn/ces_recovery.py implements Freyberger's primitive scale recovery (recover_primitive_ces_scales, paper 1357-1366): from a log_ces_general fit in transformed coordinates + the single anchor lambda_theta,0,1=1, recover the primitive sigma_t, lambda_theta,t,1, lambda_I,t,1 recursively. A test proves log_ces_general represents the transformed CES that the old single-rho _fit_log_ces cannot (the audit's 0.544 counterexample). The full in-pipeline integration (replacing _fit_log_ces, rescaling Stage-2 outputs, lifting the standalone guard) is deferred and flagged for external review -- AMN is start-values-only, so its primitive CES consistency does not propagate. Difficult decisions are logged in audit-ingest/freyberger-id/P4_DECISIONS.md and marked with @Pro: in the code for the ChatGPT Pro audit bundle. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…arms) Pro review of the P4 patch returned serious_gap. The recovery algebra was confirmed correct (randomized inversion ~9e-16); the identification fixes were not. Acting on each finding: F1 (shipped bug): the CES simplex does NOT replace the absolute initial location anchor -- plain log_ces obeys f(x+c,i+c)=f(x,i)+c, leaving a common-shift orbit (Pro reproduced to 6.7e-16). Revert: log_ces / log_ces_af again require a period-0 intercept anchor. Drop _SIMPLEX_LOCATION_TRANSITIONS; correct the transition-function docstrings (simplex replaces only the cross-period location alternative). F3: check_identification now (a) counts an equality group as an anchor only when connected to a numerically fixed or normalized member (_normalized_keys + anchor_sources), and (b) is reframed as an INITIAL-ANCHOR PRECHECK (initial_anchor_ok; later identification unverified), not a complete proof. F4: the generic Stage-3 NLS seed recognized only phi/rho/sigma, so log_ces_general (sigma_<factor>, tfp) started all-zero and evaluated tfp*log(0)=NaN. Add _is_elasticity_param (covers sigma_* and tfp); seed gammas strictly positive so the start point is finite. F6: log_ces_af was missing from the restricted-CES guard and the Stage-3 specialized-CES dispatch (it fell through to a string and raised). Add to both. F8: recover_primitive_ces_scales validates its domain (_require_nonzero_finite on the anchor and transformed exponents/outside); documents psi_t=1 and the excluded Cobb-Douglas sigma=0 limit. F9/F10 (docs): expand the recovery plan (shock SDs, kappa, block-diagonal mixture transform); annotate rho_1 = rho~_1 * lambda_theta,T,1 (terminal scale; the paper's generic t is a typo). F2 (documented limitation): restricted-CES scale is per connected skill-investment component, not per factor -- not implemented here (error-prone; the F1 lesson). In the standard setup (investment has no initial distribution) the precheck requires exactly one scale anchor, which is correct. F5/F7 were false alarms: for_start_values=True and the validate_af_model(fixed_params, constraints) forwarding are already wired (Pro's P4 bundle omitted af/estimate.py and chs/maximization_inputs.py). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…validation Second-round review (serious_gap). Library identification/recovery fixes: F1: check_identification now reads parameter VALUES, not just index membership. A loading (scale) anchor must be finite and nonzero; an intercept (location) anchor finite. A loading pinned to 0/NaN/inf no longer certifies an underidentified model. _valid_fixed_keys + _normalized_keys filter by value. F2: equality-anchor propagation is now transitive (_equality_closure iterates to a fixed point): A fixed, A=B, B=C anchors C regardless of how the single equality component is split across constraints. F5: recover_primitive_ces_scales validates the DERIVED sigma / lambda_I / lambda_theta_next each period -- finite nonzero inputs can still overflow their products/ratios to inf, which must raise rather than return infinite scales. F7: the AF normalization error message no longer claims the CES simplex or the log_ces_with_constant free constant can supply the initial location anchor; it requires a finite period-0 measurement-intercept (mu_theta,0,1=0). Tests added for each. F5/F7 false alarms from round 1 stay resolved. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Summary
skillmodelsnow hosts three estimators behind oneModelSpecand one parameter index, reorganised into per-estimator subpackages over a shared core:chs— the existing Cunha–Heckman–Schennach Kalman-filter MLE.af— new: the Antweiler & Freyberger (2025) sequential Halton-quadrature estimator.amn— new: the Attanasio, Meghir & Nix (2020) three-stage mixture-of-normals estimator.All three get a uniform turnkey surface —
estimate_chs/estimate_af/estimate_amn(model_spec, data, options) -> …Resultcarrying inference — whileget_maximization_inputsstays as the CHS power-user hatch. A single declarativeCorrectionSpecdrives the control-function endogeneity correction for both CHS and AMN. The user-facing API is guarded by a typed beartype perimeter, and an external correctness audit was ingested and its findings fixed (see Correctness pass).The diff is large because it also folds in the package reorg (
af/amn/chs/common), the beartype perimeter (previously stacked PR #90), and the audit pass.AF estimator (
src/skillmodels/af/)estimate_af(model_spec, data, af_options, start_params)→AFEstimationResult.ProbabilityConstraintforlog_cesgammas, satisfied at start values.I = β₀ + β₁θ + β₂Y + σ_I εfor endogenous factors.start_paramssupport: user-supplied starting values override heuristic defaults.get_af_posterior_statescomputes quadrature-based posterior means per individual / period (the unifiedget_individual_statesdispatches on result type).compute_af_standard_errors.AMN estimator (
src/skillmodels/amn/)estimate_amn(model_spec, data, options)→AMNEstimationResult— the three-stage Attanasio–Meghir–Nix (2020) estimator.AMNEstimationOptions.mixture_em_method:complete_case(sklearnGaussianMixtureon listwise-complete rows; raisesInsufficientCompleteCasesErrorunder an unbalanced panel) andmissing_data(masked-covariance EM that marginalises over each row's missing entries, MAR — valid with no complete cases at all).solve_minimum_distancerecovers the structural parameters with a jitted analytical JAX gradient (value_and_grad), so the optimisation scales to large factor-period blocks instead of finite-differencing thousands of parameters.simulate_and_regressdraws a synthetic latent panel and recovers the per-period transition / investment parameters by (N)LS.compute_amn_standard_errors) that re-runs every stage per replicate. Posterior states:get_amn_posterior_states.CHS turnkey driver + AMN seeding
estimate_chs(model_spec, data, options)→CHSEstimationResult— a one-call wrapper overget_maximization_inputs+estimagic.estimate_ml, giving CHS the sameestimate_*surface with full ML inference (result.likelihood_result→.se()/.cov()/.summary()).get_maximization_inputsremains the escape hatch for driving the optimiser by hand.CHSEstimationOptions.start_params_strategy="amn"seeds the CHS optimiser by running AMN and translating its calendar/cfcoordinates onto the CHS augmented-period/kappaindex;"spearman"and"none"are also available.Control-function endogeneity correction —
CorrectionSpecCorrectionSpec, declared on the endogenous investmentFactorSpec.correction, configures the investment control function (AF Sec. 3.5 / AMN eq. 7–8) in one place, read by both CHS and AMN.instruments(excluded observed factors entering only the first stage, identifyingkappa),state_predictors,targets, andkappa_degree/kappa_terms.generate_kappa_terms(factors, max_degree)builds thecf-interaction monomial basis.kappabasis; AMN implements the linearcfterm; AF has no correction and raises if one is declared.ModelSpec.with_correction/without_correctionbuilders register / strip it.Package layout
Flat top-level modules were reorganised into per-estimator subpackages
src/skillmodels/{af,amn,chs}/over a sharedsrc/skillmodels/common/(model spec, params index, constraints, transition functions, data/simulation, control function, diagnostics & plotting). Public entry points are re-exported from the top-levelskillmodelspackage.Optimizer
Each period's MLE runs via
optimagic.minimizewith the algorithm inAFEstimationOptions.optimizer_algorithm(default"fides";"scipy_lbfgsb"for MC sweeps). A jaxopt LBFGSB backend was explored and dropped — on the translog n=500 h=10k MC sweep it had ~14% period-0 maxiter timeouts and occasional silent overflow to ±1e17, while AF optimagic had 0% non-convergence on the same data. The per-step API accepts the full set of optimagic constraint kinds (FixedConstraintWithValue,ProbabilityConstraint,EqualityConstraint).AFEstimationResult.to_numpy()estimate_afreturns on-device JAX arrays so consecutive calls reuse the XLA compilation cache (essential for MC sweeps — without this, every sim recompiles every per-period likelihood + gradient). Callers that need host residency (pickling, plotting, sending across processes) explicitly invokeresult.to_numpy(), which dropssamples_per_componentand clears caches as a side effect.Beartype perimeter on user-facing API
Mirrors the pattern in pylcm PR #355: a per-exception
BeartypeConfplus abeartype_initclass decorator routes parameter-type violations at every documented entry point through a skillmodels-specific exception class, so callers can write narrowly-scopedexceptclauses against a stable hierarchy rather than catching beartype's framework exception.Exceptions (
src/skillmodels/exceptions.py)Six
TypeErrorsubclasses of a commonSkillmodelsInputError, organised by perimeter:ModelSpecInitializationError—FactorSpec,AnchoringSpec,ModelSpec,Normalizations,CorrectionSpecOptionsInitializationError—CHSEstimationOptions,AFEstimationOptions,AMNEstimationOptionsEstimationCallError—get_maximization_inputs,estimate_chs,estimate_af,estimate_amn,get_individual_states,get_af_posterior_states,get_amn_posterior_statesInferenceCallError—compute_af_standard_errors,compute_amn_standard_errorsSimulationCallError—simulate_dataset,simulate_policy_effectDiagnosticsCallError— the diagnostics / plotting helpers (decompose_measurement_variance,plot_residual_boxplots,get_transition_plots, …)Decorator + config (
src/skillmodels/_beartype_conf.py)_conf(exc)—BeartypeConfwithviolation_param_type=exc,strategy=BeartypeStrategy.On(full O(n) container scan),is_pep484_tower=True.beartype_init(conf)— class decorator that wraps only__init__.Whole-package
beartype.clawactivation in tests (tests/conftest.py)beartype.claw.beartype_package("skillmodels", conf=...)turns annotation-drift on internal helpers intoBeartypeCallHintParamViolationduring the test run.skillmodels.chs.qris excluded because JAX's@custom_jvp.defjvpattribute doesn't survive beartype's wrap. Activating the claw surfaced ~80 internal annotation drifts, all fixed.Correctness pass (external audit + adversarial re-verification)
A ChatGPT-Pro correctness audit of the math ↔ code — with the
skane-struct-bw/health-cognitioncall-sites bundled in — was ingested, and every serious finding was independently re-verified against the source before fixing (several turned out to be over-escalated; only one affected results):estimate_chsnow enforces userFixedConstraintWithValuevalues before optimising. They were appended after seeding and never written into the start vector, so optimagic's plain fixed constraint pinned them at the AMN/Spearman seed rather than the requested value — silently mis-restricting any model that fixes parameters to specific values (e.g. no-feedback / identity restrictions). Equality / pairwise reconciliation is now fixed-aware.CorrectionSpec.instrumentsare excluded from the production design (with a regression test covering a registered transition that names a factor the simulated panel omits).get_transition_plotsaccepts numpy-arrayperiods(the beartypeSequence[int]boundary rejected ndarrays passed by applications).CorrectionSpec(the old period-augmentation /is_correctiondecorator interface is gone); staleestimate_*how-to examples and the default-CHS-standard-error description (OPG / inverse-score, not the misspecification-robust sandwich) corrected;estimate_chs,CorrectionSpecandgenerate_kappa_termsdocumented.Test plan
pixi run -e tests-cpu pytest tests/ -q -k "not long_running"— all green with beartype.claw enabled (634 tests)pixi run ty— cleanprek run --all-files— cleanhealth-cognitionCHS pipeline runs green on this branch (estimation → tables → figures)pytest -m long_running— MODEL2 AF vs CHS vs AMN comparison🤖 Generated with Claude Code