
Running Models Across Scenarios

A scenario run takes one model and exercises it under many sets of assumptions — base, mortality up, lapse down, interest stress, a thousand stochastic draws. The output is a small set of summary numbers (SCR, BEL, worst case) plus, ideally, an audit chain a regulator can verify.

Two layers do this work:

  • A typed plan (ScenarioRun) — shocks, base tables, aggregations bundled together with a source_sha and an opt-in audit sidecar. The path you use when the run will be reproduced later.
  • A low-level helper (with_scenarios) — cross-joins your model points with a scenario list. The path you use for one-shot exploration when you don't need governance.

Most production scenario work runs through the typed plan. The low-level helper is the escape hatch.


The primary path: ScenarioRun

A ScenarioRun carries your shocks, base tables, and aggregations as a single value. You build it, run it, save it to YAML, hand it to model risk — they reload, rerun, and reproduce your numbers byte-for-byte.

import polars as pl
from gaspatchio_core.assumptions import Table
from gaspatchio_core.frame import ActuarialFrame
from gaspatchio_core.scenarios import ScenarioRun, Sum


disc_rates = Table(
    name="disc_rates",
    source=pl.DataFrame({
        "scenario_id": ["BASE", "BASE", "UP", "UP", "DOWN", "DOWN"],
        "year":        [0, 1, 0, 1, 0, 1],
        "rate":        [0.03, 0.03, 0.05, 0.05, 0.01, 0.01],
    }),
    dimensions={"scenario_id": "scenario_id", "year": "year"},
    value="rate",
)


def policies():
    return ActuarialFrame({
        "policy_id": [1, 2, 3],
        "premium":   [100.0, 200.0, 300.0],
        "year":      [0, 1, 0],
    })


def model(af, *, tables, drivers=None):
    rate = tables["disc_rates"].lookup(
        scenario_id=af["scenario_id"],
        year=af["year"],
    )
    return af.with_columns((af["premium"] / (1.0 + rate)).alias("pv"))


plan = ScenarioRun(
    shocks={"BASE": [], "UP": [], "DOWN": []},
    base_tables={"disc_rates": disc_rates},
    aggregations=(Sum("pv").alias("total_pv"),),
)

result = plan.run(policies(), model, batch_size=1)
print(plan.source_sha())                  # sha256:33fe5f3c76d3...
print(result.aggregations["total_pv"])    # 1714.28
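
The reproducibility claim rests on the plan having a stable fingerprint. The sketch below shows the general idea of a content hash using only the standard library. It is not how gaspatchio computes source_sha; the plan_fingerprint helper and the dictionary layout are hypothetical.

```python
import hashlib
import json


def plan_fingerprint(plan: dict) -> str:
    """Hash a canonical JSON rendering of a plan description (hypothetical helper)."""
    # sort_keys and fixed separators make the serialization independent of
    # dictionary insertion order, so equal plans serialize to equal bytes.
    canonical = json.dumps(plan, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same content, different key order.
plan_a = {"shocks": {"BASE": [], "UP": [], "DOWN": []}, "aggregations": ["total_pv"]}
plan_b = {"aggregations": ["total_pv"], "shocks": {"DOWN": [], "UP": [], "BASE": []}}
# Different content: one scenario removed.
plan_c = {"shocks": {"BASE": [], "UP": []}, "aggregations": ["total_pv"]}

same = plan_fingerprint(plan_a) == plan_fingerprint(plan_b)
different = plan_fingerprint(plan_a) != plan_fingerprint(plan_c)
print(same, different)  # True True
```

Any change to the plan's content changes the fingerprint, while cosmetic differences such as key order do not. That is the property a reviewer relies on when checking that a rerun used the same plan.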

The three pieces that make the typed-plan path useful:

  • Aggregators — 14 built-in reducers (Sum, Mean, CTE, ArgMax, …) with .alias(), .over(), .of() modifiers. The aggregator carries its own column and reduction; you don't hand-roll a group_by.
  • Scenario Run — typed plan + audit sidecar + YAML round-trip + master-seed determinism.
  • Custom Aggregators — write your own (Skewness, TVaR, anything mergeable) and have it round-trip through governance the same way the built-ins do.
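
Custom aggregators are described above as "anything mergeable". One way to read that requirement: a reducer keeps a small partial state per batch, and partial states combine into the same answer as a single pass over all rows. The sketch below shows this for a mean in plain Python. MeanState and its methods are illustrative names, not the gaspatchio aggregator interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeanState:
    """Partial state for a mean: running total and row count (hypothetical)."""

    total: float = 0.0
    count: int = 0

    def update(self, values):
        # Fold one batch of values into the state.
        return MeanState(self.total + sum(values), self.count + len(values))

    def merge(self, other):
        # Combining two partial states needs no access to the raw rows.
        return MeanState(self.total + other.total, self.count + other.count)

    def finalize(self):
        return self.total / self.count


batches = [[100.0, 200.0], [300.0], [400.0, 500.0, 600.0]]

# Reduce each batch independently, then merge the partial states.
merged = MeanState()
for batch in batches:
    merged = merged.merge(MeanState().update(batch))

# Reference result: one pass over every row.
single_pass = MeanState().update([v for b in batches for v in b])
print(merged.finalize(), single_pass.finalize())  # 350.0 350.0
```

Averaging the three batch means directly would give about 316.67, which is why the state carries the total and the count instead of the mean itself.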

The low-level helper: with_scenarios

with_scenarios(af, scenario_ids) cross-joins your model points with a scenario list and returns an ActuarialFrame with a scenario_id column. You write the model the way you'd write a single-scenario model and pass af["scenario_id"] into your assumption lookups; aggregation is up to you, using raw polars.

import polars as pl

import gaspatchio_core as gs
from gaspatchio_core import ActuarialFrame

af = ActuarialFrame(pl.read_parquet("model_points.parquet"))
af = gs.with_scenarios(af, ["BASE", "UP", "DOWN"])

# af now has policies × 3 rows with a scenario_id column added.
# Run your model normally, then group_by("scenario_id") at the end.

This pattern is fine for exploratory work: interactive sessions, one-off stress checks, ad-hoc reporting where you'd rather inline the aggregation than declare it. It doesn't carry a source_sha, doesn't write an audit sidecar, and won't survive a YAML round-trip, so reach for it when the run is disposable.

When the analysis settles, promote it to a ScenarioRun. The model function's signature changes slightly (it gains the keyword-only arguments tables and drivers=None), but the projection logic stays the same.
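
As a mental model for the cross-join, the sketch below repeats each model point once per scenario and then aggregates by scenario_id by hand. It uses plain dictionaries instead of ActuarialFrame or polars; the cross_join_scenarios helper and the flat per-scenario rates are made up for illustration.

```python
from collections import defaultdict
from itertools import product


def cross_join_scenarios(rows, scenario_ids):
    """Return one copy of every row per scenario, tagged with scenario_id."""
    return [{**row, "scenario_id": sid} for row, sid in product(rows, scenario_ids)]


policies = [
    {"policy_id": 1, "premium": 100.0},
    {"policy_id": 2, "premium": 200.0},
    {"policy_id": 3, "premium": 300.0},
]
# Illustrative flat discount rate per scenario (an assumption for this sketch).
rates = {"BASE": 0.03, "UP": 0.05, "DOWN": 0.01}

expanded = cross_join_scenarios(policies, list(rates))
print(len(expanded))  # 3 policies x 3 scenarios = 9 rows

# Hand-rolled equivalent of a group-by on scenario_id followed by a sum.
total_pv = defaultdict(float)
for row in expanded:
    rate = rates[row["scenario_id"]]
    total_pv[row["scenario_id"]] += row["premium"] / (1.0 + rate)

for sid, value in total_pv.items():
    print(sid, round(value, 2))
```

The final loop is the part a typed plan would declare as an aggregation instead of writing out by hand.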


Loading scenario-varying assumptions

Most assumptions stay the same across scenarios (mortality, lapse, premium rates). The economic ones — discount rates, equity returns, inflation — typically vary. Three ways to load them, all returning a Table keyed by scenario_id:

Single table with scenario_id dimension

If your data is in one file with a scenario_id column:

disc_rates = Table(
    name="disc_rates",
    source="disc_rates.parquet",
    dimensions={
        "scenario_id": "scenario_id",
        "year": "year",
    },
    value="disc_rate_ann",
)

# Inside model_fn:
rate = disc_rates.lookup(scenario_id=af["scenario_id"], year=af["year"])

Separate files per scenario

If your scenarios live in separate files (typical for ESG output):

disc_rates = Table.from_scenario_files(
    scenario_files={
        "BASE": "scenarios/BASE/disc_rates.parquet",
        "UP":   "scenarios/UP/disc_rates.parquet",
        "DOWN": "scenarios/DOWN/disc_rates.parquet",
    },
    scenario_column="scenario_id",
    dimensions={"year": "year"},
    value="disc_rate_ann",
    name="disc_rates",
)

Each file is tagged with its scenario_id, concatenated, and exposed as one Table keyed by (scenario_id, year).
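
The tag-and-concatenate step can be pictured without any file I/O. The sketch below stands in for the per-scenario files with in-memory row lists and builds a lookup keyed by (scenario_id, year). stack_scenarios is a hypothetical helper and says nothing about how Table.from_scenario_files is implemented.

```python
def stack_scenarios(per_scenario_rows, scenario_column="scenario_id"):
    """Tag each scenario's rows with its id and concatenate them into one list."""
    stacked = []
    for scenario_id, rows in per_scenario_rows.items():
        for row in rows:
            stacked.append({scenario_column: scenario_id, **row})
    return stacked


# Stand-ins for the contents of three per-scenario files (made-up values).
per_scenario_rows = {
    "BASE": [{"year": 0, "disc_rate_ann": 0.03}, {"year": 1, "disc_rate_ann": 0.03}],
    "UP": [{"year": 0, "disc_rate_ann": 0.05}, {"year": 1, "disc_rate_ann": 0.05}],
    "DOWN": [{"year": 0, "disc_rate_ann": 0.01}, {"year": 1, "disc_rate_ann": 0.01}],
}

stacked = stack_scenarios(per_scenario_rows)

# One lookup keyed by (scenario_id, year), like a table with two dimensions.
lookup = {(r["scenario_id"], r["year"]): r["disc_rate_ann"] for r in stacked}
print(len(stacked), lookup[("UP", 1)])  # 6 0.05
```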

Template-based loading

When file naming follows a pattern:

disc_rates = Table.from_scenario_template(
    path_template="scenarios/{scenario_id}/disc_rates.parquet",
    scenario_ids=["BASE", "UP", "DOWN"],
    scenario_column="scenario_id",
    dimensions={"year": "year"},
    value="disc_rate_ann",
)

Equivalent to from_scenario_files, but more compact.
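
The equivalence can be seen by expanding the template by hand. The snippet below uses str.format to build the same mapping that the from_scenario_files example spells out; whether the library expands templates this way internally is an assumption.

```python
path_template = "scenarios/{scenario_id}/disc_rates.parquet"
scenario_ids = ["BASE", "UP", "DOWN"]

# One path per scenario, with {scenario_id} filled in.
scenario_files = {sid: path_template.format(scenario_id=sid) for sid in scenario_ids}

for sid, path in scenario_files.items():
    print(sid, path)
```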

All three return a Table you pass into ScenarioRun.base_tables. The run loop stacks the table across scenarios automatically before each batch.


What you're doing, and where to read more:

  • Choosing an aggregator: Aggregators
  • Building a reproducible run: Scenario Run
  • Adding a metric that's not built in: Custom Aggregators
  • Stress shocks (multiply, add, clip, pipeline): Shock Operations
  • Memory at scale: Performance
  • Asking what-if from natural language: What-If Analysis