Running Models Across Scenarios¶
A scenario run takes one model and exercises it under many sets of assumptions — base, mortality up, lapse down, interest stress, a thousand stochastic draws. The output is a small set of summary numbers (SCR, BEL, worst case) plus, ideally, an audit chain a regulator can verify.
Two layers do this work:
- A typed plan (ScenarioRun): shocks, base tables, and aggregations bundled together with a source_sha and an opt-in audit sidecar. The path you use when the run will be reproduced later.
- A low-level helper (with_scenarios): cross-joins your model points with a scenario list. The path you use for one-shot exploration when you don't need governance.
Most production scenario work runs through the typed plan. The low-level helper is the escape hatch.
The primary path: ScenarioRun¶
A ScenarioRun carries your shocks, base tables, and aggregations as a single value. You build it, run it, save it to YAML, hand it to model risk — they reload, rerun, and reproduce your numbers byte-for-byte.
import polars as pl
from gaspatchio_core.assumptions import Table
from gaspatchio_core.frame import ActuarialFrame
from gaspatchio_core.scenarios import ScenarioRun, Sum

disc_rates = Table(
    name="disc_rates",
    source=pl.DataFrame({
        "scenario_id": ["BASE", "BASE", "UP", "UP", "DOWN", "DOWN"],
        "year": [0, 1, 0, 1, 0, 1],
        "rate": [0.03, 0.03, 0.05, 0.05, 0.01, 0.01],
    }),
    dimensions={"scenario_id": "scenario_id", "year": "year"},
    value="rate",
)

def policies():
    return ActuarialFrame({
        "policy_id": [1, 2, 3],
        "premium": [100.0, 200.0, 300.0],
        "year": [0, 1, 0],
    })

def model(af, *, tables, drivers=None):
    rate = tables["disc_rates"].lookup(
        scenario_id=af["scenario_id"],
        year=af["year"],
    )
    return af.with_columns((af["premium"] / (1.0 + rate)).alias("pv"))

plan = ScenarioRun(
    shocks={"BASE": [], "UP": [], "DOWN": []},
    base_tables={"disc_rates": disc_rates},
    aggregations=(Sum("pv").alias("total_pv"),),
)

result = plan.run(policies(), model, batch_size=1)
print(plan.source_sha())  # sha256:33fe5f3c76d3...
print(result.aggregations["total_pv"])  # 1714.28
The three pieces that make the typed-plan path useful:
- Aggregators: 14 built-in reducers (Sum, Mean, CTE, ArgMax, …) with .alias(), .over(), and .of() modifiers. The aggregator carries its own column and reduction; you don't hand-roll a group_by.
- Scenario Run: typed plan + audit sidecar + YAML round-trip + master-seed determinism.
- Custom Aggregators: write your own (Skewness, TVaR, anything mergeable) and have it round-trip through governance the same way the built-ins do.
The low-level helper: with_scenarios¶
with_scenarios(af, scenario_ids) cross-joins your model points with a scenario list and returns an ActuarialFrame with a scenario_id column. You write the model the way you'd write a single-scenario model and read af["scenario_id"] from your assumption lookups; aggregation is on you with raw polars.
import polars as pl
import gaspatchio_core as gs
from gaspatchio_core import ActuarialFrame

af = ActuarialFrame(pl.read_parquet("model_points.parquet"))
af = gs.with_scenarios(af, ["BASE", "UP", "DOWN"])
# af now has policies × 3 rows with a scenario_id column added.
# Run your model normally, then group_by("scenario_id") at the end.
This pattern is fine for exploratory work — interactive sessions, one-off stress checks, ad-hoc reporting where you'd rather inline the aggregation than declare it. It does not carry a source_sha, doesn't write an audit sidecar, and won't survive a YAML round-trip — so reach for it when the run is disposable.
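To make the cross-join semantics concrete, here is a minimal dependency-free sketch in plain Python. It is not how with_scenarios is implemented; the policy rows, the premium column, and the variable names are hypothetical and chosen only to show the row expansion and the hand-rolled per-scenario aggregation that the helper leaves to you.

```python
from itertools import product

# Hypothetical model points and scenario list, for illustration only.
policies = [
    {"policy_id": 1, "premium": 100.0},
    {"policy_id": 2, "premium": 200.0},
    {"policy_id": 3, "premium": 300.0},
]
scenario_ids = ["BASE", "UP", "DOWN"]

# Cross join: every policy row is repeated once per scenario.
expanded = [
    {**policy, "scenario_id": sid}
    for sid, policy in product(scenario_ids, policies)
]

# Hand-rolled equivalent of group_by("scenario_id") followed by a sum.
total_premium = {}
for row in expanded:
    sid = row["scenario_id"]
    total_premium[sid] = total_premium.get(sid, 0.0) + row["premium"]

print(len(expanded))   # 9 rows: 3 policies x 3 scenarios
print(total_premium)   # {'BASE': 600.0, 'UP': 600.0, 'DOWN': 600.0}
```

The row count grows linearly with the number of scenarios, which is worth keeping in mind before expanding a large block of model points against a long scenario list.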
When the analysis settles, promote it to a ScenarioRun. The model function's signature changes slightly (it gains the keyword-only parameters tables and drivers=None), but the projection logic stays the same.
Loading scenario-varying assumptions¶
Most assumptions stay the same across scenarios (mortality, lapse, premium rates). The economic ones — discount rates, equity returns, inflation — typically vary. Three ways to load them, all returning a Table keyed by scenario_id:
Single table with scenario_id dimension¶
If your data is in one file with a scenario_id column:
disc_rates = Table(
    name="disc_rates",
    source="disc_rates.parquet",
    dimensions={
        "scenario_id": "scenario_id",
        "year": "year",
    },
    value="disc_rate_ann",
)

# Inside model_fn:
rate = disc_rates.lookup(scenario_id=af["scenario_id"], year=af["year"])
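Conceptually, a lookup over the (scenario_id, year) dimensions behaves like indexing a mapping keyed by that pair, once per row. The sketch below is plain Python, not the Table implementation; the rate values mirror the in-memory disc_rates example earlier on this page, and the rows are hypothetical.

```python
# Rates keyed by (scenario_id, year); values mirror the earlier example table.
rates = {
    ("BASE", 0): 0.03, ("BASE", 1): 0.03,
    ("UP", 0): 0.05, ("UP", 1): 0.05,
    ("DOWN", 0): 0.01, ("DOWN", 1): 0.01,
}

# Hypothetical rows after scenario expansion.
rows = [
    {"policy_id": 1, "scenario_id": "BASE", "year": 0},
    {"policy_id": 1, "scenario_id": "UP", "year": 0},
    {"policy_id": 2, "scenario_id": "DOWN", "year": 1},
]

# One lookup per row, using both dimensions as the key.
looked_up = [rates[(row["scenario_id"], row["year"])] for row in rows]
print(looked_up)  # [0.03, 0.05, 0.01]
```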
Separate files per scenario¶
If your scenarios live in separate files (typical for ESG output):
disc_rates = Table.from_scenario_files(
    scenario_files={
        "BASE": "scenarios/BASE/disc_rates.parquet",
        "UP": "scenarios/UP/disc_rates.parquet",
        "DOWN": "scenarios/DOWN/disc_rates.parquet",
    },
    scenario_column="scenario_id",
    dimensions={"year": "year"},
    value="disc_rate_ann",
    name="disc_rates",
)
Each file is tagged with its scenario_id, concatenated, and exposed as one Table keyed by (scenario_id, year).
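The tag-and-concatenate step can be pictured with a small plain-Python sketch. This is an illustration of the idea only, not the loader's code; the per-scenario contents below are hypothetical stand-ins for what each parquet file might hold.

```python
# Hypothetical contents of each per-scenario file (no scenario_id column yet).
per_scenario = {
    "BASE": [{"year": 0, "disc_rate_ann": 0.03}, {"year": 1, "disc_rate_ann": 0.03}],
    "UP": [{"year": 0, "disc_rate_ann": 0.05}, {"year": 1, "disc_rate_ann": 0.05}],
    "DOWN": [{"year": 0, "disc_rate_ann": 0.01}, {"year": 1, "disc_rate_ann": 0.01}],
}

# Tag every row with the scenario it came from, then stack all rows together.
stacked = [
    {"scenario_id": sid, **row}
    for sid, rows in per_scenario.items()
    for row in rows
]

# The combined data is uniquely keyed by (scenario_id, year).
keys = [(row["scenario_id"], row["year"]) for row in stacked]
print(len(stacked))                  # 6
print(len(set(keys)) == len(keys))   # True
```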
Template-based loading¶
When file naming follows a pattern:
disc_rates = Table.from_scenario_template(
    path_template="scenarios/{scenario_id}/disc_rates.parquet",
    scenario_ids=["BASE", "UP", "DOWN"],
    scenario_column="scenario_id",
    dimensions={"year": "year"},
    value="disc_rate_ann",
)
Equivalent to from_scenario_files, more compact.
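One way to see the equivalence is to expand the template by hand. The sketch below assumes the placeholder is filled in the same way Python's str.format would fill it, which is an assumption about the loader rather than something stated here; the result is the same mapping you would otherwise pass as scenario_files.

```python
path_template = "scenarios/{scenario_id}/disc_rates.parquet"
scenario_ids = ["BASE", "UP", "DOWN"]

# Assumed expansion rule: substitute each scenario id into the placeholder.
scenario_files = {
    sid: path_template.format(scenario_id=sid) for sid in scenario_ids
}

print(scenario_files["UP"])  # scenarios/UP/disc_rates.parquet
```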
All three return a Table you pass into ScenarioRun.base_tables. The run loop stacks the table across scenarios automatically before each batch.
What to read next¶
| Doing | Read |
|---|---|
| Choosing an aggregator | Aggregators |
| Building a reproducible run | Scenario Run |
| Adding a metric that's not built in | Custom Aggregators |
| Stress shocks (multiply, add, clip, pipeline) | Shock Operations |
| Memory at scale | Performance |
| Asking what-if from natural language | What-If Analysis |