Assumption Table Examples in Gaspatchio

Working with Mortality Tables

This guide walks through using the 2015 VBT Female Smoker Mortality Table (ANB) as an example to demonstrate how to set up and use assumption tables in Gaspatchio.

Understanding the Table Structure

The 2015 VBT table is structured as follows:

  • Rows represent issue ages (18-95)
  • Columns represent policy durations (1-25, plus an "Ultimate" column)
  • Values represent mortality rates per 1,000

Here's a small sample from the table:

Issue Age Duration 1 Duration 2 Duration 3 Duration 4 Duration 5 Ultimate Attained Age
30 0.20 0.25 0.31 0.38 0.45 4.84 55
31 0.21 0.26 0.34 0.42 0.51 5.35 56
32 0.22 0.28 0.37 0.47 0.58 5.93 57
33 0.23 0.31 0.42 0.53 0.65 6.59 58
34 0.25 0.35 0.48 0.61 0.73 7.31 59

Loading the Assumption Table

Loading assumption tables is straightforward with the dimension-based API. Gaspatchio provides tools to analyze table structure and configure dimensions:

import polars as pl
import gaspatchio_core as gs

# First, analyze the table structure (optional but helpful)
df = pl.read_csv("2015-VBT-FSM-ANB.csv")
schema = gs.assumptions.analyze_table(df)
print(schema.suggest_table_config())

# Load the mortality table with dimension configuration
vbt_table = gs.Table(
    name="vbt_2015_female_smoker",
    source="2015-VBT-FSM-ANB.csv",
    dimensions={
        "issue_age": "Issue Age",  # Map model name to CSV column name
        "duration": gs.assumptions.MeltDimension(
            columns=[str(i) for i in range(1, 26)] + ["Ultimate"],
            name="duration",
            overflow=gs.assumptions.ExtendOverflow("Ultimate", to_value=200)
        )
    },
    value="mortality_rate"
)

The API explicitly configures:

  • Dimension mapping: The key (issue_age) is the name you use in lookups; the value ("Issue Age") is the column name in the source file
  • Melt dimensions: Transform wide columns (1-25, Ultimate) into long format
  • Overflow strategies: Expand "Ultimate" values to higher durations
  • Value column name: Name for the melted rates

After loading, the internal data looks like this:

issue_age duration mortality_rate
30 1 0.20
30 2 0.25
30 3 0.31
30 4 0.38
30 5 0.45
30 26 4.84
30 27 4.84
30 150 4.84
... ... ...
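Conceptually, the melt step turns each wide duration column into its own row. A minimal pure-Python sketch of that wide-to-long transformation (sample values from the table above; an illustration, not Gaspatchio's internal code):

```python
# Wide table: one row per issue age, one column per duration (sample values).
wide = {
    "issue_age": [30, 31],
    "1": [0.20, 0.21],
    "2": [0.25, 0.26],
    "Ultimate": [4.84, 5.35],
}
duration_cols = ["1", "2", "Ultimate"]

# Long format: one row per (issue_age, duration) pair.
long_rows = [
    {"issue_age": age, "duration": col, "mortality_rate": wide[col][i]}
    for i, age in enumerate(wide["issue_age"])
    for col in duration_cols
]

print(long_rows[0])  # {'issue_age': 30, 'duration': '1', 'mortality_rate': 0.2}
```

The overflow strategy then replicates the "Ultimate" rows out to higher durations, which is why durations 26 through 200 appear in the internal data above.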

Using the Assumption Table in ActuarialFrame

Now we can use this table for lightning-fast lookups.

Why So Fast?

This VBT table has dimensions [78 ages × 200 durations] = 15,600 entries. Gaspatchio detects this as a dense table and stores it as a contiguous array in memory. Each lookup is just:

  1. Compute index: age_offset × 200 + duration (a few nanoseconds)
  2. Read value: data[index] (direct memory access)

No hash computation, no bucket probing - just arithmetic and array indexing. This is why 324 million lookups complete in ~1 second instead of ~27 seconds.
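The two steps can be sketched in plain Python (the constants and list storage here are illustrative; the real table is a contiguous native array):

```python
MIN_AGE = 18        # lowest issue age in the table
N_DURATIONS = 200   # durations 1-200 after overflow expansion

# Flat row-major array: the rate for (age, duration) lives at a computed offset.
data = [0.0] * (78 * N_DURATIONS)
data[(30 - MIN_AGE) * N_DURATIONS + (1 - 1)] = 0.20  # age 30, duration 1

def lookup(age: int, duration: int) -> float:
    # Step 1: compute the flat index (pure arithmetic, no hashing).
    index = (age - MIN_AGE) * N_DURATIONS + (duration - 1)
    # Step 2: direct array read.
    return data[index]

print(lookup(30, 1))  # 0.2
```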

# Create a simple policy dataset
policy_data = pl.DataFrame({
    "policy_id": ["A001", "A002", "A003", "A004"],
    "issue_age": [30, 35, 40, 45],
    "duration": [1, 3, 5, 10]
})

# Convert to ActuarialFrame
af = gs.ActuarialFrame(policy_data)

# Look up mortality rates using the table's lookup method
af.mortality_rate = vbt_table.lookup(issue_age=af.issue_age, duration=af.duration)

print(af)

Result:

shape: (4, 4)
┌───────────┬───────────┬──────────┬────────────────┐
│ policy_id ┆ issue_age ┆ duration ┆ mortality_rate │
│ ---       ┆ ---       ┆ ---      ┆ ---            │
│ str       ┆ i64       ┆ i64      ┆ f64            │
╞═══════════╪═══════════╪══════════╪════════════════╡
│ A001      ┆ 30        ┆ 1        ┆ 0.20           │
│ A002      ┆ 35        ┆ 3        ┆ 0.54           │
│ A003      ┆ 40        ┆ 5        ┆ 1.15           │
│ A004      ┆ 45        ┆ 10       ┆ 4.10           │
└───────────┴───────────┴──────────┴────────────────┘

Working with Overflow Durations

Overflow handling is completely transparent. Durations far beyond the table's maximum resolve with the same constant-time lookup:

# Test with durations beyond the table (> 25)
extreme_data = pl.DataFrame({
    "policy_id": ["X001", "X002"],
    "issue_age": [30, 40], 
    "duration": [50, 100]  # Way beyond table max of 25!
})

af_extreme = gs.ActuarialFrame(extreme_data)
af_extreme.mortality_rate = vbt_table.lookup(
    issue_age=af_extreme.issue_age,
    duration=af_extreme.duration
)

print(af_extreme)

Result:

shape: (2, 4)
┌───────────┬───────────┬──────────┬────────────────┐
│ policy_id ┆ issue_age ┆ duration ┆ mortality_rate │
│ ---       ┆ ---       ┆ ---      ┆ ---            │
│ str       ┆ i64       ┆ i64      ┆ f64            │
╞═══════════╪═══════════╪══════════╪════════════════╡
│ X001      ┆ 30        ┆ 50       ┆ 4.84           │
│ X002      ┆ 40        ┆ 100      ┆ 9.32           │
└──────────┴───────────┴──────────┴────────────────┘

Both policies get the "Ultimate" rate because the ExtendOverflow strategy pre-expanded the overflow during loading.
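A rough sketch of what an ExtendOverflow strategy does at load time (pure-Python lists with sample values, not the actual implementation): the "Ultimate" rate is copied forward to every duration up to to_value, so later lookups need no special-casing:

```python
# Rates for one issue age: select-period durations 1-25, then the Ultimate rate.
select_rates = [0.20, 0.25, 0.31, 0.38, 0.45] + [0.0] * 20  # durations 1-25 (sample)
ultimate_rate = 4.84
TO_VALUE = 200  # matches ExtendOverflow("Ultimate", to_value=200)

# Pre-expand at load time: durations 26..200 all receive the Ultimate rate.
expanded = select_rates + [ultimate_rate] * (TO_VALUE - len(select_rates))

print(expanded[50 - 1])   # duration 50  -> 4.84
print(expanded[100 - 1])  # duration 100 -> 4.84
```

The trade-off is a larger (but still dense) table in exchange for branch-free lookups at runtime.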

Projecting Multiple Periods

Gaspatchio's vector-based approach works seamlessly with the API:

# Create a policy with projection over multiple durations
policy_projection = pl.DataFrame({
    "policy_id": ["B001"],
    "issue_age": [30],
    "duration": [[1, 2, 3, 4, 5, 25, 26, 50, 100]]  # Mix of regular and overflow
})

af_proj = gs.ActuarialFrame(policy_projection)

# Look up mortality rates for all durations at once
af_proj.mortality_rate = vbt_table.lookup(
    issue_age=af_proj.issue_age,
    duration=af_proj.duration
)

# Explode for visualization
result = af_proj.explode(["duration", "mortality_rate"])
print(result)

Result:

shape: (9, 4)
┌───────────┬───────────┬──────────┬────────────────┐
│ policy_id ┆ issue_age ┆ duration ┆ mortality_rate │
│ ---       ┆ ---       ┆ ---      ┆ ---            │
│ str       ┆ i64       ┆ i64      ┆ f64            │
╞═══════════╪═══════════╪══════════╪════════════════╡
│ B001      ┆ 30        ┆ 1        ┆ 0.20           │
│ B001      ┆ 30        ┆ 2        ┆ 0.25           │
│ B001      ┆ 30        ┆ 3        ┆ 0.31           │
│ B001      ┆ 30        ┆ 4        ┆ 0.38           │
│ B001      ┆ 30        ┆ 5        ┆ 0.45           │
│ B001      ┆ 30        ┆ 25       ┆ 4.12           │
│ B001      ┆ 30        ┆ 26       ┆ 4.84           │
│ B001      ┆ 30        ┆ 50       ┆ 4.84           │
│ B001      ┆ 30        ┆ 100      ┆ 4.84           │
└───────────┴───────────┴──────────┴────────────────┘

Loading Simple Curves

For 1-dimensional tables (like lapse rates by age), the API is even simpler:

# Load a simple age → lapse rate curve
lapse_table = gs.Table(
    name="lapse_2025",
    source="lapse_curve.csv",
    dimensions={
        "age": "age"  # Simple string shorthand
    },
    value="lapse_rate"
)

# Use it immediately
af.lapse_rate = lapse_table.lookup(age=af.age)

Advanced Features

For more complex scenarios, you have full control with the dimension-based API:

# Multi-dimensional table with selective column loading
mortality_table = gs.Table(
    name="mortality_by_gender",
    source="mortality_m_f.csv",
    dimensions={
        "age": "age",
        "gender": gs.assumptions.MeltDimension(
            columns=["Male", "Female"],
            name="gender"
        )
    },
    value="mortality_rate"
)

# Table with custom overflow limits
salary_table = gs.Table(
    name="salary_scale",
    source="salary_by_service.csv",
    dimensions={
        "grade": "grade",
        "service": gs.assumptions.MeltDimension(
            columns=[str(i) for i in range(1, 21)] + ["20+"],
            name="service",
            overflow=gs.assumptions.ExtendOverflow("20+", to_value=50)
        )
    },
    value="scale_factor"
)

# Using computed dimensions
complex_table = gs.Table(
    name="complex_assumptions",
    source=df,
    dimensions={
        "issue_age": "issue_age",
        "policy_year": "policy_year",
        "attained_age": gs.assumptions.ComputedDimension(
            pl.col("issue_age") + pl.col("policy_year") - 1,
            "attained_age"
        )
    },
    value="assumption_value"
)
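The computed attained-age expression above is ordinary ANB arithmetic; as a quick sanity check in plain Python:

```python
def attained_age(issue_age: int, policy_year: int) -> int:
    # In policy year 1 the attained age equals the issue age,
    # and it increments by one each subsequent year.
    return issue_age + policy_year - 1

print(attained_age(30, 1))   # 30
print(attained_age(30, 26))  # 55 (matches the "Attained Age" column in the sample table)
```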

Using the TableBuilder Pattern

For step-by-step table construction, use the fluent TableBuilder API:

# Build a complex mortality table
mortality_table = (
    gs.TableBuilder("mortality_select_ultimate")
    .from_source("mortality_su.csv")
    .with_data_dimension("issue_age", "IssueAge")
    .with_data_dimension("gender", "Gender")
    .with_melt_dimension(
        "duration",
        columns=[f"Dur{i}" for i in range(1, 16)] + ["Ultimate"],
        overflow=gs.assumptions.ExtendOverflow("Ultimate", to_value=100),
        fill=gs.assumptions.LinearInterpolate()  # Interpolate any gaps
    )
    .with_value_column("qx_rate")
    .build()
)

# The table is ready for lookups
af.mortality_rate = mortality_table.lookup(
    issue_age=af.age,
    gender=af.sex,
    duration=af.policy_duration
)

Metadata and Table Discovery

Tables can include metadata for documentation and discovery:

# Create table with rich metadata
vbt_table = gs.Table(
    name="vbt_2015_complete",
    source="vbt_2015_all.csv",
    dimensions={
        "age": "Age",
        "gender": "Gender",
        "smoking": "Smoker",
        "duration": gs.assumptions.MeltDimension(
            columns=duration_columns,
            name="duration",
            overflow=gs.assumptions.ExtendOverflow("Ultimate", to_value=120)
        )
    },
    value="mortality_rate",
    metadata={
        "source": "2015 Valuation Basic Table",
        "basis": "ANB",
        "version": "2015",
        "effective_date": "2015-01-01",
        "description": "Industry standard mortality table",
        "tags": ["mortality", "vbt", "2015", "standard"]
    }
)

# Discover tables
all_tables = gs.list_tables()
print(f"Available tables: {all_tables}")

# Get metadata for a specific table
metadata = gs.get_table_metadata("vbt_2015_complete")
print(f"Table metadata: {metadata}")

# List all tables with metadata
tables_info = gs.list_tables_with_metadata()
for name, meta in tables_info.items():
    print(f"{name}: {meta.get('description', 'No description')}")