# Assumption Table Examples in Gaspatchio

## Working with Mortality Tables
This guide uses the 2015 VBT Female Smoker Mortality Table (ANB) to demonstrate how to set up and use assumption tables in Gaspatchio.
## Understanding the Table Structure
The 2015 VBT table is structured as follows:

- Rows represent issue ages (18-95)
- Columns represent policy durations (1-25 plus "Ultimate")
- Values represent mortality rates per 1,000
Here's a small sample from the table:
| Issue Age | Duration 1 | Duration 2 | Duration 3 | Duration 4 | Duration 5 | Ultimate | Attained Age |
|---|---|---|---|---|---|---|---|
| 30 | 0.20 | 0.25 | 0.31 | 0.38 | 0.45 | 4.84 | 55 |
| 31 | 0.21 | 0.26 | 0.34 | 0.42 | 0.51 | 5.35 | 56 |
| 32 | 0.22 | 0.28 | 0.37 | 0.47 | 0.58 | 5.93 | 57 |
| 33 | 0.23 | 0.31 | 0.42 | 0.53 | 0.65 | 6.59 | 58 |
| 34 | 0.25 | 0.35 | 0.48 | 0.61 | 0.73 | 7.31 | 59 |
## Loading the Assumption Table
Loading assumption tables is straightforward with the dimension-based API. Gaspatchio provides tools to analyze table structure and configure dimensions:
```python
import gaspatchio_core as gs
import polars as pl

# First, analyze the table structure (optional but helpful)
df = pl.read_csv("2015-VBT-FSM-ANB.csv")
schema = gs.assumptions.analyze_table(df)
print(schema.suggest_table_config())

# Load the mortality table with dimension configuration
vbt_table = gs.Table(
    name="vbt_2015_female_smoker",
    source="2015-VBT-FSM-ANB.csv",
    dimensions={
        "issue_age": "Issue Age",  # Map model name to CSV column name
        "duration": gs.assumptions.MeltDimension(
            columns=[str(i) for i in range(1, 26)] + ["Ultimate"],
            name="duration",
            overflow=gs.assumptions.ExtendOverflow("Ultimate", to_value=200)
        )
    },
    value="mortality_rate"
)
```
The API explicitly configures:

- Dimension mapping: the key (`issue_age`) is the name you use in lookups; the value (`"Issue Age"`) is the column name in the source file
- Melt dimensions: transform wide columns (1-25, "Ultimate") into long format
- Overflow strategies: expand "Ultimate" values to higher durations
- Value column name: name for the melted rates
After loading, the internal data looks like this:
| issue_age | duration | mortality_rate |
|---|---|---|
| 30 | 1 | 0.20 |
| 30 | 2 | 0.25 |
| 30 | 3 | 0.31 |
| 30 | 4 | 0.38 |
| 30 | 5 | 0.45 |
| 30 | 26 | 4.84 |
| 30 | 27 | 4.84 |
| 30 | 150 | 4.84 |
| ... | ... | ... |
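Conceptually, the load step melts each wide row into (issue_age, duration, rate) triples and then repeats the "Ultimate" rate out to the overflow limit. A minimal pure-Python sketch of that transformation under those assumptions, using a truncated sample row rather than Gaspatchio internals:

```python
# Pure-Python sketch of the melt + overflow expansion (not Gaspatchio
# internals). Uses a truncated sample: the first five select rates for
# issue age 30 from the table above, plus its "Ultimate" rate.
wide_row = {"issue_age": 30, "1": 0.20, "2": 0.25, "3": 0.31,
            "4": 0.38, "5": 0.45, "Ultimate": 4.84}

def melt_with_overflow(row, duration_cols, ultimate_col, to_value):
    """Return (issue_age, duration, rate) triples in long format."""
    long_rows = [(row["issue_age"], d, row[col])
                 for d, col in enumerate(duration_cols, start=1)]
    # ExtendOverflow-style step: repeat the ultimate rate up to to_value
    for d in range(len(duration_cols) + 1, to_value + 1):
        long_rows.append((row["issue_age"], d, row[ultimate_col]))
    return long_rows

long_rows = melt_with_overflow(wide_row, ["1", "2", "3", "4", "5"],
                               "Ultimate", 200)
```

With the full 25 select columns the expansion would begin at duration 26, exactly as in the internal table shown above.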
## Using the Assumption Table in ActuarialFrame
Now we can use this table for lightning-fast lookups.
**Why So Fast?**

This VBT table has dimensions [78 ages × 200 durations] = 15,600 entries. Gaspatchio detects this as a dense table and stores it as a contiguous array in memory. Each lookup is just:

- Compute index: `age_offset × 200 + duration` (a few nanoseconds)
- Read value: `data[index]` (direct memory access)

No hash computation, no bucket probing - just arithmetic and array indexing. This is why 324 million lookups complete in ~1 second instead of ~27 seconds.
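The index arithmetic above can be sketched in a few lines of plain Python; the dimension sizes mirror the guide, and the rates are placeholders rather than real table values:

```python
# Sketch of a dense-table lookup: compute a flat index, then read the
# array. Issue ages 18-95 (78 rows), durations stored 1-200.
MIN_AGE = 18
N_DURATIONS = 200

data = [0.0] * (78 * N_DURATIONS)              # contiguous rate storage
data[(30 - MIN_AGE) * N_DURATIONS + 0] = 0.20  # age 30, duration 1

def lookup(issue_age, duration):
    # One multiply-add, then a direct read: no hashing, no probing
    index = (issue_age - MIN_AGE) * N_DURATIONS + (duration - 1)
    return data[index]
```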
```python
# Create a simple policy dataset
policy_data = pl.DataFrame({
    "policy_id": ["A001", "A002", "A003", "A004"],
    "issue_age": [30, 35, 40, 45],
    "duration": [1, 3, 5, 10]
})

# Convert to ActuarialFrame
af = gs.ActuarialFrame(policy_data)

# Look up mortality rates using the table's lookup method
af.mortality_rate = vbt_table.lookup(issue_age=af.issue_age, duration=af.duration)
print(af)
```
Result:
```
shape: (4, 4)
┌───────────┬───────────┬──────────┬────────────────┐
│ policy_id ┆ issue_age ┆ duration ┆ mortality_rate │
│ ---       ┆ ---       ┆ ---      ┆ ---            │
│ str       ┆ i64       ┆ i64      ┆ f64            │
╞═══════════╪═══════════╪══════════╪════════════════╡
│ A001      ┆ 30        ┆ 1        ┆ 0.20           │
│ A002      ┆ 35        ┆ 3        ┆ 0.54           │
│ A003      ┆ 40        ┆ 5        ┆ 1.15           │
│ A004      ┆ 45        ┆ 10       ┆ 4.10           │
└───────────┴───────────┴──────────┴────────────────┘
```
## Working with Overflow Durations
The beauty of the API is that overflow handling is completely transparent. Even extreme durations work instantly:
```python
# Test with durations beyond the table (> 25)
extreme_data = pl.DataFrame({
    "policy_id": ["X001", "X002"],
    "issue_age": [30, 40],
    "duration": [50, 100]  # Way beyond table max of 25!
})

af_extreme = gs.ActuarialFrame(extreme_data)
af_extreme.mortality_rate = vbt_table.lookup(
    issue_age=af_extreme.issue_age,
    duration=af_extreme.duration
)
print(af_extreme)
```
Result:
```
shape: (2, 4)
┌───────────┬───────────┬──────────┬────────────────┐
│ policy_id ┆ issue_age ┆ duration ┆ mortality_rate │
│ ---       ┆ ---       ┆ ---      ┆ ---            │
│ str       ┆ i64       ┆ i64      ┆ f64            │
╞═══════════╪═══════════╪══════════╪════════════════╡
│ X001      ┆ 30        ┆ 50       ┆ 4.84           │
│ X002      ┆ 40        ┆ 100      ┆ 9.32           │
└───────────┴───────────┴──────────┴────────────────┘
```
Both policies get the "Ultimate" rate because the ExtendOverflow strategy pre-expanded the overflow during loading.
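Pre-expansion trades a larger stored table for branch-free lookups; the common alternative is to clamp the duration at lookup time instead. A sketch of that alternative for contrast (illustrative rates, not Gaspatchio's implementation):

```python
# Clamp-at-lookup alternative to pre-expansion: any duration past the
# select period maps onto the "Ultimate" slot.
ULTIMATE_DURATION = 26                       # durations 1-25 are select
rates = {d: 0.1 * d for d in range(1, 26)}   # placeholder select rates
rates[ULTIMATE_DURATION] = 4.84              # ultimate rate

def lookup_clamped(duration):
    # One branch per lookup instead of a larger pre-expanded table
    return rates[min(duration, ULTIMATE_DURATION)]
```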
## Projecting Multiple Periods
Gaspatchio's vector-based approach works seamlessly with the API:
```python
# Create a policy with projection over multiple durations
policy_projection = pl.DataFrame({
    "policy_id": ["B001"],
    "issue_age": [30],
    "duration": [[1, 2, 3, 4, 5, 25, 26, 50, 100]]  # Mix of regular and overflow
})

af_proj = gs.ActuarialFrame(policy_projection)

# Look up mortality rates for all durations at once
af_proj.mortality_rate = vbt_table.lookup(
    issue_age=af_proj.issue_age,
    duration=af_proj.duration
)

# Explode for visualization
result = af_proj.explode(["duration", "mortality_rate"])
print(result)
```
Result:
```
shape: (9, 4)
┌───────────┬───────────┬──────────┬────────────────┐
│ policy_id ┆ issue_age ┆ duration ┆ mortality_rate │
│ ---       ┆ ---       ┆ ---      ┆ ---            │
│ str       ┆ i64       ┆ i64      ┆ f64            │
╞═══════════╪═══════════╪══════════╪════════════════╡
│ B001      ┆ 30        ┆ 1        ┆ 0.20           │
│ B001      ┆ 30        ┆ 2        ┆ 0.25           │
│ B001      ┆ 30        ┆ 3        ┆ 0.31           │
│ B001      ┆ 30        ┆ 4        ┆ 0.38           │
│ B001      ┆ 30        ┆ 5        ┆ 0.45           │
│ B001      ┆ 30        ┆ 25       ┆ 4.12           │
│ B001      ┆ 30        ┆ 26       ┆ 4.84           │
│ B001      ┆ 30        ┆ 50       ┆ 4.84           │
│ B001      ┆ 30        ┆ 100      ┆ 4.84           │
└───────────┴───────────┴──────────┴────────────────┘
```
## Loading Simple Curves
For 1-dimensional tables (like lapse rates by age), the API is even simpler:
```python
# Load a simple age → lapse rate curve
lapse_table = gs.Table(
    name="lapse_2025",
    source="lapse_curve.csv",
    dimensions={
        "age": "age"  # Simple string shorthand
    },
    value="lapse_rate"
)

# Use it immediately
af.lapse_rate = lapse_table.lookup(age=af.age)
```
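A one-dimensional table reduces to a plain key-to-value mapping applied elementwise. A pure-Python sketch of that idea, with invented lapse rates:

```python
# A 1-D assumption curve is just age -> rate; applying it to a column
# of ages is an elementwise lookup. Rates here are made up.
lapse_curve = {age: 0.05 if age < 60 else 0.02 for age in range(18, 100)}

ages = [30, 45, 65]
lapse_rates = [lapse_curve[age] for age in ages]
```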
## Advanced Features
For more complex scenarios, you have full control with the dimension-based API:
```python
# Multi-dimensional table with selective column loading
mortality_table = gs.Table(
    name="mortality_by_gender",
    source="mortality_m_f.csv",
    dimensions={
        "age": "age",
        "gender": gs.assumptions.MeltDimension(
            columns=["Male", "Female"],
            name="gender"
        )
    },
    value="mortality_rate"
)

# Table with custom overflow limits
salary_table = gs.Table(
    name="salary_scale",
    source="salary_by_service.csv",
    dimensions={
        "grade": "grade",
        "service": gs.assumptions.MeltDimension(
            columns=[str(i) for i in range(1, 21)] + ["20+"],
            name="service",
            overflow=gs.assumptions.ExtendOverflow("20+", to_value=50)
        )
    },
    value="scale_factor"
)

# Using computed dimensions
complex_table = gs.Table(
    name="complex_assumptions",
    source=df,
    dimensions={
        "issue_age": "issue_age",
        "policy_year": "policy_year",
        "attained_age": gs.assumptions.ComputedDimension(
            pl.col("issue_age") + pl.col("policy_year") - 1,
            "attained_age"
        )
    },
    value="assumption_value"
)
```
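The ComputedDimension above encodes the standard actuarial identity relating the three keys; a quick check of the arithmetic in plain Python:

```python
def attained_age(issue_age, policy_year):
    # Policy year 1 is the issue year, so no ageing has occurred yet
    return issue_age + policy_year - 1
```

For an age-30 issue, policy year 26 gives attained age 55, matching the "Ultimate" row in the sample table at the top of this guide.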
## Using the TableBuilder Pattern
For step-by-step table construction, use the fluent TableBuilder API:
```python
# Build a complex mortality table
mortality_table = (
    gs.TableBuilder("mortality_select_ultimate")
    .from_source("mortality_su.csv")
    .with_data_dimension("issue_age", "IssueAge")
    .with_data_dimension("gender", "Gender")
    .with_melt_dimension(
        "duration",
        columns=[f"Dur{i}" for i in range(1, 16)] + ["Ultimate"],
        overflow=gs.assumptions.ExtendOverflow("Ultimate", to_value=100),
        fill=gs.assumptions.LinearInterpolate()  # Interpolate any gaps
    )
    .with_value_column("qx_rate")
    .build()
)

# The table is ready for lookups
af.mortality_rate = mortality_table.lookup(
    issue_age=af.age,
    gender=af.sex,
    duration=af.policy_duration
)
```
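The fluent chaining works because each `with_*` call records configuration and returns the builder itself. A toy illustration of the pattern, with hypothetical names rather than the real TableBuilder internals:

```python
class CurveBuilder:
    """Toy fluent builder: each with_* method returns self for chaining."""

    def __init__(self, name):
        self.name = name
        self.dimensions = {}
        self.value_column = None

    def with_dimension(self, model_name, source_column):
        self.dimensions[model_name] = source_column
        return self  # returning self is what enables chaining

    def with_value_column(self, name):
        self.value_column = name
        return self

    def build(self):
        # Freeze the recorded configuration into a plain result
        return {"name": self.name, "dimensions": self.dimensions,
                "value": self.value_column}

table = (
    CurveBuilder("demo")
    .with_dimension("age", "Age")
    .with_value_column("rate")
    .build()
)
```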
## Metadata and Table Discovery
Tables can include metadata for documentation and discovery:
```python
# Duration columns for the melt (select durations plus "Ultimate")
duration_columns = [str(i) for i in range(1, 26)] + ["Ultimate"]

# Create table with rich metadata
vbt_table = gs.Table(
    name="vbt_2015_complete",
    source="vbt_2015_all.csv",
    dimensions={
        "age": "Age",
        "gender": "Gender",
        "smoking": "Smoker",
        "duration": gs.assumptions.MeltDimension(
            columns=duration_columns,
            name="duration",
            overflow=gs.assumptions.ExtendOverflow("Ultimate", to_value=120)
        )
    },
    value="mortality_rate",
    metadata={
        "source": "2015 Valuation Basic Table",
        "basis": "ANB",
        "version": "2015",
        "effective_date": "2015-01-01",
        "description": "Industry standard mortality table",
        "tags": ["mortality", "vbt", "2015", "standard"]
    }
)

# Discover tables
all_tables = gs.list_tables()
print(f"Available tables: {all_tables}")

# Get metadata for a specific table
metadata = gs.get_table_metadata("vbt_2015_complete")
print(f"Table metadata: {metadata}")

# List all tables with metadata
tables_info = gs.list_tables_with_metadata()
for name, meta in tables_info.items():
    print(f"{name}: {meta.get('description', 'No description')}")
```
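Metadata also makes tables filterable, for example by tag. A small sketch of what such a lookup looks like, with a plain dict standing in for Gaspatchio's registry:

```python
# Toy registry: name -> metadata, standing in for the real table registry
registry = {
    "vbt_2015_complete": {"tags": ["mortality", "vbt"], "basis": "ANB"},
    "lapse_2025": {"tags": ["lapse"]},
}

def tables_tagged(tag):
    # Filter registered tables by a metadata tag
    return sorted(name for name, meta in registry.items()
                  if tag in meta.get("tags", []))
```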