Assumptions API¶
Table¶
gaspatchio_core.assumptions._api.Table
¶
Main assumption table class with dimension-based structure.
This class provides a clean API for creating assumption tables using composable dimension types and strategies, replacing the old monolithic load_assumptions() function.
dimensions
property
¶
Get dimension configuration (returns a copy).
Returns the dimension configuration used to structure this assumption table, providing access to dimension types, processing strategies, and validation rules for model analysis and debugging.
When to use
- Model Analysis: Inspect dimension configuration to understand table structure and lookup requirements.
- Dynamic Lookups: Build lookup calls programmatically based on available dimensions and their configurations.
- Validation: Check dimension compatibility when extending tables or building complex lookup expressions.
Returns:

| Type | Description |
|---|---|
| dict[str, Dimension] | Copy of dimension configuration |
Examples:¶
from gaspatchio_core.assumptions import Table
import polars as pl
data = pl.DataFrame({"age": [30, 35], "rate": [0.001, 0.002]})
table = Table("test", data, {"age": "age"}, "rate")
dims = table.dimensions
print(f"Dimensions: {list(dims.keys())}")
metadata
property
¶
Get metadata for this table.
Returns stored metadata for this assumption table including descriptions, data sources, validation status, and business context that was provided during table creation.
When to use
- Documentation: Access table metadata for automated documentation generation and model reporting.
- Governance: Retrieve data lineage, validation status, and review information for compliance reporting.
- Model Management: Check table metadata for version control, effective dates, and change management.
Returns:

| Type | Description |
|---|---|
| dict[str, Any] \| None | Copy of metadata if available, None otherwise |
Examples:¶
from gaspatchio_core.assumptions import Table
import polars as pl
data = pl.DataFrame({"age": [30, 35], "rate": [0.001, 0.002]})
table = Table(
"test", data, {"age": "age"}, "rate", metadata={"source": "2023 Study"}
)
meta = table.metadata
print(f"Source: {meta['source'] if meta else 'None'}")
schema
property
¶
Get the analyzed schema of this table.
Returns comprehensive schema information about the assumption table including column types, value ranges, and structural metadata useful for validation, debugging, and documentation generation.
When to use
- Data Validation: Check table schema before model execution to ensure data types and ranges meet model requirements.
- Debugging: Inspect table structure when troubleshooting lookup failures or data quality issues.
- Documentation: Generate technical documentation showing table structure and data characteristics.
Returns:

| Type | Description |
|---|---|
| TableSchema | Analyzed schema with column information |
Examples:¶
from gaspatchio_core.assumptions import Table
import polars as pl
data = pl.DataFrame({"age": [30, 35, 40], "rate": [0.001, 0.002, 0.004]})
table = Table("test", data, {"age": "age"}, "rate")
schema = table.schema
print(f"Columns: {len(schema.columns)}")
storage_mode
property
¶
Get the actual storage mode used by this table.
Returns the storage backend actually being used for lookups, which may differ from the requested mode when using "auto". This is useful for verifying that array storage was selected for dense tables.
When to use
- Performance Verification: Check if "auto" mode selected array storage (35x faster) or fell back to hash storage.
- Debugging: Verify storage mode when troubleshooting lookup performance issues.
- Logging: Record actual storage mode for model run diagnostics.
Returns:

| Type | Description |
|---|---|
| str | The actual storage mode: "hash" or "array" |
Examples:¶
from gaspatchio_core.assumptions import Table
import polars as pl
# Dense table - should use array storage
data = pl.DataFrame(
{
"age": list(range(18, 101)), # 83 ages
"rate": [0.001 * (1 + a / 100) for a in range(18, 101)],
}
)
table = Table(
name="mortality_auto_test",
source=data,
dimensions={"age": "age"},
value="rate",
storage_mode="auto", # Let Rust decide
)
print(f"Requested: auto, Actual: {table.storage_mode}")
# Output: Requested: auto, Actual: array
describe()
¶
Get a human-readable description of the table.
Returns a formatted string describing the table structure, including row count, column information, and dimension configuration. Useful for debugging, documentation, and model analysis.
When to use
- Debugging: Get quick overview of table structure when troubleshooting lookup issues or data problems.
- Documentation: Generate summary information for model documentation and technical specifications.
- Model Analysis: Review table characteristics during model development and validation processes.
Returns:

| Type | Description |
|---|---|
| str | Human-readable description of the table |
Examples:¶
from gaspatchio_core.assumptions import Table
import polars as pl
data = pl.DataFrame({"age": [30, 35, 40], "rate": [0.001, 0.002, 0.004]})
table = Table("mortality", data, {"age": "age"}, "rate")
print(table.describe())
dimension_values(dimension)
¶
Get unique values for a specific dimension.
Returns a list of all unique values found in the specified dimension column of the assumption table. Useful for understanding the range of lookup keys available and for validation of lookup arguments.
When to use
- Data Validation: Check available dimension values before performing lookups to ensure valid lookup keys.
- Model Analysis: Examine the range of ages, durations, or product types covered by assumption tables.
- Dynamic UI: Build dropdown lists or selection interfaces showing available lookup values for assumption tables.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| dimension | str | Name of the dimension to get values for | required |

Returns:

| Type | Description |
|---|---|
| list[Any] | List of unique values in the dimension |
Examples:¶
from gaspatchio_core.assumptions import Table
import polars as pl
data = pl.DataFrame(
{"age": [30, 35, 40, 30, 35], "rate": [0.001, 0.002, 0.004, 0.001, 0.002]}
)
table = Table("test", data, {"age": "age"}, "rate")
ages = table.dimension_values("age")
print(f"Available ages: {sorted(ages)}")
extend(source, dimensions=None, validate=True)
¶
Extend table with additional data slices.
Appends additional data to an existing assumption table, allowing for incremental loading of assumption data from multiple sources or files. The new data undergoes the same dimension processing as the original table and becomes immediately available for lookups in model calculations.
When to use
- Incremental Loading: Add new assumption data slices from multiple files or data sources to build comprehensive tables.
- Time-Based Updates: Append new vintage data to existing assumption tables for model updates and refreshes.
- Multi-Source Integration: Combine assumption data from different systems, departments, or external providers.
- Scenario Analysis: Add alternative assumption sets to existing tables for stress testing and scenario modeling.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| source | str \| Path \| DataFrame | Additional data to add | required |
| dimensions | dict[str, Dimension] \| None | Dimension overrides for this slice | None |
| validate | bool | Whether to validate compatibility | True |

Returns:

| Type | Description |
|---|---|
| Table | Self for chaining |
Examples:¶
Scalar Example: Extending Mortality Table
from gaspatchio_core.assumptions import Table
import polars as pl
# Create initial mortality table
initial_data = pl.DataFrame(
{"age": [30, 35, 40], "mortality_rate": [0.001, 0.002, 0.004]}
)
mortality_table = Table(
name="mortality_extended",
source=initial_data,
dimensions={"age": "age"},
value="mortality_rate",
)
print(f"Initial rows: {len(mortality_table.to_dataframe())}")
# Extend with additional age bands
additional_data = pl.DataFrame(
{"age": [45, 50, 55], "mortality_rate": [0.008, 0.015, 0.025]}
)
mortality_table.extend(source=additional_data)
print(f"After extension: {len(mortality_table.to_dataframe())}")
print(mortality_table.to_dataframe().sort("age"))
Initial rows: 3
After extension: 6
shape: (6, 2)
┌──────┬────────────────┐
│ age ┆ mortality_rate │
│ --- ┆ --- │
│ f64 ┆ f64 │
╞══════╪════════════════╡
│ 30.0 ┆ 0.001 │
│ 35.0 ┆ 0.002 │
│ 40.0 ┆ 0.004 │
│ 45.0 ┆ 0.008 │
│ 50.0 ┆ 0.015 │
│ 55.0 ┆ 0.025 │
└──────┴────────────────┘
Vector Example: Incremental Mortality Table Extension
from gaspatchio_core.assumptions import Table
import polars as pl
# Create base mortality table
base_data = pl.DataFrame({
"age": [30, 35, 40],
"rate": [0.001, 0.002, 0.004]
})
table = Table(
name="mortality_extended",
source=base_data,
dimensions={"age": "age"},
value="rate"
)
print("Initial rows:", len(table.to_dataframe()))
# Extend with additional ages
additional_data = pl.DataFrame({
"age": [45, 50],
"rate": [0.008, 0.015]
})
table.extend(source=additional_data)
print("After extension:", len(table.to_dataframe()))
Initial rows: 3
After extension: 5
from_scenario_files(scenario_files, scenario_column, dimensions, value, name=None, validate=True, metadata=None)
classmethod
¶
Create a Table by concatenating per-scenario assumption files.
Loads each file, adds scenario_column with the scenario ID, concatenates all into a single DataFrame, and creates a Table with scenario_column as an additional dimension.
This is useful when assumptions are stored as separate files per scenario (e.g., from an ESG tool that outputs per-scenario returns).
When to use
- ESG Integration: Load per-scenario returns or yield curves from economic scenario generator outputs stored as separate files.
- Stress Testing: Combine base, stressed, and adverse scenario assumption files into a single Table for multi-scenario runs.
- Regulatory Scenarios: Load prescribed regulatory scenarios (e.g., IFRS17, Solvency II) from separate assumption files.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| scenario_files | dict[str, str \| Path] | Mapping of scenario_id -> file path | required |
| scenario_column | str | Name for the scenario ID column | required |
| dimensions | dict[str, str \| Dimension] | Dimension mapping (excluding scenario, which is added automatically) | required |
| value | str | Value column name | required |
| name | str \| None | Optional table name (defaults to "from_scenarios") | None |
| validate | bool | Whether to validate data on load | True |
| metadata | dict[str, Any] \| None | Optional metadata dictionary | None |

Returns:

| Type | Description |
|---|---|
| Table | Table with scenario_column added to dimensions |
Examples:
Loading per-scenario rate files:
```python no_output_check
from gaspatchio_core.assumptions import Table

rates_table = Table.from_scenario_files(
    scenario_files={
        "BASE": "scenarios/BASE/rates.parquet",
        "UP": "scenarios/UP/rates.parquet",
        "DOWN": "scenarios/DOWN/rates.parquet",
    },
    scenario_column="scenario_id",
    dimensions={"year": "year"},
    value="forward_rate",
    name="discount_rates",
)
```
from_scenario_template(path_template, scenario_ids, scenario_column, dimensions, value, name=None, validate=True, metadata=None)
classmethod
¶
Create a Table from scenario files matching a path template.
Convenience method when scenario files follow a predictable naming pattern. Expands the template with each scenario ID and delegates to from_scenario_files().
When to use
- Templated Paths: When scenario files follow a naming convention like scenarios/{scenario_id}/rates.parquet or similar patterns.
- Stochastic Scenarios: For thousands of numbered scenarios where manually specifying each path would be impractical.
- Convention over Configuration: When file organization follows a predictable directory structure per scenario.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| path_template | str | Path with {scenario_id} placeholder | required |
| scenario_ids | list[str] \| list[int] | List of scenario IDs to load | required |
| scenario_column | str | Name for the scenario ID column | required |
| dimensions | dict[str, str \| Dimension] | Dimension mapping (excluding scenario) | required |
| value | str | Value column name | required |
| name | str \| None | Optional table name | None |
| validate | bool | Whether to validate data on load | True |
| metadata | dict[str, Any] \| None | Optional metadata dictionary | None |

Returns:

| Type | Description |
|---|---|
| Table | Table with scenario_column added to dimensions |
Examples:
Loading files from templated paths:
```python no_output_check
from gaspatchio_core.assumptions import Table

# Files: scenarios/BASE/returns.parquet, scenarios/UP/returns.parquet, ...
returns_table = Table.from_scenario_template(
    path_template="scenarios/{scenario_id}/returns.parquet",
    scenario_ids=["BASE", "UP", "DOWN"],
    scenario_column="scenario_id",
    dimensions={"t": "t"},
    value="inv_return_mth",
)
```
from_shocks(base_table, shocks, value_column)
classmethod
¶
Create multiple shocked tables from a base table and shock specifications.
Takes a base assumption table and a dictionary mapping scenario IDs to lists of shocks. Returns a dictionary of Tables, one for each scenario, with the appropriate shocks applied.
When to use
- Sensitivity Analysis: When you need to create multiple shocked versions of an assumption table for parameter sweeps.
- Ad-hoc Scenarios: When scenario shocks are defined programmatically rather than loaded from files.
- Integration with sensitivity_analysis(): The output from sensitivity_analysis() can be passed directly to this method.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| base_table | Table | The original assumption table to apply shocks to | required |
| shocks | dict[str, list[Shock]] | Mapping of scenario ID to list of shocks to apply | required |
| value_column | str | The column to apply shocks to | required |

Returns:

| Type | Description |
|---|---|
| dict[str, Table] | Dictionary mapping scenario IDs to shocked Table instances |

Raises:

| Type | Description |
|---|---|
| ValueError | If value_column doesn't exist in the base table |
Examples:
Create stressed mortality tables:
```python no_output_check
from gaspatchio_core.assumptions import Table
from gaspatchio_core.scenarios.shocks import MultiplicativeShock

base_mortality = Table(...)  # Load base mortality table

shocks = {
    "BASE": [],
    "UP": [MultiplicativeShock(factor=1.2)],
    "DOWN": [MultiplicativeShock(factor=0.8)],
}

tables = Table.from_shocks(base_mortality, shocks, value_column="qx")
# tables["BASE"], tables["UP"], tables["DOWN"] are all Table instances
```
Integration with sensitivity_analysis():
```python no_output_check
from gaspatchio_core.assumptions import Table
from gaspatchio_core.scenarios import sensitivity_analysis
import polars as pl
# Create a base table
base_df = pl.DataFrame({"age": [30, 40], "rate": [0.01, 0.02]})
base_table = Table("mortality", base_df, {"age": "age"}, "rate")
shocks = sensitivity_analysis(
table="mortality",
shock_type="multiplicative",
values=[0.9, 1.0, 1.1],
)
tables = Table.from_shocks(base_table, shocks, value_column="rate")
```
lookup(_dimensions=None, **kwargs)
¶
Create a lookup expression using dimension names.
Generates a high-performance lookup expression that retrieves assumption values from the registered table based on provided dimension keys. The lookup is optimized for vectorized operations and integrates seamlessly with ActuarialFrame workflows for efficient model projections and calculations.
When to use
- Model Projections: Retrieve mortality, lapse, expense, or interest rates during actuarial model calculations and cash flow projections.
- Dynamic Lookups: Perform lookups where dimension values come from model point data or intermediate calculation results.
- Multi-Dimensional Tables: Look up values from tables with multiple dimensions like age, duration, product type, and risk class.
- Vectorized Operations: Execute efficient batch lookups across thousands or millions of policies in model projections.
Can be called in three ways:

1. With keyword arguments for clean dimension names: table.lookup(age=af["age"], duration=af["duration"])
2. With a dictionary for dimension names with spaces or special characters: table.lookup({"policy duration": af["policy_duration_as_int"]})
3. Or both combined: table.lookup({"policy duration": af["duration"]}, age=af["age"])
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| _dimensions | dict[str, str \| Expr \| ColumnProxy] \| None | Optional dictionary mapping dimension names to columns/expressions | None |
| **kwargs | str \| Expr \| ColumnProxy | Dimension name to column/expression mapping | {} |

Returns:

| Type | Description |
|---|---|
| Expr | Polars expression for the lookup |
Examples:¶
Scalar Example: Simple Mortality Lookup
from gaspatchio_core.assumptions import Table
from gaspatchio_core import ActuarialFrame
import polars as pl
# Create mortality table
mortality_data = pl.DataFrame(
{
"age": [30, 35, 40, 45, 50],
"mortality_rate": [0.001, 0.002, 0.004, 0.008, 0.015],
}
)
mortality_table = Table(
name="mortality_std",
source=mortality_data,
dimensions={"age": "age"},
value="mortality_rate",
)
# Create model data and perform lookup
model_data = {
"policy_id": ["P001", "P002", "P003"],
"current_age": [35, 40, 50],
}
af = ActuarialFrame(model_data)
# Lookup mortality rates
af.mortality_rate = mortality_table.lookup(age=af.current_age)
print(af.collect())
shape: (3, 3)
┌───────────┬─────────────┬────────────────┐
│ policy_id ┆ current_age ┆ mortality_rate │
│ --- ┆ --- ┆ --- │
│ str ┆ i64 ┆ f64 │
╞═══════════╪═════════════╪════════════════╡
│ P001 ┆ 35 ┆ 0.002 │
│ P002 ┆ 40 ┆ 0.004 │
│ P003 ┆ 50 ┆ 0.015 │
└───────────┴─────────────┴────────────────┘
Vector Example: Multi-Dimensional Lapse Lookup
from gaspatchio_core.assumptions import Table
from gaspatchio_core import ActuarialFrame
import polars as pl
# Create multi-dimensional lapse table
lapse_data = pl.DataFrame({
"duration": [1, 1, 2, 2, 3, 3],
"product_type": ["TERM", "WL", "TERM", "WL", "TERM", "WL"],
"lapse_rate": [0.05, 0.03, 0.08, 0.05, 0.12, 0.07]
})
lapse_table = Table(
name="lapse_rates",
source=lapse_data,
dimensions={"duration": "duration", "product_type": "product_type"},
value="lapse_rate"
)
# Create model points with policy data
model_points = {
"policy_id": ["P001", "P002", "P003", "P004"],
"product_code": ["TERM", "WL", "TERM", "WL"],
"policy_year": [1, 2, 3, 1]
}
af = ActuarialFrame(model_points)
# Lookup lapse rates using multiple dimensions
af.lapse_rate = lapse_table.lookup(
duration=af.policy_year,
product_type=af.product_code
)
print(af.collect())
shape: (4, 4)
┌───────────┬──────────────┬─────────────┬────────────┐
│ policy_id ┆ product_code ┆ policy_year ┆ lapse_rate │
│ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ str ┆ i64 ┆ f64 │
╞═══════════╪══════════════╪═════════════╪════════════╡
│ P001 ┆ TERM ┆ 1 ┆ 0.05 │
│ P002 ┆ WL ┆ 2 ┆ 0.05 │
│ P003 ┆ TERM ┆ 3 ┆ 0.12 │
│ P004 ┆ WL ┆ 1 ┆ 0.03 │
└───────────┴──────────────┴─────────────┴────────────┘
to_dataframe()
¶
Export the complete table as a DataFrame.
Returns the complete processed assumption table as a Polars DataFrame, including all key columns and the value column after dimension processing. Useful for data inspection, validation, and integration with external systems.
When to use
- Data Inspection: Export table data for validation, quality checks, and manual review of assumption values.
- Integration: Export assumption data for use in external systems, reporting tools, or alternative calculation engines.
- Debugging: Examine processed table structure and data after dimension transformations and validation.
Returns:

| Type | Description |
|---|---|
| DataFrame | Complete table with all processed data |
Examples:¶
from gaspatchio_core.assumptions import Table
import polars as pl
data = pl.DataFrame({"age": [30, 35], "rate": [0.001, 0.002]})
table = Table("test", data, {"age": "age"}, "rate")
df = table.to_dataframe()
print(f"Exported {len(df)} rows")
validate_lookup(_dimensions=None, **kwargs)
¶
Validate a lookup configuration without executing.
Checks that a lookup configuration provides all required dimensions and that dimension names match the table's configuration. Useful for validating lookup calls before execution and catching errors early.
When to use
- Error Prevention: Validate lookup configurations before execution to catch missing or invalid dimensions early.
- Dynamic Validation: Check programmatically generated lookup calls for correctness in complex model workflows.
- Testing: Validate lookup configurations in unit tests without executing expensive lookup operations.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| _dimensions | dict[str, str \| Expr \| ColumnProxy] \| None | Optional dictionary mapping dimension names to columns/expressions | None |
| **kwargs | str \| Expr \| ColumnProxy | Dimension name to column/expression mapping | {} |

Raises:

| Type | Description |
|---|---|
| ValueError | If dimension configuration is invalid |
Examples:¶
from gaspatchio_core.assumptions import Table
import polars as pl
data = pl.DataFrame({"age": [30, 35], "rate": [0.001, 0.002]})
table = Table("test", data, {"age": "age"}, "rate")
# Valid lookup - no error
table.validate_lookup(age="current_age")
# Invalid lookup - raises ValueError
try:
table.validate_lookup(invalid_dim="some_col")
except ValueError as e:
print(f"Validation error: {e}")
with_shock(shock, name=None)
¶
Apply a shock to create a modified copy of this table.
Creates a new Table with the shock applied to the value column. The original table is unchanged. This enables scenario analysis by creating stressed versions of assumption tables.
When to use
- Stress Testing: Create stressed assumption tables for regulatory capital calculations and risk analysis.
- Sensitivity Analysis: Generate tables with parameter variations to understand model sensitivity to assumptions.
- Ad-hoc Scenarios: Create one-off shocked tables without needing to load separate scenario files.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| shock | Shock | Shock specification to apply (Multiplicative, Additive, or Override) | required |
| name | str \| None | Optional name for the shocked table (defaults to original_shocked) | None |

Returns:

| Type | Description |
|---|---|
| Table | New Table with shocked values |
Examples:¶
Stress testing mortality:
```python no_output_check
from gaspatchio_core.assumptions import Table
from gaspatchio_core.scenarios.shocks import MultiplicativeShock
import polars as pl

mortality_data = pl.DataFrame({"age": [30, 40], "qx": [0.001, 0.002]})
mortality = Table("mortality", mortality_data, {"age": "age"}, "qx")

# Create 20% stressed version
shocked = mortality.with_shock(MultiplicativeShock(factor=1.2))
```

**Adding basis points to rates:**
```python no_output_check
from gaspatchio_core.assumptions import Table
from gaspatchio_core.scenarios.shocks import AdditiveShock
import polars as pl
rates_data = pl.DataFrame({"term": [1, 2], "rate": [0.05, 0.06]})
rates_table = Table("rates", rates_data, {"term": "term"}, "rate")
# Add 50bps to discount rates
stressed_rates = rates_table.with_shock(AdditiveShock(delta=0.005))
```
TableBuilder¶
gaspatchio_core.assumptions._builder.TableBuilder
¶
Fluent builder for complex table configurations.
build()
¶
Build the Table object from the configured builder.
Returns:

| Type | Description |
|---|---|
| Table | Configured Table instance |

Raises:

| Type | Description |
|---|---|
| ValueError | If source is not set or no dimensions are configured |
copy()
¶
Create a copy of this builder.
Returns:

| Type | Description |
|---|---|
| TableBuilder | New TableBuilder with same configuration |
from_source(source)
¶
Set the data source for the table.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| source | str \| Path \| DataFrame | Data source (file path or DataFrame) | required |

Returns:

| Type | Description |
|---|---|
| TableBuilder | Self for chaining |
reset()
¶
Reset the builder to initial state (keeping only the name).
Returns:

| Type | Description |
|---|---|
| TableBuilder | Self for chaining |
with_categorical_dimension(name, value, dimension_name=None)
¶
Add a categorical dimension with a constant value.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Dimension name | required |
| value | Any | Constant value for this dimension | required |
| dimension_name | str \| None | Optional custom name for the dimension column | None |

Returns:

| Type | Description |
|---|---|
| TableBuilder | Self for chaining |
with_computed_dimension(name, expression, alias=None)
¶
Add a computed dimension from an expression.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Dimension name | required |
| expression | Expr | Polars expression to compute the dimension | required |
| alias | str \| None | Optional alias for the computed column | None |

Returns:

| Type | Description |
|---|---|
| TableBuilder | Self for chaining |
with_data_dimension(name, column, rename_to=None, dtype=None)
¶
Add a data dimension that maps directly from a column.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Dimension name | required |
| column | str | Source column name | required |
| rename_to | str \| None | Optional rename for the dimension | None |
| dtype | DataType \| None | Optional data type conversion | None |

Returns:

| Type | Description |
|---|---|
| TableBuilder | Self for chaining |
with_dimension(name, dimension)
¶
Add a pre-configured dimension object.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Dimension name | required |
| dimension | Dimension | Dimension object | required |

Returns:

| Type | Description |
|---|---|
| TableBuilder | Self for chaining |
with_melt_dimension(name, columns, overflow=None, fill=None)
¶
Add a melt dimension that transforms wide columns to long format.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Dimension name | required |
| columns | list[str] | List of columns to melt | required |
| overflow | Any \| None | Optional overflow strategy | None |
| fill | Any \| None | Optional fill strategy | None |

Returns:

| Type | Description |
|---|---|
| TableBuilder | Self for chaining |
with_value_column(name)
¶
Set the name of the value column.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Value column name | required |

Returns:

| Type | Description |
|---|---|
| TableBuilder | Self for chaining |
Registry Functions¶
list_tables¶
List all registered assumption tables.
Retrieves the names of all assumption tables that have been registered with the framework. Essential for model inventory management, debugging lookup failures, and ensuring all required tables are available before running actuarial projections or model validations.
When to use
- Model Validation: Check that all required assumption tables are loaded before starting model calculations or projections.
- Debugging: Troubleshoot lookup failures by verifying table registration status and identifying missing tables.
- Model Inventory: Generate reports of available assumption tables for model documentation and governance processes.
- Dynamic Configuration: Build dynamic model configurations that adapt based on available assumption tables.
Returns:

| Type | Description |
|---|---|
| list[str] | List of table names that have been registered |
Examples:¶
Scalar Example: Basic Table Listing
from gaspatchio_core.assumptions import Table, list_tables
import polars as pl
# Register some assumption tables
mortality_data = pl.DataFrame(
{"age": [30, 40, 50], "mortality_rate": [0.001, 0.004, 0.015]}
)
lapse_data = pl.DataFrame({"duration": [1, 2, 3], "lapse_rate": [0.05, 0.08, 0.12]})
Table(
name="mortality_list_ex",
source=mortality_data,
dimensions={"age": "age"},
value="mortality_rate",
)
Table(
name="lapse_list_ex",
source=lapse_data,
dimensions={"duration": "duration"},
value="lapse_rate",
)
# Check that tables were registered
tables = list_tables()
print("mortality_list_ex registered:", "mortality_list_ex" in tables)
print("lapse_list_ex registered:", "lapse_list_ex" in tables)
mortality_list_ex registered: True
lapse_list_ex registered: True
Vector Example: Model Validation Workflow
from gaspatchio_core.assumptions import Table, list_tables
import polars as pl
# Define required tables for a term life model
required_tables = [
"mortality_validation_ex",
"lapse_validation_ex",
"expense_validation_ex",
"interest_validation_ex"
]
# Register some tables (simulating partial loading)
mortality_data = pl.DataFrame({
"age": [25, 30, 35, 40],
"rate": [0.0008, 0.001, 0.0015, 0.0025]
})
lapse_data = pl.DataFrame({
"duration": [1, 2, 3, 4],
"rate": [0.05, 0.08, 0.10, 0.12]
})
Table(name="mortality_validation_ex", source=mortality_data,
dimensions={"age": "age"}, value="rate")
Table(name="lapse_validation_ex", source=lapse_data,
dimensions={"duration": "duration"}, value="rate")
# Validate model readiness
available_tables = list_tables()
missing_tables = [table for table in required_tables
if table not in available_tables]
print("Loaded tables:", ["mortality_validation_ex", "lapse_validation_ex"])
print("Missing tables:", missing_tables)
print(f"⚠️ Model not ready - missing {len(missing_tables)} tables")
Loaded tables: ['mortality_validation_ex', 'lapse_validation_ex']
Missing tables: ['expense_validation_ex', 'interest_validation_ex']
⚠️ Model not ready - missing 2 tables
list_tables_with_metadata¶
List all assumption tables that have metadata stored.
Returns a dictionary mapping table names to their stored metadata for all tables that were registered with metadata. Useful for generating comprehensive model documentation, conducting data lineage analysis, and ensuring proper governance over assumption tables used in actuarial models.
When to use
- Documentation Generation: Create comprehensive model documentation showing all assumption tables with their descriptions and sources.
- Governance Reporting: Generate reports for regulatory compliance showing data lineage, validation status, and review dates.
- Quality Assurance: Identify tables missing critical metadata like effective dates, validation status, or business descriptions.
- Model Inventory: Maintain centralized inventory of all assumption tables with their business context and technical specifications.
Returns:

| Type | Description |
|---|---|
| dict[str, dict[str, Any]] | Dictionary mapping table names to their metadata |
Examples:¶
Scalar Example: Basic Metadata Listing
from gaspatchio_core.assumptions import Table, list_tables_with_metadata
import polars as pl
# Register tables with rich metadata
mortality_data = pl.DataFrame({"age": [30, 40, 50], "rate": [0.001, 0.004, 0.015]})
Table(
name="mortality_meta_ex1",
source=mortality_data,
dimensions={"age": "age"},
value="rate",
metadata={
"description": "Base mortality rates for healthy lives",
"source": "Company Experience Study 2023",
"effective_date": "2024-01-01",
},
)
# Check table has metadata
tables_metadata = list_tables_with_metadata()
print(f"mortality_meta_ex1 has metadata: {'mortality_meta_ex1' in tables_metadata}")
mortality_meta_ex1 has metadata: True
Vector Example: Model Documentation Report
from gaspatchio_core.assumptions import Table, list_tables_with_metadata
import polars as pl
# Register multiple tables with metadata
mortality_df = pl.DataFrame({
"age": [30, 35, 40],
"rate": [0.001, 0.002, 0.004]
})
lapse_df = pl.DataFrame({
"duration": [1, 2, 3],
"rate": [0.05, 0.08, 0.12]
})
# Create tables with metadata
Table(
name="mortality_meta_ex2",
source=mortality_df,
dimensions={"age": "age"},
value="rate",
metadata={"source": "2017 CSO", "version": "v2.1"}
)
Table(
name="lapse_meta_ex2",
source=lapse_df,
dimensions={"duration": "duration"},
value="rate",
metadata={"source": "Company Study", "quality": "High"}
)
# Check tables with metadata
tables_meta = list_tables_with_metadata()
has_mortality = "mortality_meta_ex2" in tables_meta
has_lapse = "lapse_meta_ex2" in tables_meta
print("Found 2 tables with metadata")
print(f"mortality_meta_ex2 registered: {has_mortality}")
print(f"lapse_meta_ex2 registered: {has_lapse}")
Found 2 tables with metadata
mortality_meta_ex2 registered: True
lapse_meta_ex2 registered: True
get_table_metadata¶
Retrieve metadata for a registered assumption table.
Fetches stored metadata for an assumption table that was registered with the framework. Metadata includes information like table descriptions, data sources, validation rules, effective dates, and business context that actuaries need for model documentation and compliance reporting.
When to use
- Model Documentation: Retrieve table descriptions, sources, and business context for automated model documentation generation.
- Audit Trails: Access metadata for regulatory compliance and audit trails showing table lineage and validation status.
- Data Validation: Check table metadata before performing lookups to ensure data quality and appropriateness for calculations.
- Model Versioning: Track assumption table versions and effective dates for model change management and rollback procedures.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| table_name | str | Name of the table to get metadata for | required |

Returns:

| Type | Description |
|---|---|
| dict[str, Any] \| None | Copy of metadata dictionary if found, None otherwise |
Examples:¶
Scalar Example: Basic Metadata Retrieval
from gaspatchio_core.assumptions import Table, get_table_metadata
import polars as pl
# Create and register a mortality table with metadata
mortality_data = pl.DataFrame(
{
"age": [30, 35, 40, 45, 50],
"mortality_rate": [0.001, 0.002, 0.004, 0.008, 0.015],
}
)
mortality_table = Table(
name="mortality_2023",
source=mortality_data,
dimensions={"age": "age"},
value="mortality_rate",
metadata={
"description": "Standard mortality rates for term life insurance",
"source": "Industry Standard Tables 2023",
"effective_date": "2023-01-01",
"validation_status": "approved",
},
)
# Retrieve metadata
metadata = get_table_metadata("mortality_2023")
print(metadata)
{'description': 'Standard mortality rates for term life insurance', 'source': 'Industry Standard Tables 2023', 'effective_date': '2023-01-01', 'validation_status': 'approved'}
Vector Example: Metadata for Model Documentation
from gaspatchio_core.assumptions import Table, get_table_metadata
import polars as pl
# Create multiple assumption tables with rich metadata
tables_config = [
{
"name": "lapse_rates_term",
"data": pl.DataFrame({
"duration": [1, 2, 3, 4, 5],
"lapse_rate": [0.05, 0.08, 0.12, 0.15, 0.18]
}),
"metadata": {
"description": "Lapse rates for term life products",
"business_unit": "Individual Life",
"last_updated": "2023-12-01",
"data_quality": "high"
}
},
{
"name": "expense_rates",
"data": pl.DataFrame({
"year": [1, 2, 3],
"expense_rate": [150.0, 25.0, 15.0]
}),
"metadata": {
"description": "Annual expense rates per policy",
"currency": "USD",
"inflation_adjusted": True,
"review_frequency": "quarterly"
}
}
]
# Register lapse rates table
Table(
name="lapse_rates_term",
source=tables_config[0]["data"],
dimensions={"duration": "duration"},
value="lapse_rate",
metadata=tables_config[0]["metadata"]
)
# Register expense rates table
Table(
name="expense_rates",
source=tables_config[1]["data"],
dimensions={"year": "year"},
value="expense_rate",
metadata=tables_config[1]["metadata"]
)
# Check metadata count
registered = [
    m
    for m in (get_table_metadata("lapse_rates_term"), get_table_metadata("expense_rates"))
    if m is not None
]
print(f"Registered {len(registered)} tables with metadata")
Registered 2 tables with metadata