Extending the Rust Core

Gaspatchio's performance comes from its Rust core, which uses Polars expressions for vectorized operations. This document explains how to extend the core with custom Rust functions.

Architecture Overview

┌─────────────────────────────────────────────────────────────┐
│  Python Layer (gaspatchio_core)                             │
│  ┌─────────────────────────────────────────────────────┐   │
│  │  Accessor Methods (.excel, .finance, .projection)    │   │
│  │  └─> Call register_plugin_function()                 │   │
│  └─────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│  Polars Plugin System                                       │
│  └─> Routes to compiled Rust function via #[polars_expr]   │
└─────────────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│  Rust Core (gaspatchio-core/core)                          │
│  ┌─────────────────────────────────────────────────────┐   │
│  │  Pure Rust functions operating on Polars Series      │   │
│  └─────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘

Why Rust?

  • Performance: Vectorized operations on millions of rows
  • Memory efficiency: Polars' streaming engine for bounded memory
  • Type safety: Compile-time guarantees
  • Parallelism: Automatic multi-threading via Polars

Example: Excel IRR Function

The Excel IRR() function is implemented as a Rust plugin. Here's how it works:

1. Pure Rust Implementation

// gaspatchio-core/core/src/excel_functions/irr.rs
use polars::prelude::*;

// `IrrKwargs` and the scalar helper `calculate_irr` are not shown here.
pub fn irr(inputs: &[Series], kwargs: &IrrKwargs) -> PolarsResult<Series> {
    // Each row of the input column holds one list of cash flows
    let cash_flows = inputs[0].list()?;
    let guess = kwargs.guess;

    // Newton-Raphson iteration for IRR calculation, run once per row;
    // null rows stay null
    let result: Float64Chunked = cash_flows
        .iter()
        .map(|opt_series| {
            opt_series.map(|s| calculate_irr(&s, guess))
        })
        .collect();

    Ok(result.into_series())
}
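
The `calculate_irr` helper is not shown above. To make the numerical step concrete, here is a minimal sketch of a Newton-Raphson IRR solver, written in Python for brevity. It is not Gaspatchio's implementation: the name `newton_irr`, the tolerance, the iteration cap, and the choice to return `None` on failure are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def newton_irr(
    cash_flows: Sequence[float],
    guess: float = 0.1,
    tol: float = 1e-10,       # assumed convergence tolerance
    max_iter: int = 100,      # assumed iteration cap
) -> Optional[float]:
    """Find a rate r such that sum(cf_t / (1 + r)**t) == 0, or return None."""
    rate = guess
    for _ in range(max_iter):
        if rate <= -1.0:
            # Discounting is undefined at or below -100%
            return None
        try:
            # Net present value at the current rate
            npv = sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))
            # Derivative of the NPV with respect to the rate
            d_npv = sum(
                -t * cf / (1.0 + rate) ** (t + 1)
                for t, cf in enumerate(cash_flows)
            )
        except (OverflowError, ZeroDivisionError):
            # The iteration diverged to a rate that cannot be discounted
            return None
        if d_npv == 0.0:
            return None
        next_rate = rate - npv / d_npv  # Newton-Raphson update
        if abs(next_rate - rate) < tol:
            return next_rate
        rate = next_rate
    return None  # no convergence within max_iter
```

With the first policy's cash flows from the user-facing example in this document, `newton_irr([-1000, 300, 400, 500])` converges to a rate of roughly 8.9%. A series with no sign change, such as `[100, 100]`, has no IRR and yields `None` in this sketch.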

2. PyO3 Binding Layer

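// gaspatchio-core/bindings/python/src/excel.rs
use polars::prelude::*; // Series, PolarsResult
use pyo3_polars::derive::polars_expr;

// Thin wrapper: declares the output dtype to Polars and delegates to the core crate
#[polars_expr(output_type=Float64)]
pub fn irr(inputs: &[Series], kwargs: IrrKwargs) -> PolarsResult<Series> {
    gaspatchio_core_lib::excel_functions::irr(inputs, &kwargs)
}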

3. Python Registration

# gaspatchio_core/accessors/excel_functions/irr.py
import polars as pl
from polars.plugins import register_plugin_function
from gaspatchio_core import _internal

def irr(values: pl.Expr, guess: float = 0.1) -> pl.Expr:
    """Calculate Internal Rate of Return."""
    return register_plugin_function(
        args=[values],
        plugin_path=_internal.LIB,  # Path to compiled Rust library
        function_name="irr",  # Must match the Rust function name
        kwargs={"guess": guess},
        is_elementwise=True,  # Each output row depends only on its own input row
    )
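
The `is_elementwise=True` flag tells Polars that each output row depends only on the matching input row, which permits optimizations such as evaluating the plugin on separate batches. The pure-Python sketch below needs no Polars and is only an illustration of that property; `row_total`, `running_total`, and `apply_in_chunks` are hypothetical names, not part of Gaspatchio or Polars.

```python
from typing import Callable, List

Column = List[List[float]]


def row_total(column: Column) -> List[float]:
    """Element-wise: each output uses only its own row (like a per-row IRR)."""
    return [sum(row) for row in column]


def running_total(column: Column) -> List[float]:
    """Not element-wise: each output depends on all earlier rows."""
    out, acc = [], 0.0
    for row in column:
        acc += sum(row)
        out.append(acc)
    return out


def apply_in_chunks(
    fn: Callable[[Column], List[float]], column: Column, chunk_size: int
) -> List[float]:
    """Evaluate fn on independent slices and concatenate the results."""
    out: List[float] = []
    for start in range(0, len(column), chunk_size):
        out.extend(fn(column[start:start + chunk_size]))
    return out


column = [[1.0, 2.0], [3.0], [4.0, 5.0], [6.0]]

# Chunking does not change an element-wise result...
same = apply_in_chunks(row_total, column, 2) == row_total(column)
# ...but it does change a cumulative one.
differs = apply_in_chunks(running_total, column, 2) != running_total(column)
```

A function that behaves like `running_total` should not be registered as element-wise, because batch boundaries would change its answer.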

4. Accessor Integration

# gaspatchio_core/accessors/excel.py
from .excel_functions.irr import irr as _irr

@register_accessor("excel", kind="column")
class ExcelColumnAccessor(BaseColumnAccessor):

    def irr(self, guess: float = 0.1) -> "ExpressionProxy":
        """Calculate IRR for cash flow lists."""
        expr = _irr(self._get_polars_expr(), guess=guess)
        return ExpressionProxy(expr, self._get_parent_frame())

5. User-Facing API

from gaspatchio_core import ActuarialFrame

af = ActuarialFrame({
    "policy_id": ["P001", "P002"],
    "cash_flows": [[-1000, 300, 400, 500], [-2000, 800, 900, 1000]]
})

# Clean, discoverable API
af.rate_of_return = af.cash_flows.excel.irr(guess=0.1)

Core Functions Available

The Rust core provides these function categories:

| Category | Functions | Used By |
|----------|-----------|---------|
| Vector | fill_series, list_pow, list_clip, list_conditional | Projections, accumulation |
| Excel | irr, yearfrac, days360 | Excel compatibility |
| Finance | pv, fv, pmt, nper | Financial calculations |

When to Use Rust vs Python

| Use Case | Recommendation |
|----------|----------------|
| Element-wise on millions of rows | Rust - orders of magnitude faster |
| Complex actuarial formulas | Rust - performance critical |
| One-off calculations | Python .apply() is fine |
| Prototyping | Python first, optimize to Rust later |
| List operations (projections) | Rust - list_* functions exist for this |

Adding New Core Functions

Adding a new Rust function requires changes in three places:

Step 1: Rust Core Function

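// core/src/polars_functions/my_function.rs
use polars::prelude::*;

// `MyKwargs` (not shown) must derive `serde::Deserialize` so the binding
// layer can decode the keyword arguments sent from Python.
pub fn my_function(inputs: &[Series], kwargs: &MyKwargs) -> PolarsResult<Series> {
    // Implement logic using Polars primitives
    todo!("return the computed Series")
}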

Step 2: PyO3 Binding

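// bindings/python/src/vector.rs (or new file)
use polars::prelude::*;
use pyo3_polars::derive::polars_expr;

// `my_output_type` (not shown) maps the input fields to the output field:
// fn my_output_type(input_fields: &[Field]) -> PolarsResult<Field>
#[polars_expr(output_type_func = my_output_type)]
pub fn my_function(inputs: &[Series], kwargs: MyKwargs) -> PolarsResult<Series> {
    gaspatchio_core_lib::polars_functions::my_function(inputs, &kwargs)
}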

Step 3: Python Registration

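# gaspatchio_core/functions/my_function.py
import polars as pl
from polars.plugins import register_plugin_function
from gaspatchio_core import _internal

def my_function(expr: pl.Expr, **kwargs) -> pl.Expr:
    return register_plugin_function(
        args=[expr],
        plugin_path=_internal.LIB,  # Path to compiled Rust library, as in the IRR example
        function_name="my_function",  # Must match the Rust function name
        kwargs=kwargs,  # Keys must match the fields of the Rust `MyKwargs` struct
    )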
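
Because the wrapper forwards `**kwargs` straight to Rust, where they are decoded into `MyKwargs`, a misspelled key only surfaces when the expression runs. One option is to mirror the struct on the Python side and validate before registering. The sketch below assumes two hypothetical fields, `exponent` and `clip_at_zero`; the real fields are whatever the Rust struct declares, and `build_kwargs` is an illustrative helper, not an existing Gaspatchio function.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class MyKwargs:
    """Python mirror of the Rust `MyKwargs` struct (fields are hypothetical)."""
    exponent: float = 1.0
    clip_at_zero: bool = False


def build_kwargs(**kwargs: Any) -> Dict[str, Any]:
    """Validate keyword names early and fill in defaults.

    Raises TypeError for an unknown key, because the dataclass
    constructor rejects unexpected arguments.
    """
    return asdict(MyKwargs(**kwargs))


# The resulting dict is what would be passed as `kwargs=` when registering.
payload = build_kwargs(exponent=2.0)
```

Here `payload` contains both fields, with `clip_at_zero` filled in from its default, so the Rust side always receives a complete set of arguments.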

Testing Core Functions

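# Run each command from the directory that contains gaspatchio-core/.
# The subshells keep the working directory unchanged between commands.

# Rust unit tests
(cd gaspatchio-core/core && cargo test)

# Rust benchmarks
(cd gaspatchio-core/core && cargo bench)

# Python integration tests
(cd gaspatchio-core/bindings/python && uv run pytest -v)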
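
When a Python prototype is ported to Rust, one useful integration check is property-based rather than value-based: whatever rate an IRR implementation returns should drive the net present value of that row to zero. The sketch below makes no claim about how Gaspatchio's own test suite is organized. `npv`, `failing_rows`, and the bisection stand-in `bisect_irr` are hypothetical names; in a real test the callable passed to `failing_rows` would wrap the Rust-backed function.

```python
from typing import Callable, List, Optional, Sequence


def npv(rate: float, cash_flows: Sequence[float]) -> float:
    """Net present value of cash flows at t = 0, 1, 2, ..."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def failing_rows(
    irr_fn: Callable[[Sequence[float]], Optional[float]],
    rows: Sequence[Sequence[float]],
    tol: float = 1e-6,
) -> List[int]:
    """Return indices of rows whose returned rate does not zero the NPV."""
    bad = []
    for i, flows in enumerate(rows):
        rate = irr_fn(flows)
        if rate is None or abs(npv(rate, flows)) > tol:
            bad.append(i)
    return bad


def bisect_irr(
    cash_flows: Sequence[float], lo: float = -0.99, hi: float = 10.0
) -> Optional[float]:
    """Stand-in implementation under test: bisection on the NPV sign change."""
    f_lo, f_hi = npv(lo, cash_flows), npv(hi, cash_flows)
    if f_lo * f_hi > 0:
        return None  # no sign change inside the bracket
    for _ in range(200):
        mid = (lo + hi) / 2.0
        f_mid = npv(mid, cash_flows)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2.0


rows = [[-1000, 300, 400, 500], [-2000, 800, 900, 1000]]
print(failing_rows(bisect_irr, rows))  # prints []
```

Checking the defining property avoids hard-coding expected rates, so the same helper can validate the prototype and the Rust port against identical inputs.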

Summary

Gaspatchio's Rust core provides:

  • Polars plugin system for registering high-performance functions
  • Type-safe Rust implementation for actuarial calculations
  • Clean Python API via accessors that hide the complexity
  • Extensibility - add new functions following the established pattern

For most users, the Python accessor APIs (.excel, .finance, .projection) are all you need. The Rust core is there for when you need maximum performance or want to contribute new functions to the framework.