Date API¶
Frame-Level Operations¶
gaspatchio_core.accessors.date.DateFrameAccessor
¶
Bases: BaseFrameAccessor
Provides date-related methods applicable to the entire ActuarialFrame.
Accessed via .date on an ActuarialFrame instance,
e.g., af.date.
This accessor allows for complex date manipulations at the frame level, such as generating timelines for projections or adding durations to multiple date columns simultaneously. It integrates with Polars expressions for optimized performance.
add_duration(date_col, duration_str, new_col_name=None)
¶
Adds a duration string (e.g., '1y', '3mo', '-7d') to a date column.
This function uses Polars duration arithmetic to shift dates within the
ActuarialFrame. It creates a new column with the resulting dates when
new_col_name is given. When new_col_name is None and date_col is a string
name, it overwrites that column instead.
When to use
- Date Arithmetic: Use this method to shift dates by a fixed duration, such as calculating a policy anniversary, determining a future maturity date, or finding a past event date. It's particularly useful for batch operations on an entire column of dates.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| date_col | IntoExprColumn | The column containing the dates to add the duration to. | required |
| duration_str | str | The duration string in Polars format (e.g., "1y6mo", "-3d12h"). | required |
| new_col_name | str \| None | The name for the new column containing the resulting dates. If None, modifies the original column (if it's a string name). | None |
Returns:
| Type | Description |
|---|---|
| ActuarialFrame | A new ActuarialFrame with the added/modified column. |
Raises:
| Type | Description |
|---|---|
| ValueError | If date_col is not a valid column/expression or if modification is attempted without providing a string name for date_col. |
| ComputeError | If the duration addition fails (e.g., invalid duration string, incompatible date types). |
Examples:
import datetime
from gaspatchio_core import ActuarialFrame
data = {
"event_date": [datetime.date(2023, 1, 15), datetime.date(2023, 6, 30)],
"term_months": [6, 12],
}
af = ActuarialFrame(data)
af_plus_1y = af.date.add_duration(
af.event_date, "1y", new_col_name="event_plus_1y"
)
print(af_plus_1y.collect())
shape: (2, 3)
┌────────────┬─────────────┬───────────────┐
│ event_date ┆ term_months ┆ event_plus_1y │
│ --- ┆ --- ┆ --- │
│ date ┆ i64 ┆ date │
╞════════════╪═════════════╪═══════════════╡
│ 2023-01-15 ┆ 6 ┆ 2024-01-15 │
│ 2023-06-30 ┆ 12 ┆ 2024-06-30 │
└────────────┴─────────────┴───────────────┘
import datetime
from gaspatchio_core import ActuarialFrame
data = {
"event_date": [datetime.date(2023, 1, 15), datetime.date(2023, 6, 30)],
"term_months": [6, 12]
}
af = ActuarialFrame(data)
af_minus_3m = af.date.add_duration(af.event_date, "-3mo", new_col_name="event_minus_3m")
print(af_minus_3m.collect())
shape: (2, 3)
┌────────────┬─────────────┬────────────────┐
│ event_date ┆ term_months ┆ event_minus_3m │
│ --- ┆ --- ┆ --- │
│ date ┆ i64 ┆ date │
╞════════════╪═════════════╪════════════════╡
│ 2023-01-15 ┆ 6 ┆ 2022-10-15 │
│ 2023-06-30 ┆ 12 ┆ 2023-03-30 │
└────────────┴─────────────┴────────────────┘
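The calendar arithmetic behind the two examples above can be sketched in plain Python. This is an illustration only, not the library's implementation: `add_months` is a hypothetical helper, and clamping a non-existent day (such as the 31st) to the month's last day is an assumption about how month-end dates should behave.

```python
import calendar
import datetime


def add_months(d: datetime.date, months: int) -> datetime.date:
    """Shift a date by whole calendar months (negative values shift backwards).

    Assumption: a day that does not exist in the target month (e.g. the 31st)
    is clamped to that month's last day.
    """
    # Use a zero-based month index so divmod handles year rollover in both directions.
    month_index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(month_index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return d.replace(year=year, month=month, day=min(d.day, last_day))


event_dates = [datetime.date(2023, 1, 15), datetime.date(2023, 6, 30)]
plus_1y = [add_months(d, 12) for d in event_dates]   # same shift as "1y"
minus_3m = [add_months(d, -3) for d in event_dates]  # same shift as "-3mo"
print(plus_1y)
print(minus_3m)
```

For the two sample dates this gives the same values as the event_plus_1y and event_minus_3m columns shown above.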
create_timeline(start_col, end_col, freq='1d', new_col_name='timeline_date', closed='left')
¶
Creates timeline columns based on start and end dates.
Generates a list of dates for each row based on its start and end date, using the specified frequency. The result is exploded to create a longer DataFrame where each original row is repeated for each date in its timeline.
When to use
- Period-to-Event Transformation: This method is useful when you need to transform row-per-period data (where each row has a start and end date) into row-per-event data (where each row represents a specific point in time, like a month-end). For example, to calculate monthly exposures from policy start/end dates.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| start_col | IntoExprColumn | Column or expression for the start date of the interval. | required |
| end_col | IntoExprColumn | Column or expression for the end date of the interval. | required |
| freq | str | The frequency of the timeline (e.g., "1mo", "1y", "1d"). Passed to | '1d' |
| new_col_name | str | Name for the new column containing the generated timeline dates. Defaults to "timeline_date". | 'timeline_date' |
| closed | str | Which side of the interval is closed ("left", "right", "both", "none"). Passed to | 'left' |
Returns:
| Type | Description |
|---|---|
| ActuarialFrame | A new ActuarialFrame instance with the original data expanded by the generated timeline dates. |
Raises:
| Type | Description |
|---|---|
| ColumnNotFoundError | If start_col or end_col cannot be resolved. |
| ComputeError | If date range generation fails (e.g., invalid freq, incompatible date types). |
Examples:
import datetime
from gaspatchio_core import ActuarialFrame
data = {
"policy_id": [1, 2],
"start_date": [datetime.date(2023, 1, 1), datetime.date(2023, 2, 15)],
"end_date": [datetime.date(2023, 3, 1), datetime.date(2023, 4, 15)],
}
af = ActuarialFrame(data)
timeline_af = af.date.create_timeline(
af.start_date, af.end_date, freq="1mo", new_col_name="month_end"
)
print(timeline_af.collect())
shape: (4, 4)
┌───────────┬────────────┬────────────┬────────────┐
│ policy_id ┆ start_date ┆ end_date ┆ month_end │
│ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ date ┆ date ┆ date │
╞═══════════╪════════════╪════════════╪════════════╡
│ 1 ┆ 2023-01-01 ┆ 2023-03-01 ┆ 2023-01-01 │
│ 1 ┆ 2023-01-01 ┆ 2023-03-01 ┆ 2023-02-01 │
│ 2 ┆ 2023-02-15 ┆ 2023-04-15 ┆ 2023-02-15 │
│ 2 ┆ 2023-02-15 ┆ 2023-04-15 ┆ 2023-03-15 │
└───────────┴────────────┴────────────┴────────────┘
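The generate-then-explode behaviour described for create_timeline can be mimicked in plain Python. This is a sketch under stated assumptions rather than the library's code: `add_months` and `timeline` are hypothetical helpers, only a monthly step is handled, and the interval is treated as closed on the left (start included, end excluded), matching the default closed='left'.

```python
import calendar
import datetime


def add_months(d: datetime.date, months: int) -> datetime.date:
    """Shift a date by calendar months, clamping to the month's last day (assumption)."""
    year, month0 = divmod(d.year * 12 + (d.month - 1) + months, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return d.replace(year=year, month=month0 + 1, day=min(d.day, last_day))


def timeline(start: datetime.date, end: datetime.date, step_months: int = 1):
    """Dates start, start + step, ... that fall strictly before end (left-closed)."""
    if step_months <= 0:
        raise ValueError("step_months must be positive")
    dates, k = [], 0
    while True:
        # Always step from the original start so month-end clamping cannot drift.
        current = add_months(start, k * step_months)
        if current >= end:
            return dates
        dates.append(current)
        k += 1


rows = [
    {"policy_id": 1, "start_date": datetime.date(2023, 1, 1), "end_date": datetime.date(2023, 3, 1)},
    {"policy_id": 2, "start_date": datetime.date(2023, 2, 15), "end_date": datetime.date(2023, 4, 15)},
]

# "Explode": repeat each original row once per date in its timeline.
exploded = [
    {**row, "month_end": d}
    for row in rows
    for d in timeline(row["start_date"], row["end_date"])
]
for row in exploded:
    print(row["policy_id"], row["month_end"])
```

Each policy contributes two rows, so the four rows printed here correspond to the shape (4, 4) result in the example above.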
Column-Level Operations¶
gaspatchio_core.accessors.date.DateColumnAccessor
¶
Bases: BaseColumnAccessor
Provides date-related methods for ColumnProxy or ExpressionProxy objects.
Accessed via .date on a column or expression,
e.g., af["my_date_col"].date.
This accessor offers convenient methods to manipulate and extract information from date/datetime columns within Polars expressions.
months_between(other)
¶
Calculate the number of calendar months between two dates.
Computes (year2 - year1) * 12 + (month2 - month1), where
self is the start date (year1, month1) and other is the end date (year2, month2). The day component is ignored.
Returns a positive integer when other falls in a later calendar month than self.
This is the standard actuarial duration calculation used for policy duration in months, time-to-maturity, and assumption table key derivation.
When to use
- Policy Duration: Calculate months since issue for use as an assumption lookup key (mortality select period, surrender charge schedule, commission clawback period).
- Time to Maturity: Compute remaining term in months for each policy to determine the projection horizon or the in_boundary mask for IFRS 17 contract boundary.
- Cohort Assignment: Derive issue quarter or issue year-month for grouping policies into measurement cohorts.
Parameters¶
other : ColumnProxy | ExpressionProxy | datetime.date
The end date. Can be a column reference (per-policy valuation
dates), an expression, or a fixed datetime.date literal
(single valuation date for the entire portfolio).
Returns¶
ExpressionProxy
Integer number of whole months between the dates.
Examples¶
Duration from issue date to a fixed valuation date
import datetime
from gaspatchio_core import ActuarialFrame
af = ActuarialFrame(
{
"policy_id": ["P001", "P002", "P003"],
"issue_date": [
datetime.date(2020, 3, 15),
datetime.date(2018, 11, 1),
datetime.date(2023, 7, 20),
],
}
)
af.duration_months = af.issue_date.date.months_between(
datetime.date(2025, 1, 1)
)
print(af.collect())
shape: (3, 3)
┌───────────┬────────────┬─────────────────┐
│ policy_id ┆ issue_date ┆ duration_months │
│ --- ┆ --- ┆ --- │
│ str ┆ date ┆ i32 │
╞═══════════╪════════════╪═════════════════╡
│ P001 ┆ 2020-03-15 ┆ 58 │
│ P002 ┆ 2018-11-01 ┆ 74 │
│ P003 ┆ 2023-07-20 ┆ 18 │
└───────────┴────────────┴─────────────────┘
Notes¶
- Counts whole calendar months, ignoring the day component. A policy issued on March 31 and valued on April 1 gives 1 month.
- Negative values indicate other is before self.
- For sub-monthly precision, use date.year_frac() instead.
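The formula stated for months_between can be checked by hand with a few lines of plain Python. The function below is a standalone illustration of that formula, not the library's expression-based implementation.

```python
import datetime


def months_between(start: datetime.date, end: datetime.date) -> int:
    """(year2 - year1) * 12 + (month2 - month1); the day component is ignored."""
    return (end.year - start.year) * 12 + (end.month - start.month)


valuation_date = datetime.date(2025, 1, 1)
issue_dates = [
    datetime.date(2020, 3, 15),
    datetime.date(2018, 11, 1),
    datetime.date(2023, 7, 20),
]
durations = [months_between(d, valuation_date) for d in issue_dates]
print(durations)

# Day component is ignored: one calendar-month boundary is crossed.
print(months_between(datetime.date(2024, 3, 31), datetime.date(2024, 4, 1)))

# Reversing the arguments flips the sign.
print(months_between(valuation_date, issue_dates[0]))
```

The three durations match the duration_months column in the example (58, 74, 18).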
See Also¶
to_period : Truncate dates to period boundaries (month, quarter, year)
to_period(freq='M')
¶
Converts a date/datetime column to a period representation (e.g., year-month).
This is useful for grouping or aggregating data by specific time periods like month, quarter, or year. It truncates the date to the beginning of the specified period.
When to use
- Period Aggregation: Use this to aggregate daily or weekly data into monthly, quarterly, or annual summaries.
- Time Series Features: For creating features for time series models based on periods.
- Date Alignment: When you need to align dates to a common period start (e.g., all dates in January 2023 become 2023-01-01 if freq="M").
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| freq | str | The frequency string for period conversion (e.g., "M", "Q", "Y"). See Polars documentation for | 'M' |
Returns:
| Type | Description |
|---|---|
| ExpressionProxy | An expression containing the dates truncated to the start of the specified period. |
Examples:
import datetime
import polars as pl
from gaspatchio_core import ActuarialFrame
data = {
"event_timestamp": [
datetime.datetime(2023, 1, 15, 10, 30, 0),
datetime.datetime(2023, 1, 20, 14, 0, 0),
datetime.datetime(2023, 2, 5, 8, 0, 0),
]
}
af = ActuarialFrame(data)
af.month = af.event_timestamp.dt.truncate("1mo").cast(pl.Date)
print(af.collect())
shape: (3, 2)
┌─────────────────────┬────────────┐
│ event_timestamp ┆ month │
│ --- ┆ --- │
│ datetime[μs] ┆ date │
╞═════════════════════╪════════════╡
│ 2023-01-15 10:30:00 ┆ 2023-01-01 │
│ 2023-01-20 14:00:00 ┆ 2023-01-01 │
│ 2023-02-05 08:00:00 ┆ 2023-02-01 │
└─────────────────────┴────────────┘
import datetime
import polars as pl
from gaspatchio_core import ActuarialFrame
data = {
"event_timestamp": [
datetime.datetime(2023, 1, 15, 10, 30, 0),
datetime.datetime(2023, 1, 20, 14, 0, 0),
datetime.datetime(2023, 2, 5, 8, 0, 0),
]
}
af = ActuarialFrame(data)
af.year = af.event_timestamp.dt.truncate("1y").cast(pl.Date)
print(af.collect())
shape: (3, 2)
┌─────────────────────┬────────────┐
│ event_timestamp ┆ year │
│ --- ┆ --- │
│ datetime[μs] ┆ date │
╞═════════════════════╪════════════╡
│ 2023-01-15 10:30:00 ┆ 2023-01-01 │
│ 2023-01-20 14:00:00 ┆ 2023-01-01 │
│ 2023-02-05 08:00:00 ┆ 2023-01-01 │
└─────────────────────┴────────────┘
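The truncation to a period start described above can be illustrated in plain Python. This sketch assumes the "M", "Q", and "Y" frequency codes mean month, quarter, and year starts; `to_period_start` is a hypothetical helper, not the accessor itself.

```python
import datetime


def to_period_start(d: datetime.date, freq: str = "M") -> datetime.date:
    """Return the first day of the month, quarter, or year containing d."""
    if freq == "M":
        return datetime.date(d.year, d.month, 1)
    if freq == "Q":
        # Quarters start in months 1, 4, 7 and 10.
        first_month = 3 * ((d.month - 1) // 3) + 1
        return datetime.date(d.year, first_month, 1)
    if freq == "Y":
        return datetime.date(d.year, 1, 1)
    raise ValueError(f"unsupported freq: {freq!r}")


timestamps = [
    datetime.datetime(2023, 1, 15, 10, 30, 0),
    datetime.datetime(2023, 1, 20, 14, 0, 0),
    datetime.datetime(2023, 2, 5, 8, 0, 0),
]
months = [to_period_start(t, "M") for t in timestamps]
years = [to_period_start(t, "Y") for t in timestamps]
print(months)
print(years)
print(to_period_start(datetime.date(2023, 5, 20), "Q"))
```

The month and year results agree with the month and year columns in the two examples above.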
Datetime Namespace¶
For Polars-native datetime operations (year, month, day extraction), use the .dt namespace directly:
gaspatchio_core.column.namespaces.dt_proxy.DtNamespaceProxy
¶
A proxy for Polars datetime (dt) namespace operations.
Enables type-hinting and IDE intellisense for ActuarialFrame datetime
manipulations.
This proxy intercepts calls to datetime methods, retrieves the underlying
Polars expression from its parent proxy (either a ColumnProxy or
ExpressionProxy), applies the datetime operation, and then wraps the
resulting Polars expression back into an ExpressionProxy.
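The retrieve, apply, and re-wrap flow described above is a standard proxy pattern. The toy below sketches it without Polars: every class name is hypothetical, and the "expression" is just a list of dates, so it illustrates the pattern rather than the library's actual classes.

```python
import datetime


class ToyExpr:
    """Hypothetical stand-in for an engine expression; it just holds values."""

    def __init__(self, values):
        self.values = values

    @property
    def dt(self):
        return ToyDt(self.values)


class ToyDt:
    """Hypothetical stand-in for the engine's own datetime namespace."""

    def __init__(self, values):
        self._values = values

    def year(self):
        return ToyExpr([v.year for v in self._values])

    def month(self):
        return ToyExpr([v.month for v in self._values])


class ToyExpressionProxy:
    """Hypothetical wrapper that exposes namespaces on top of an expression."""

    def __init__(self, expr):
        self._expr = expr

    @property
    def dt(self):
        return ToyDtNamespaceProxy(self)


class ToyDtNamespaceProxy:
    """Forwards datetime calls to the parent's expression and re-wraps results."""

    def __init__(self, parent):
        self._parent = parent

    def _apply(self, method_name):
        underlying = self._parent._expr                  # 1. retrieve the underlying expression
        result = getattr(underlying.dt, method_name)()   # 2. apply the datetime operation
        return ToyExpressionProxy(result)                # 3. wrap the result in a proxy again

    # Explicit methods (instead of __getattr__) keep type hints and IDE completion usable.
    def year(self) -> ToyExpressionProxy:
        return self._apply("year")

    def month(self) -> ToyExpressionProxy:
        return self._apply("month")


dates = ToyExpressionProxy(
    ToyExpr([datetime.date(2020, 1, 15), datetime.date(2021, 7, 20)])
)
years = dates.dt.year()
print(type(years).__name__, years._expr.values)
```

Because every call returns a proxy again, the result can be passed on to further proxy-level operations without exposing the underlying expression.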
day()
¶
Extract the day number of the month (1-31) from a date/datetime expression.
This function isolates the day component from a date or datetime, returning it as an integer (e.g., 15 for the 15th of the month). It works for both individual dates and lists of dates.
When to use
Extracting the day of the month can be useful in actuarial contexts for:
- Specific Date Checks: Identifying events occurring on particular days (e.g., end-of-month processing).
- Intra-month Analysis: Analyzing patterns within a month, though less common than month or year analysis.
- Data Validation: Ensuring dates fall within expected day ranges for specific calculations.
Examples¶
Scalar example::
import polars as pl
from gaspatchio_core import ActuarialFrame
af = ActuarialFrame(
{"d": pl.Series(["2023-06-05", "2023-06-15"]).str.to_date()}
)
print(af.select(af.d.dt.day().alias("day")).collect())
shape: (2, 1)
┌─────┐
│ day │
│ --- │
│ i8 │
╞═════╡
│ 5 │
│ 15 │
└─────┘
Vector (list) example - loss-event days::
import datetime
import polars as pl
from gaspatchio_core import ActuarialFrame
data = {
"policy_id": ["E005", "F006"],
"loss_event_dates": [
[datetime.date(2023, 6, 5), datetime.date(2023, 6, 15)],
[datetime.date(2024, 2, 1), datetime.date(2024, 2, 29)],
],
}
af = ActuarialFrame(data).with_columns(
pl.col("loss_event_dates").cast(pl.List(pl.Date))
)
days_expr = af.loss_event_dates.dt.day()
print(af.select(pl.col("policy_id"), days_expr.alias("event_days")).collect())
shape: (2, 2)
┌───────────┬────────────┐
│ policy_id ┆ event_days │
│ --- ┆ --- │
│ str ┆ list[i8] │
╞═══════════╪════════════╡
│ E005 ┆ [5, 15] │
│ F006 ┆ [1, 29] │
└───────────┴────────────┘
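As a small illustration of the end-of-month check mentioned under "Specific Date Checks", the extracted day can be compared with the length of its month. This is a plain-Python sketch using the standard library; `is_month_end` is a hypothetical helper and not part of the accessor.

```python
import calendar
import datetime


def is_month_end(d: datetime.date) -> bool:
    """True when d is the last calendar day of its month."""
    return d.day == calendar.monthrange(d.year, d.month)[1]


loss_event_dates = [
    [datetime.date(2023, 6, 5), datetime.date(2023, 6, 15)],
    [datetime.date(2024, 2, 1), datetime.date(2024, 2, 29)],
]

# Apply the check element-wise, keeping one list of flags per policy.
month_end_flags = [[is_month_end(d) for d in dates] for dates in loss_event_dates]
print(month_end_flags)
```

Only 2024-02-29 is flagged, since 2024 is a leap year and February has 29 days.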
month()
¶
Extract the month number (1-12) from a date or datetime expression.
This function allows you to isolate the month component from a series of dates or datetimes. The result is an integer representing the month, where January is 1 and December is 12.
When to use
In actuarial modeling, extracting the month from dates is crucial for various analyses. For instance, you might use this to:
- Analyze seasonality in claims (e.g., identifying if certain types of claims are more frequent in specific months).
- Group policies by their issue month for cohort analysis or to study underwriting patterns.
- Determine premium due dates or benefit payment schedules that occur on a monthly basis.
- Calculate fractional year components for financial calculations.
Examples¶
Scalar example::
import polars as pl
from gaspatchio_core import ActuarialFrame
af = ActuarialFrame(
{
"d": pl.Series(["2022-01-01", "2022-02-01", "2022-03-01"]).str.to_date(
"%Y-%m-%d"
)
}
)
print(af.select(af.d.dt.month().alias("m")).collect())
shape: (3, 1)
┌─────┐
│ m │
│ --- │
│ i8 │
╞═════╡
│ 1 │
│ 2 │
│ 3 │
└─────┘
Vector (list) example - claim-lodgement months::
import datetime
import polars as pl
from gaspatchio_core import ActuarialFrame
data = {
"policy_id": ["C003", "D004"],
"claim_lodgement_dates": [
[datetime.date(2022, 3, 10), datetime.date(2022, 4, 5)],
[datetime.date(2023, 1, 20), datetime.date(2023, 11, 30)],
],
}
af = ActuarialFrame(data).with_columns(
pl.col("claim_lodgement_dates").cast(pl.List(pl.Date))
)
months_expr = af.claim_lodgement_dates.dt.month()
result = af.select(
pl.col("policy_id"), months_expr.alias("lodgement_months")
)
print(result.collect())
shape: (2, 2)
┌───────────┬──────────────────┐
│ policy_id ┆ lodgement_months │
│ --- ┆ --- │
│ str ┆ list[i8] │
╞═══════════╪══════════════════╡
│ C003 ┆ [3, 4] │
│ D004 ┆ [1, 11] │
└───────────┴──────────────────┘
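Once months are extracted, the seasonality use case listed above reduces to counting events per month. The following plain-Python sketch does this for the sample lodgement dates; it stands in for what would normally be a group-by on the extracted month column.

```python
import datetime
from collections import Counter

claim_lodgement_dates = [
    [datetime.date(2022, 3, 10), datetime.date(2022, 4, 5)],
    [datetime.date(2023, 1, 20), datetime.date(2023, 11, 30)],
]

# Flatten the per-policy lists and count claims by month number (1-12).
claims_per_month = Counter(
    d.month for dates in claim_lodgement_dates for d in dates
)
for month in sorted(claims_per_month):
    print(month, claims_per_month[month])
```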
year()
¶
Extract the year from the underlying datetime expression.
This function isolates the year component from a date or datetime,
returning it as an integer (e.g., 2023). It is applicable to both
single date values and lists of dates within your ActuarialFrame.
When to use
Extracting the year is fundamental in actuarial analysis for:
- Valuation and Reporting: Determining the calendar year for financial reporting or regulatory submissions.
- Experience Studies: Grouping data by calendar year of event (e.g., year of claim, year of lapse) to analyze trends.
- Cohort Analysis: Defining cohorts based on the year of policy issue or birth year.
- Projection Models: Calculating durations or projecting cash flows based on calendar years.
Examples¶
Scalar example (single-date column)::
import polars as pl
from gaspatchio_core import ActuarialFrame
data = {
"dates": pl.Series(["2020-01-15", "2021-07-20"]).str.to_date(
format="%Y-%m-%d"
)
}
af = ActuarialFrame(data)
year_expr = af.dates.dt.year()
print(af.select(year_expr.alias("year")).collect())
shape: (2, 1)
┌──────┐
│ year │
│ --- │
│ i32 │
╞══════╡
│ 2020 │
│ 2021 │
└──────┘
Vector example (list-of-dates per policy)::
import datetime
import polars as pl
from gaspatchio_core import ActuarialFrame
data_vec = {
"policy_id": ["A001", "B002"],
"policy_event_dates": [
[datetime.date(2019, 12, 1), datetime.date(2020, 1, 20)],
[
datetime.date(2021, 5, 10),
datetime.date(2021, 8, 15),
datetime.date(2022, 2, 25),
],
],
}
af_vec = ActuarialFrame(data_vec)
af_vec = af_vec.with_columns(
pl.col("policy_event_dates").cast(pl.List(pl.Date))
)
years_expr = af_vec.policy_event_dates.dt.year()
result = af_vec.select(
pl.col("policy_id"), years_expr.alias("event_years")
)
print(result.collect())
shape: (2, 2)
┌───────────┬────────────────────┐
│ policy_id ┆ event_years │
│ --- ┆ --- │
│ str ┆ list[i32] │
╞═══════════╪════════════════════╡
│ A001 ┆ [2019, 2020] │
│ B002 ┆ [2021, 2021, 2022] │
└───────────┴────────────────────┘
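For the experience-study use case, the per-policy year lists can be turned into event counts by calendar year. This is a plain-Python sketch over the sample data above; the variable names are illustrative only.

```python
import datetime
from collections import Counter

policy_event_dates = {
    "A001": [datetime.date(2019, 12, 1), datetime.date(2020, 1, 20)],
    "B002": [
        datetime.date(2021, 5, 10),
        datetime.date(2021, 8, 15),
        datetime.date(2022, 2, 25),
    ],
}

# Count events per calendar year, separately for each policy.
events_by_year = {
    policy_id: Counter(d.year for d in dates)
    for policy_id, dates in policy_event_dates.items()
}
for policy_id, counts in events_by_year.items():
    print(policy_id, dict(counts))
```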