Understanding Family statistics

Family statistics compare signal behavior across a group of Runs, such as average pressure during firing or peak temperature spread across tests. At a high level, the pipeline involves finding the same event in each Run, cutting out the data around it, normalizing the samples to a common rate, and then computing a summary statistic across all those slices. To produce meaningful statistics, especially for use in Rules, you can view the process as a two-phase pipeline: alignments and dynamic windows select the slice of each Run to compare, and bucketing and aggregation collapse those slices into a single statistic series. This topic describes the core concepts and processes involved in producing meaningful Family statistics.

Glossary

Alignment: A reference event (such as an Annotation, Run start/end, or fixed timestamp) used to normalize Runs to a common time axis.
Alignment point: A single timestamped occurrence of an alignment within a Run. A Run with three of the same event would have three alignment points.
Dynamic window: A time slice with bounds defined as alignment + offset, so it resolves to a different actual window per Run.
Dynamic window occurrence: A concrete time slice produced by resolving a dynamic window against one Run’s alignment points. A Run can produce multiple occurrences.
Family statistic: A time series summarizing a group of Runs, such as their mean signal or 3-sigma envelope.
Bucket: A fixed-width interval on the output rate’s grid that holds one normalized value per occurrence.
Aggregation: The per-bucket reduction across occurrences (avg, min, max, stdev) that produces the Family statistic.
input_count: A diagnostic series reporting how many occurrences contributed to each bucket; low values flag unreliable statistics.

How these concepts work together

Producing a Family statistic follows a fixed sequence:

Define an alignment (your reference event). Choose the event that marks the moment of interest across all Runs, such as valve open or engine startup, to anchor each Run on a common time axis.
Resolve alignment points (find every instance of that event). For each Run, every occurrence of the alignment event gets a timestamp. A Run with three valve openings produces three alignment points.
Define a dynamic window (the time slice you care about). Specify window start and end as an alignment event plus an optional offset. This scopes the comparison to the portion of each Run relevant to the event of interest.
Resolve dynamic window occurrences (one slice per event instance). The dynamic window is applied to each Run’s alignment points. Each valid combination of start and end alignment points produces one occurrence; a Run with multiple cycles contributes one occurrence per cycle.
Bucket each occurrence (normalize sample rates). A uniform bucket grid is applied to normalize occurrences sampled at different rates to a single value per bucket.
Aggregate across occurrences (compute the summary statistic). The selected aggregation (avg, min, max, stdev) is computed per bucket across all contributing occurrences, producing the final Family statistic series.
Check input_count (verify enough data contributed). Inspect the diagnostic series to identify buckets with sparse coverage, where spread statistics become unreliable.

Aligning Runs

To aggregate across Runs that occur at different absolute (UTC) times, each Run must first be normalized to a common time axis. An Alignment defines how. Aligning Runs alone is sufficient for plotting Family members and visually comparing their behavior, but statistical aggregation in Rules also requires scoping the comparison to the relevant duration. That scope is defined by a Dynamic Window. A Dynamic Window is defined by:

T-0 alignment: the anchor event the slice is centered on in relative time, such as a valve opening.
Window start and window end: the bounds of the slice, each defined as an alignment event with an optional offset.

Because dynamic windows are defined relative to events rather than absolute time, the resolved window differs for each Run.
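The per-Run resolution described above can be sketched in a few lines of Python. This is a minimal illustration, not Sift's implementation: the `Bound` class, the `resolve_window` helper, the event names, and the timestamps are all hypothetical and chosen only to show how the same event-relative definition yields a different absolute window for each Run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bound:
    """One window bound: a named alignment event plus an optional offset (hypothetical model)."""
    event: str
    offset_s: float = 0.0


def resolve_window(start, end, event_times_s):
    """Turn event-relative bounds into one Run's absolute (start, end) in seconds."""
    return (event_times_s[start.event] + start.offset_s,
            event_times_s[end.event] + end.offset_s)


# Illustrative timestamps only: each Run records the same events at different absolute times.
runs = {
    "run_a": {"valve_open": 1000.0, "valve_close": 1025.0},
    "run_b": {"valve_open": 52340.0, "valve_close": 52351.5},
}

window_start = Bound("valve_open", offset_s=-2.0)  # 2 s before the valve opens
window_end = Bound("valve_close", offset_s=5.0)    # 5 s after the valve closes

resolved = {name: resolve_window(window_start, window_end, events)
            for name, events in runs.items()}

for name, (t0, t1) in resolved.items():
    print(f"{name}: {t0} to {t1} ({t1 - t0} s long)")
```

Both Runs share one window definition, yet the resolved slices differ in absolute position and in length because the underlying events occur at different times.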

Multiple occurrences per Run

The ability to set multiple alignment points within a single Run is in private preview and not yet available to all customers. Reach out to Sift support to get access.

A single Run can contain more than one event of interest. A valve, for example, may open and close several times in the same Run. Each event may be represented by a separate Alignment Point. For example, a “Valve Open” alignment defined by an Annotation that exists on each opening resolves to an alignment point for each opening. When a Dynamic Window’s bounds resolve against multiple alignment points, the window produces multiple dynamic window occurrences in the same Run, one per valid combination of bounds. Each occurrence contributes to the aggregate independently and is plotted as if it were a separate Run.

The remaining sections describe computation per occurrence. When a Run produces exactly one occurrence (the common case), “per occurrence” and “per Run” are equivalent. When a Run produces multiple occurrences, each is treated as an independent contributor.
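As a rough illustration of how one Run can yield several occurrences, the sketch below pairs start and end alignment points. The pairing rule is an assumption made for this example: each start point is matched with the first end point that follows it, and a start with no later end produces no occurrence. The text above does not specify what makes a combination "valid", so treat `resolve_occurrences` as a hypothetical helper rather than the product's actual rule.

```python
from bisect import bisect_right


def resolve_occurrences(start_points_s, end_points_s,
                        start_offset_s=0.0, end_offset_s=0.0):
    """Pair each start alignment point with the next end alignment point.

    ASSUMPTION: "valid combination" is modeled as (start, first end after start).
    """
    ends = sorted(end_points_s)
    occurrences = []
    for start in sorted(start_points_s):
        i = bisect_right(ends, start)  # index of the first end strictly after start
        if i == len(ends):
            continue  # no closing event follows this start, so no occurrence
        occurrences.append((start + start_offset_s, ends[i] + end_offset_s))
    return occurrences


# Illustrative Run: the valve opens three times but only closes twice.
valve_open_s = [10.0, 40.0, 70.0]
valve_close_s = [25.0, 55.0]

occurrences = resolve_occurrences(valve_open_s, valve_close_s,
                                  start_offset_s=-1.0, end_offset_s=2.0)
print(occurrences)  # [(9.0, 27.0), (39.0, 57.0)]
```

Each tuple in the result would then be bucketed and aggregated as an independent contributor, exactly as if it came from a separate Run.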

Bucketing and aggregation

Family statistics are computed in two stages: each occurrence is bucketed to a common sample rate, then aggregated across occurrences, per bucket.

Stage 1: Bucketing per occurrence

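The output rate determines a uniform bucket grid. Each occurrence is independently normalized to exactly one value per bucket, using one of three strategies based on its input rate relative to the output rate. Depending on the strategy, the bucket value is a real sample, a within-bucket mean, or a forward-filled value.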

Forward fill is experimental and not enabled by default. Reach out to Sift support to have it enabled for your organization.

Bucketing exists to remove sample-rate bias from the aggregate. Without it, a Run sampled at 100 Hz contributes 100 samples per second of window while a Run sampled at 1 Hz contributes only 1. The Family statistic ends up weighted by sample rate rather than by Run, with the highest-rate contributors dominating. Bucketing normalizes every occurrence to a single value per bucket regardless of underlying sample rate, so each occurrence carries the same weight in the aggregate.
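The sample-rate bias can be made concrete with a small Python sketch. It assumes two occurrences over the same 2-second window, one sampled at 100 Hz and one at 1 Hz, with constant illustrative values, and it uses a within-bucket mean as the bucketing strategy. The `bucket_means` helper and the data are hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def bucket_means(samples, bucket_width_s):
    """Reduce (relative_time_s, value) samples to one mean value per bucket index."""
    grouped = defaultdict(list)
    for t, value in samples:
        grouped[int(t // bucket_width_s)].append(value)
    return {b: fmean(values) for b, values in sorted(grouped.items())}


# Illustrative data: a 100 Hz occurrence reading 10.0 and a 1 Hz occurrence reading 20.0.
fast = [(i / 100, 10.0) for i in range(200)]
slow = [(0.0, 20.0), (1.0, 20.0)]

# Without bucketing, the 200 fast samples swamp the 2 slow ones.
pooled_avg = fmean([value for _, value in fast + slow])

# With 1-second buckets, each occurrence contributes one value per bucket.
bucketed = [bucket_means(fast, 1.0), bucket_means(slow, 1.0)]
per_bucket_avg = {b: fmean([occ[b] for occ in bucketed]) for b in bucketed[0]}

print(round(pooled_avg, 3))  # 10.099
print(per_bucket_avg)        # {0: 15.0, 1: 15.0}
```

The pooled average sits almost on top of the high-rate occurrence, while the bucketed average weights both occurrences equally.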

Stage 2: Aggregation across occurrences

With every occurrence contributing exactly one value per bucket, the selected aggregate (avg, min, max, stdev) is computed across occurrences, per bucket. The result is a single series at the output rate.
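A per-bucket reduction of this kind might look like the following Python sketch. Each occurrence is represented as a hypothetical mapping from bucket index to its single normalized value. The choice of population standard deviation (`statistics.pstdev`) is an assumption for the example, since the text does not say which stdev variant is used.

```python
from statistics import fmean, pstdev

# ASSUMPTION: "stdev" is modeled as population standard deviation.
AGGREGATIONS = {"avg": fmean, "min": min, "max": max, "stdev": pstdev}


def aggregate(occurrences, how):
    """Reduce across occurrences, per bucket, skipping occurrences that lack a bucket."""
    reduce_fn = AGGREGATIONS[how]
    buckets = sorted(set().union(*occurrences))
    return {b: reduce_fn([occ[b] for occ in occurrences if b in occ])
            for b in buckets}


# Illustrative bucketed occurrences; the third ends one bucket early.
occurrences = [
    {0: 1.0, 1: 2.0, 2: 3.0},
    {0: 3.0, 1: 4.0, 2: 5.0},
    {0: 5.0, 1: 6.0},
]

family_avg = aggregate(occurrences, "avg")
family_min = aggregate(occurrences, "min")
family_max = aggregate(occurrences, "max")
family_stdev = aggregate(occurrences, "stdev")

print(family_avg)  # {0: 3.0, 1: 4.0, 2: 4.0}
```

Every output series has one value per bucket, so it sits on the same grid as the bucketed occurrences it summarizes.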

The bucket-level aggregate treats real samples, forward-filled values, and within-bucket means as equivalent contributions. The input_count series reports how many occurrences are contributing to each bucket. When input_count drops, statistics like stdev, min, and max become unreliable because too few occurrences are contributing. This is most common at the trailing or leading edges of dynamic windows, where occurrences end or start at different relative times.
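To show how such a coverage series behaves at the window edges, the sketch below counts contributors per bucket and flags buckets below a threshold. The `input_count` and `sparse_buckets` functions, the threshold of 3, and the data are hypothetical; the product exposes input_count as a series, and how you act on it is your own choice.

```python
def input_count(occurrences):
    """Count how many occurrences supply a value for each bucket index."""
    counts = {}
    for occ in occurrences:
        for bucket in occ:
            counts[bucket] = counts.get(bucket, 0) + 1
    return dict(sorted(counts.items()))


def sparse_buckets(counts, min_count):
    """Return the buckets whose coverage falls below a chosen threshold."""
    return [bucket for bucket, n in counts.items() if n < min_count]


# Illustrative occurrences that start and end at different relative times.
occurrences = [
    {0: 1.0, 1: 1.1, 2: 1.2, 3: 1.1, 4: 1.0},
    {0: 0.9, 1: 1.0, 2: 1.3, 3: 1.2},
    {1: 1.2, 2: 1.1},
]

counts = input_count(occurrences)
print(counts)                     # {0: 2, 1: 3, 2: 3, 3: 2, 4: 1}
print(sparse_buckets(counts, 3))  # [0, 3, 4]
```

The flagged buckets fall at the leading and trailing edges, which is where spread statistics rest on the fewest contributors.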

Working with Family statistics in Rules

Choosing an output rate lower than the raw data rate smooths sensor noise before aggregation. The same approach can be applied to inputs within a Rule using the avg($1, bucket(1s)) CEL function, which averages the input signal into 1-second buckets before evaluation. Comparing bucketed inputs against bucketed Family statistics prevents sensor noise from triggering false Rule violations.
The Family statistics preview is the most direct way to see how alignments, dynamic windows, and bucketing interact for a given configuration.
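The effect of bucketing a Rule input before comparison can be sketched outside of CEL. The Python below is only an analogy for averaging a signal into 1-second buckets: the `bucket_avg` helper, the 10 Hz samples, and the upper limit of 12.0 are all hypothetical and do not reflect how Rules are evaluated internally.

```python
from collections import defaultdict
from statistics import fmean


def bucket_avg(samples, bucket_width_s):
    """Average (time_s, value) samples into fixed-width buckets."""
    grouped = defaultdict(list)
    for t, value in samples:
        grouped[int(t // bucket_width_s)].append(value)
    return {b: fmean(values) for b, values in sorted(grouped.items())}


# Illustrative 10 Hz signal hovering around 10.0 with one noise spike at t = 0.4 s.
values = [9.0, 11.0, 9.0, 11.0, 14.0, 11.0, 9.0, 11.0, 9.0, 11.0] + [9.0, 11.0] * 5
samples = [(i / 10, v) for i, v in enumerate(values)]

upper_limit = 12.0  # hypothetical bound, for example derived from a Family statistic

raw_violations = [t for t, v in samples if v > upper_limit]
bucketed = bucket_avg(samples, 1.0)
bucketed_violations = [b for b, v in bucketed.items() if v > upper_limit]

print(raw_violations)       # [0.4]
print(bucketed)             # {0: 10.5, 1: 10.0}
print(bucketed_violations)  # []
```

The single spike trips the raw comparison but is absorbed by the 1-second average, which is the behavior the bucketed comparison relies on.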
