Comparison criteria

1 Overview

Every validation and benchmark test compares an Ashes output against a reference time series: an analytical solution, a result produced by another tool, or a previous Ashes run. For each compared variable, Ashes reduces the two time series to a single scalar, the norm, and reports the variable as PASS when that scalar is at or below the threshold configured for the test:

$$\text{norm}\le\text{threshold}\;\Rightarrow\;\text{PASS}$$

The way the two time series are reduced to that single scalar is governed by the comparison criterion assigned to the variable. The criterion is an integer chosen per variable in the test's batch input. Different criteria are appropriate for different kinds of response — a static end state, a free-vibration decay, a forced steady-state oscillation, and so on. This page describes each available criterion.

Note: when no explicit threshold is set, the test is comparing Ashes against its own previous run and a very strict default threshold is used, so even small numerical changes are flagged.

2 The relative norm

Most criteria are built on a relative norm between a reference value $a$ and the compared value $b$, defined as the absolute difference normalised by the magnitude of one of the values:

$$\text{norm}=\frac{\left|b-a\right|}{\left|b\right|}$$

Two safeguards protect this norm near zero, where dividing by a vanishing value would otherwise produce meaningless results:

A noise limit of $10^{-3}$: values whose magnitude falls below it are treated as zero, and two values both below it are considered equal.
When a value is small relative to the maximum reached by the series, the difference is normalised by that series maximum $c$ instead of by the value itself, i.e. $\left|b-a\right|/\left|c\right|$, which keeps the norm bounded on near-zero samples.

For each variable the criterion evaluates this norm (or a feature-based comparison, for the decay and cyclic criteria below) at the relevant points of the series and keeps the largest value found as the reported maximum relative norm.
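
The relative norm and its two safeguards can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Ashes implementation: the function names are hypothetical, the 1% cut-off for "small relative to the maximum" is assumed (the text does not give it for this safeguard), and the series maximum is taken from the compared series.

```python
NOISE_LIMIT = 1e-3     # noise limit quoted in the text
SMALL_FRACTION = 0.01  # assumption: "small" means below 1% of the series maximum


def relative_norm(a, b, series_max):
    """Relative norm between a reference value a and a compared value b.

    series_max is the maximum magnitude c reached by the series.
    """
    # Safeguard 1: magnitudes below the noise limit are treated as zero,
    # and two values both below it are considered equal.
    a = 0.0 if abs(a) < NOISE_LIMIT else a
    b = 0.0 if abs(b) < NOISE_LIMIT else b
    if a == 0.0 and b == 0.0:
        return 0.0

    # Safeguard 2: near-zero samples are normalised by the series maximum.
    denom = abs(b)
    if denom < SMALL_FRACTION * abs(series_max):
        denom = abs(series_max)
    if denom == 0.0:
        # Assumption: nothing to normalise by, so report an unbounded norm.
        return float("inf")
    return abs(b - a) / denom


def max_relative_norm(reference, compared):
    """Largest relative norm over two equally sampled series."""
    # Assumption: the series maximum c is taken from the compared series.
    series_max = max(abs(v) for v in compared)
    return max(relative_norm(a, b, series_max)
               for a, b in zip(reference, compared))


def is_pass(norm, threshold):
    """PASS rule from the overview: norm at or below the threshold."""
    return norm <= threshold
```

With these definitions, a pair of samples such as `5e-4` and `-2e-4` contributes a norm of zero, because both lie below the noise limit.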

3 Available criteria

3.1 Pointwise and whole-signal criteria

These criteria compare the two series value by value.

Fixed norm. The relative norm of the variable is set to a constant value of 1, independent of the data, bypassing any data-driven comparison. It is used for channels that are not meant to be evaluated automatically against a threshold (for example, some OC3 channels), where agreement is assessed separately. The resulting PASS/FAIL therefore depends entirely on the threshold configured for that channel.
Whole time series, pointwise. Walks the entire time series, evaluates the pointwise relative norm between the Ashes and reference values at each time step (interpolating the reference onto the Ashes time stamps when the two use different steps), and keeps the largest value found, with the noise-limit handling described above. This is the most general and strictest criterion.
Last fifth, pointwise. The same pointwise comparison as criterion 1, but restricted to the last 20% of the simulation. Used when the start of the run contains a transient or ramp-up that should not be compared, and only the settled portion is of interest.
Last fifth, normalised by the maximum. Over the last fifth of the run, takes the difference normalised by the series maximum, skipping points where either series is below 1% of that maximum or below the noise limit, and stopping at the end of the reference series if the Ashes run is longer. Compared with criterion 2 it normalises by the global maximum rather than the local value, which avoids large norms on small-magnitude samples.
Last value only. Compares just the final value of each series. Typical for static analyses, where only the converged end state matters.
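
As a rough illustration of how the whole-series, last-fifth and last-value comparisons differ, the sketch below interpolates the reference onto the Ashes time stamps and keeps the largest pointwise norm over a chosen window. All names are hypothetical, linear interpolation is an assumption, and `simple_norm` is a simplified stand-in for the relative norm (noise limit only, no series-maximum fallback).

```python
from bisect import bisect_left

NOISE_LIMIT = 1e-3  # noise limit quoted in the text


def simple_norm(a, b):
    """Simplified stand-in for the relative norm (a = reference, b = Ashes)."""
    if abs(a) < NOISE_LIMIT and abs(b) < NOISE_LIMIT:
        return 0.0
    # Assumption: clamp the denominator so it can never reach zero.
    return abs(b - a) / max(abs(b), NOISE_LIMIT)


def interpolate(t_ref, y_ref, t):
    """Linearly interpolate the reference at time t (clamped at both ends)."""
    if t <= t_ref[0]:
        return y_ref[0]
    if t >= t_ref[-1]:
        return y_ref[-1]
    i = bisect_left(t_ref, t)  # t_ref[i - 1] < t <= t_ref[i]
    w = (t - t_ref[i - 1]) / (t_ref[i] - t_ref[i - 1])
    return y_ref[i - 1] + w * (y_ref[i] - y_ref[i - 1])


def max_pointwise_norm(t_ashes, y_ashes, t_ref, y_ref, start_fraction=0.0):
    """Largest pointwise norm over the part of the run after start_fraction."""
    t_start = t_ashes[0] + start_fraction * (t_ashes[-1] - t_ashes[0])
    worst = 0.0
    for t, b in zip(t_ashes, y_ashes):
        if t < t_start:
            continue  # sample lies in the excluded start of the run
        worst = max(worst, simple_norm(interpolate(t_ref, y_ref, t), b))
    return worst


# Reference sampled every 2 s, Ashes every 1 s, with one early deviation.
t_ref = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
y_ref = [2.0 + 0.5 * t for t in t_ref]
t_ashes = [float(i) for i in range(11)]
y_ashes = [2.0 + 0.5 * t for t in t_ashes]
y_ashes[1] = 4.0  # transient at t = 1 s (reference value there is 2.5)

whole = max_pointwise_norm(t_ashes, y_ashes, t_ref, y_ref)
last_fifth = max_pointwise_norm(t_ashes, y_ashes, t_ref, y_ref, 0.8)
last_value = simple_norm(y_ref[-1], y_ashes[-1])
```

In this example the early deviation dominates the whole-series result, while the last-fifth and last-value comparisons do not see it at all.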

Note: when no criterion at all is assigned to a variable, Ashes falls back to a plain pointwise comparison of the whole series (without the noise-limit handling of criterion 1). This is the mode used when comparing an Ashes run against its own previous result.

3.2 Decay (free-vibration) criteria

These criteria are intended for free-vibration / decay tests. They locate the maxima (the peaks lying above the mean) of each series and compare features of those peaks rather than the instantaneous values, which makes them tolerant of a phase offset between the two responses.

Decay, peaks and period (whole series). Compares both the heights of successive maxima and the average period between maxima, over the whole run.
Decay, period only. Compares only the average period between maxima, over the whole run. Peak heights are ignored, so it is used when only the oscillation frequency matters and the amplitude is allowed to differ.
Decay, peaks and period (last fifth). The same comparison as criterion 3 (peak heights and average period) but restricted to the last fifth of the run, so an initial transient is excluded.
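
A minimal sketch of the peak-based comparison follows. It is not the Ashes implementation: the function names are hypothetical, maxima are found with a simple three-point test, peaks are paired in order of occurrence, and the features are compared with the plain relative norm without the near-zero safeguards.

```python
def find_maxima(t, y):
    """Return (times, heights) of local maxima lying above the series mean."""
    mean = sum(y) / len(y)
    times, heights = [], []
    for i in range(1, len(y) - 1):
        if y[i] > mean and y[i] > y[i - 1] and y[i] >= y[i + 1]:
            times.append(t[i])
            heights.append(y[i])
    return times, heights


def average_period(peak_times):
    """Average spacing between successive maxima."""
    if len(peak_times) < 2:
        raise ValueError("need at least two maxima to measure a period")
    return (peak_times[-1] - peak_times[0]) / (len(peak_times) - 1)


def decay_norm(t_ashes, y_ashes, t_ref, y_ref, compare_heights=True):
    """Worst relative norm over the average period and, optionally, peak heights."""
    times_a, heights_a = find_maxima(t_ashes, y_ashes)
    times_r, heights_r = find_maxima(t_ref, y_ref)

    period_a = average_period(times_a)
    period_r = average_period(times_r)
    worst = abs(period_a - period_r) / abs(period_a)

    if compare_heights:
        # Assumption: peaks are paired in order; extra peaks in the longer
        # list are ignored by zip().
        for a, b in zip(heights_r, heights_a):
            worst = max(worst, abs(b - a) / abs(b))
    return worst
```

Calling `decay_norm(..., compare_heights=False)` corresponds to the period-only variant. Restricting the inputs to the last fifth of the run before calling it would give the last-fifth variant.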

3.3 Steady-state cyclic criterion

Steady-state cyclic comparison. Intended for forced, steady-state cyclic responses, where the two series settle into the same oscillation but may differ by a small phase lead or lag and by a different transient first cycle. A pointwise criterion (1, 2 or 5) fails on these signals even when they agree well, because on the steep flanks of the cycle a small time offset produces a large pointwise difference, and a differing first cycle poisons any whole-series comparison.

Criterion 8 sidesteps both problems. It takes the last half of the run, so the transient is removed, and reduces each series to three phase-independent features that are then compared with the relative norm:

the maximum of the steady-state window (peak amplitude);
the minimum of the steady-state window (trough amplitude);
the dominant period, measured from the average spacing between successive upward crossings of the mean.

The period is taken from mean crossings rather than from peak detection on purpose: the crossings fall on the steep flanks of the cycle, where there is no high-frequency wiggle, so the period is robust to small oscillations superimposed on the peaks and troughs. The reported norm is the worst (largest) of the three feature comparisons.
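
The feature extraction described above can be sketched as follows. This is an illustration rather than the Ashes code: the function names are hypothetical, the series are assumed to be uniformly sampled so that the last half of the samples is the last half of the run, the mean is taken over the steady-state window, and crossing times are refined by linear interpolation.

```python
import math


def upward_crossing_times(t, y, level):
    """Times at which y crosses `level` going upward (linear interpolation)."""
    times = []
    for i in range(1, len(y)):
        if y[i - 1] < level <= y[i]:
            w = (level - y[i - 1]) / (y[i] - y[i - 1])
            times.append(t[i - 1] + w * (t[i] - t[i - 1]))
    return times


def cyclic_features(t, y):
    """Return (maximum, minimum, dominant period) of the last half of the run."""
    half = len(y) // 2
    t_w, y_w = t[half:], y[half:]
    mean = sum(y_w) / len(y_w)  # assumption: mean of the window, not the run
    crossings = upward_crossing_times(t_w, y_w, mean)
    if len(crossings) < 2:
        raise ValueError("need at least two upward crossings")
    period = (crossings[-1] - crossings[0]) / (len(crossings) - 1)
    return max(y_w), min(y_w), period


def cyclic_norm(t_ashes, y_ashes, t_ref, y_ref):
    """Worst relative norm over the three phase-independent features."""
    feats_a = cyclic_features(t_ashes, y_ashes)
    feats_r = cyclic_features(t_ref, y_ref)
    return max(abs(b - a) / abs(b) for a, b in zip(feats_r, feats_a))


# Same steady-state oscillation, but the "Ashes" series lags by 0.5 s and
# carries a decaying transient at the start of the run.
t = [i * 0.01 for i in range(4001)]
ref = [5.0 + 2.0 * math.sin(2.0 * math.pi * x / 4.0) for x in t]
ashes = [5.0 + 2.0 * math.sin(2.0 * math.pi * (x - 0.5) / 4.0)
         + 3.0 * math.exp(-x) for x in t]

feature_norm = cyclic_norm(t, ashes, t, ref)
pointwise_norm = max(abs(b - a) / abs(b) for a, b in zip(ref, ashes))
```

For these two signals the feature-based norm stays very small, whereas the plain pointwise norm over the whole series is large because of the phase lag and the initial transient.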

Note: because criterion 8 contains no phase term, a lead or lag between the two series does not cause a failure — a signal with matching amplitude and period passes regardless of phase. By the same token, it does not detect a difference confined to the discarded first half of the run (for example, a different transient).