Anomaly Detection

Anomaly Detection provides a statistical method to determine how a given metric has changed in relation to previous data.

Anomaly Detection is a feature within Reports & Analytics that allows you to separate "true signals" from "noise" and then identify potential factors that contributed to those signals or anomalies. In other words, it lets you identify which statistical fluctuations matter and which don't. You can then identify the root cause of a true anomaly. Furthermore, you can get reliable metric (KPI) forecasts.

Note: Check out the new Anomaly Detection and Contribution Analysis features in Analysis Workspace!

Examples of anomalies you might investigate include:

The following table describes the prediction interval data returned by anomaly detection in the Reporting API:

upper_bound

Upper level of the prediction interval. Values above this level are considered anomalous. Represents a 95% confidence that values will be below this level.

lower_bound

Lower level of the prediction interval. Values below this level are considered anomalous. Represents a 95% confidence that values will be above this level.

forecast

The predicted value based on the data analysis. This value is also the midpoint between the upper and lower bounds.
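As an illustration of how these three fields can be used, the sketch below flags the days whose observed value falls outside the 95% prediction interval. It assumes a simplified list-of-dictionaries structure using the field names above; it is not the exact shape of a Reporting API response.

```python
# Illustrative sketch: flag anomalous daily values using the prediction
# interval fields described above (upper_bound, lower_bound, forecast).
# The input structure is an assumption, not the actual API response shape.

def flag_anomalies(daily_points):
    """Return the points whose observed value falls outside the
    95% prediction interval returned by anomaly detection."""
    anomalies = []
    for point in daily_points:
        value = point["value"]  # observed daily total for the metric
        if value > point["upper_bound"] or value < point["lower_bound"]:
            anomalies.append({
                "date": point["date"],
                "value": value,
                "forecast": point["forecast"],
                # signed distance from the nearest interval bound
                "deviation": (value - point["upper_bound"]
                              if value > point["upper_bound"]
                              else value - point["lower_bound"]),
            })
    return anomalies

# Example usage with made-up numbers:
points = [
    {"date": "2024-01-07", "value": 1200, "forecast": 1000,
     "lower_bound": 800, "upper_bound": 1150},
    {"date": "2024-01-08", "value": 990, "forecast": 1005,
     "lower_bound": 810, "upper_bound": 1160},
]
print(flag_anomalies(points))  # only 2024-01-07 exceeds its upper bound
```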

Anomaly Detection uses a training period to calculate, learn, and report prediction interval data per day. The training period is the historical period used to establish what is normal versus anomalous; what is learned there is then applied to the reporting period. Depending on your configuration, the training period consists of the 30, 60, or 90 days immediately before the start of the view (reporting) period.

To calculate the data, the daily total for each metric is compared with the training period using each of the following algorithms:

· Holt-Winters Multiplicative (Triple Exponential Smoothing)

· Holt-Winters Additive (Triple Exponential Smoothing)

· Holt's Trend Corrected (Double Exponential Smoothing)

Each algorithm is applied, and the one with the smallest Sum of Squared Errors (SSE) is selected. The Mean Absolute Percent Error (MAPE) and the current Standard Error are then calculated to confirm that the model is statistically valid.

These algorithms can be extended to provide predictive forecasts of metrics in future periods.
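As a rough illustration of this selection process, the sketch below fits the three candidate models to a training series using the statsmodels library, picks the one with the smallest SSE, computes MAPE as a validity check, and forecasts a future period. The weekly seasonal period, the synthetic data, and the use of statsmodels are assumptions for illustration; this is not Adobe's internal implementation.

```python
# Minimal sketch of SSE-based model selection across the three candidate
# exponential smoothing models, followed by MAPE as a sanity check and a
# forecast of the reporting period. Assumptions: statsmodels stand-ins for
# the algorithms, a 7-day seasonal cycle, and synthetic daily totals.
import numpy as np
from statsmodels.tsa.holtwinters import ExponentialSmoothing

def fit_candidates(training_series, seasonal_periods=7):
    """Fit the three candidate models and return a dict of fitted results."""
    candidates = {
        "Holt-Winters Multiplicative": dict(trend="add", seasonal="mul",
                                            seasonal_periods=seasonal_periods),
        "Holt-Winters Additive": dict(trend="add", seasonal="add",
                                      seasonal_periods=seasonal_periods),
        "Holt's Trend Corrected": dict(trend="add", seasonal=None),
    }
    return {name: ExponentialSmoothing(training_series, **kwargs).fit()
            for name, kwargs in candidates.items()}

def select_model(training_series):
    """Pick the candidate with the smallest SSE, then report MAPE."""
    fitted = fit_candidates(training_series)
    best_name, best_fit = min(fitted.items(), key=lambda item: item[1].sse)
    mape = np.mean(np.abs((training_series - best_fit.fittedvalues)
                          / training_series)) * 100
    return best_name, best_fit, mape

# Example: 60 days of synthetic daily totals with a weekly cycle.
rng = np.random.default_rng(0)
days = np.arange(60)
series = 1000 + 50 * np.sin(2 * np.pi * days / 7) + rng.normal(0, 20, 60)
name, fit, mape = select_model(series)
print(name, round(fit.sse, 1), round(mape, 2))
# The chosen model can then forecast the next 14 days (the reporting period):
print(fit.forecast(14))
```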

Because the training period varies based on the start of the view (reporting) period, you might see differences in the data reported for the same date as part of two different time periods.

For example, if you run a report for January 1-14, and then run a report for January 7-21, you might see different prediction data for the same metric for the overlapping dates (January 7-14). This is a result of the difference in training periods.

View Period: January 1-14
  30-day Training Period: November 27 - December 31
  60-day Training Period: October 28 - December 31
  90-day Training Period: September 28 - December 31

View Period: January 7-21
  30-day Training Period: December 4 - January 6
  60-day Training Period: November 4 - January 6
  90-day Training Period: October 5 - January 6
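The relationship can be approximated with a short date calculation: take the N days immediately before the view period start. The exact window boundaries Adobe uses (shown in the table above) differ slightly, so the sketch below is only an approximation of how training periods track the view period.

```python
# Simplified sketch: an approximate N-day training window ending the day
# before the view period starts. The documented windows above are chosen
# slightly differently, so treat this only as an illustration.
from datetime import date, timedelta

def training_window(view_start: date, days: int):
    """Return (start, end) of an approximate training period of `days`
    days ending the day before the view period starts."""
    end = view_start - timedelta(days=1)
    start = end - timedelta(days=days - 1)
    return start, end

for days in (30, 60, 90):
    start, end = training_window(date(2024, 1, 1), days)
    print(f"{days}-day training period: {start} to {end}")
# Because the window is anchored to the view period start, a report for
# January 1-14 and a report for January 7-21 use different training data
# for the overlapping dates, which explains the differing prediction values.
```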

For instructions on setting up Anomaly Detection and running an Anomaly Detection report, see the related documentation.

Anomaly detection is also available in the Reporting API.