From Pixels to Analytics: How SignalFx Handles Data Resolution

Often we want to run analytics on data streams with different temporal resolutions: the data is collected at different intervals and on different schedules. Say we want the ratio of two numbers, but one is sampled every second and the other every two minutes. That takes some smart interpolation. Here is one system that does this automatically.

When plotting a metric or evaluating an analytic expression, SignalFx automatically chooses the right resolution to use for the displayed data. This resolution—i.e. the time interval between successive data points—determines the precision of the visualization. This is different from the resolution (or frequency) of the data as submitted, which can represent intervals as small as one second.

SignalFx uses many factors to determine the display resolution, in order to present the highest level of detail in any given view while providing the best possible performance. These factors include the input resolution of all the metrics being plotted or analyzed, the number of pixels available, chart width, and more. 

The largest factor is the input resolution, the frequency with which data arrives at SignalFx. If a particular metric is submitted at a frequency of once per second, SignalFx will try to display or run analytics on it every second. All SignalFx detectors, for example, operate at the most detailed resolution possible given the constituent metrics being alerted on. 

The magic of SignalFx is in using analytics to get more out of your data than you put in. So, unsurprisingly, most of the charts, analytics, and alert detectors our users create involve more than one metric (sometimes thousands at once). And these separate metrics can sometimes be measured at different frequencies. 

For example, say we want to compute the ratio between two metrics that have different input resolutions: metric A is reported once per second and metric B once per minute. In cases like these, SignalFx chooses the coarser of the two resolutions in order to ‘line up’ the data points for both plotting and analytics. The metric with the finer resolution is aggregated appropriately to the minute boundary (e.g. counters are summed) for accurate computation. This ensures that both metrics have a data point available whenever output needs to be displayed or a condition needs to be alerted on.
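The alignment described above can be sketched in a few lines of Python. This is an illustrative sketch only, not SignalFx's implementation: the function names, the `(timestamp_seconds, value)` input format, and the choice to label each bucket by its start time are all assumptions, and both metrics are treated as counters so that summing is the appropriate rollup.

```python
from collections import defaultdict


def rollup_sum(points, interval_s):
    """Sum (timestamp_s, value) points into buckets of interval_s seconds.

    Each bucket is keyed by its start time (an assumption of this sketch).
    """
    buckets = defaultdict(float)
    for ts, value in points:
        buckets[ts - ts % interval_s] += value
    return dict(buckets)


def ratio_at_coarser_resolution(a_points, a_res_s, b_points, b_res_s):
    """Roll both counters up to the coarser resolution, then compute A / B."""
    interval_s = max(a_res_s, b_res_s)  # the coarser of the two resolutions
    a = rollup_sum(a_points, interval_s)
    b = rollup_sum(b_points, interval_s)
    # Only emit a ratio where both metrics have a value and B is non-zero.
    return {
        ts: a[ts] / b[ts]
        for ts in sorted(a.keys() & b.keys())
        if b[ts] != 0
    }


# Metric A: a counter reporting 2 every second for two minutes.
a_points = [(t, 2.0) for t in range(120)]
# Metric B: a counter reporting 60 once per minute.
b_points = [(0, 60.0), (60, 60.0)]

ratios = ratio_at_coarser_resolution(a_points, 1, b_points, 60)
print(ratios)  # {0: 2.0, 60: 2.0}
```

For a gauge rather than a counter, a different rollup (for example a mean) would replace the sum.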

Timeshift: It’s just this easy.

The display resolution of a metric, whether raw as sent in or derived as the result of an analytics pipeline, is also affected by its age. Users frequently use SignalFx’s timeshift function to compare a given metric to its own history: looking at changes hour-over-hour, day-over-day, week-over-week, and so on. SignalFx aggregates metrics to one-hour resolution once they are older than three months. A chart that compares a metric with an input resolution of one second to the same metric timeshifted by one year will therefore have a display resolution of one hour, the coarser of the two resolutions.
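As a rough illustration of the timeshift case, the sketch below picks a chart resolution from each series' input resolution and age. The function names are hypothetical, "three months" is approximated as 90 days, and only the one-hour rollup mentioned above is modeled; any other retention tiers are not covered.

```python
HOUR_S = 3600
# Assumption: "three months" is approximated as 90 days for this sketch.
THREE_MONTHS_S = 90 * 24 * HOUR_S
YEAR_S = 365 * 24 * HOUR_S


def stored_resolution_s(input_res_s, age_s):
    """Resolution at which data of a given age is available."""
    if age_s > THREE_MONTHS_S:
        # Older data has been aggregated to one-hour resolution.
        return max(input_res_s, HOUR_S)
    return input_res_s


def chart_resolution_s(series):
    """series: list of (input_res_s, age_s); the coarsest resolution wins."""
    return max(stored_resolution_s(res, age) for res, age in series)


# A 1s metric compared with itself timeshifted by one year.
print(chart_resolution_s([(1, 0), (1, YEAR_S)]))  # 3600
```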

[Image: minimum resolution settings menu]

Users can view data at a finer resolution than the one SignalFx has selected by setting a minimum resolution for a chart. This makes the chart's resolution finer than SignalFx’s default, both for displaying the graph and for evaluating the analytics functions defined in the chart. The setting applies even if the selected minimum is finer than the incoming data rate or than the finest resolution available among the chart's metrics.

Using a resolution that is finer than the available data creates a time series with gaps, and these gaps need to be filled using some kind of extrapolation. For example, if the input resolution for a metric is 10s but the user-selected resolution for a chart is 1s, then every 10s there will be an actual value, preceded by 9 timepoints that have no values. We need something to display or evaluate in that chart! To fill those 9 slots, users can select an extrapolation policy (technically interpolation, depending on the policy chosen) for the chart:

  • Zero: When no datapoint is available, display and evaluate a zero for each empty slot.
  • Last Value: When no datapoint is available, display and evaluate the last value received for every empty slot.
  • Linear: When no datapoint is available, draw a straight line between the last two datapoints and linearly project that line to the current time, displaying and evaluating the projected datapoint from that line. Warning: although this can generate a more pleasing display, this policy should be used with caution when doing analytics or creating alert detectors.  
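The three policies can be compared side by side with a small sketch. This is not SignalFx code: `fill_gaps`, the policy names, and the use of `None` for an empty slot are assumptions made for illustration. The linear branch projects forward from the last two real datapoints, as described in the list above, and leaves a slot empty when fewer than two real datapoints have been seen.

```python
def fill_gaps(slots, policy):
    """Fill empty slots (None) in a list of values ordered oldest first."""
    filled = []
    known = []  # (index, value) of the real datapoints seen so far
    for i, value in enumerate(slots):
        if value is not None:
            known.append((i, value))
            filled.append(value)
        elif policy == "zero":
            filled.append(0.0)
        elif policy == "last_value":
            filled.append(known[-1][1] if known else None)
        elif policy == "linear":
            if len(known) >= 2:
                (i0, v0), (i1, v1) = known[-2], known[-1]
                slope = (v1 - v0) / (i1 - i0)
                # Project the line through the last two real points forward.
                filled.append(v1 + slope * (i - i1))
            else:
                filled.append(None)  # not enough history to draw a line
        else:
            raise ValueError(f"unknown policy: {policy}")
    return filled


slots = [10.0, None, None, 16.0, None, None]
print(fill_gaps(slots, "zero"))        # [10.0, 0.0, 0.0, 16.0, 0.0, 0.0]
print(fill_gaps(slots, "last_value"))  # [10.0, 10.0, 10.0, 16.0, 16.0, 16.0]
print(fill_gaps(slots, "linear"))      # [10.0, None, None, 16.0, 18.0, 20.0]
```

The linear output shows why the warning matters: the values 18.0 and 20.0 were never reported, so an alert condition evaluated on them would be reacting to a projection rather than to data.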

Another place where resolutions come into play is when computing moving transformations, like a one-hour moving average or sum of counts over a moving time window. SignalFx dynamically samples the datapoints that will be used in the transformation (at the nearest 1s, 5s, 10s, 30s, 1m or 1h interval that lines up across all metrics) to ensure that we can deliver results quickly and predictably, even for thousands of metrics at once.
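One way to picture this is sketched below. The rule used to pick an interval (the smallest step from the list above that is a whole multiple of every metric's resolution) is an assumption about what "lines up" means, not a documented SignalFx rule, and both function names are hypothetical. The moving average runs over already-sampled values.

```python
# Candidate sampling intervals from the text, in seconds.
LADDER_S = (1, 5, 10, 30, 60, 3600)


def sampling_interval_s(resolutions_s):
    """Smallest ladder step that is a whole multiple of every resolution."""
    for step in LADDER_S:
        if all(step % res == 0 for res in resolutions_s):
            return step
    return LADDER_S[-1]  # fall back to the coarsest step


def moving_average(values, window):
    """Moving average over the last `window` samples (fewer at the start)."""
    averages = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the sample leaving the window
        averages.append(running / min(i + 1, window))
    return averages


print(sampling_interval_s([1, 10]))          # 10
print(sampling_interval_s([5, 30]))          # 30
print(moving_average([1.0, 2.0, 3.0, 4.0], 2))  # [1.0, 1.5, 2.5, 3.5]
```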

Finally, the SignalFx front end chooses a best-fit display resolution based on the actual number of pixels available per chart on the screen. There is little value in calculating more datapoints than your screen can actually hold; doing so would only slow visualization without enhancing relevance or actionability.
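The pixel limit amounts to simple arithmetic: a chart cannot usefully show more than about one datapoint per horizontal pixel. The sketch below assumes exactly that one-point-per-pixel rule and a hypothetical function name; the actual front end may round to preferred intervals or weigh other factors.

```python
import math


def pixel_limited_resolution_s(time_range_s, chart_width_px, data_res_s):
    """Coarsen the resolution so there is at most one datapoint per pixel."""
    seconds_per_pixel = math.ceil(time_range_s / chart_width_px)
    # Never go finer than the data itself.
    return max(data_res_s, seconds_per_pixel)


# One day of 1s data on an 800-pixel-wide chart: 86400 / 800 = 108s per pixel.
print(pixel_limited_resolution_s(86400, 800, 1))  # 108
# Fifteen minutes of 1s data on a 900-pixel-wide chart fits at full detail.
print(pixel_limited_resolution_s(900, 900, 1))    # 1
```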

SignalFx ensures that you see the right level of detail at the right time. 

Published at DZone with permission of
