A Python library for unevenly-spaced time series analysis.

Traces was designed by the team at Datascope based on several practical applications in different domains, because it turns out unevenly-spaced data is actually pretty great, particularly for sensor data analysis.

Traces aims to make it simple to write readable code to:

Taking measurements at irregular intervals is common, but most tools are primarily designed for evenly-spaced measurements. Also, in the real world, time series have missing observations, or you may have multiple series with different frequencies: it can be useful to model these as unevenly-spaced.

To install traces, run this command in your terminal:

Quickstart: using traces

To see a basic use of traces, let’s look at these data from a light switch, also known as Big Data from the Internet of Things.

The main object in traces is a TimeSeries, which you create just like a dictionary, adding the five measurements at 6:00am, 7:45:56am, etc.

>>> from datetime import datetime
>>> import traces
>>> time_series = traces.TimeSeries()
>>> time_series[datetime(2042, 2, 1, 6, 0, 0)] = 0  # 6:00:00am
>>> time_series[datetime(2042, 2, 1, 7, 45, 56)] = 1  # 7:45:56am
>>> time_series[datetime(2042, 2, 1, 8, 51, 42)] = 0  # 8:51:42am
>>> time_series[datetime(2042, 2, 1, 12, 3, 56)] = 1  # 12:03:56pm
>>> time_series[datetime(2042, 2, 1, 12, 7, 13)] = 0  # 12:07:13pm

What if you want to know if the light was on at 11am? Unlike a Python dictionary, you can look up the value at any time, even if it’s not one of the measurement times.

>>> time_series[datetime(2042, 2, 1, 11, 0, 0)]  # 11:00am
0
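One way to picture this lookup: the series behaves like a step function, so the value at 11am is whatever was measured most recently at or before 11am. The sketch below reproduces that behaviour with only the standard library. The helper name `value_at` is hypothetical, and this is an illustration of the idea under that assumption, not a description of how traces is implemented internally.

```python
from bisect import bisect_right
from datetime import datetime

# The five light-switch measurements from above, sorted by time.
measurements = [
    (datetime(2042, 2, 1, 6, 0, 0), 0),
    (datetime(2042, 2, 1, 7, 45, 56), 1),
    (datetime(2042, 2, 1, 8, 51, 42), 0),
    (datetime(2042, 2, 1, 12, 3, 56), 1),
    (datetime(2042, 2, 1, 12, 7, 13), 0),
]
times = [t for t, _ in measurements]


def value_at(when, default=None):
    """Return the most recent value measured at or before `when`.

    Hypothetical helper for illustration; not part of the traces API.
    """
    # bisect_right gives the position just after any measurement at `when`.
    index = bisect_right(times, when)
    if index == 0:
        return default  # `when` is earlier than the first measurement
    return measurements[index - 1][1]


print(value_at(datetime(2042, 2, 1, 11, 0, 0)))  # 0
```

The last switch event before 11am is the one at 8:51:42am, which turned the light off, so the lookup gives 0.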

The distribution function gives you the fraction of time that the TimeSeries is in each state.

>>> time_series.distribution(
...     start=datetime(2042, 2, 1, 6, 0, 0),  # 6:00am
...     end=datetime(2042, 2, 1, 13, 0, 0)  # 1:00pm
... )
Histogram({0: 0.8355952380952381, 1: 0.16440476190476191})

The light was on about 16% of the time between 6am and 1pm.
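To see where the 16% comes from, the fraction can be worked out by hand: each state lasts from its measurement until the next one, and the durations are summed per state within the window. The following is a minimal sketch using only the standard library. The function `time_fractions` is a hypothetical name, and it assumes the window starts no earlier than the first measurement; it is not the library's own code.

```python
from collections import defaultdict
from datetime import datetime

# The five light-switch measurements from above, sorted by time.
measurements = [
    (datetime(2042, 2, 1, 6, 0, 0), 0),
    (datetime(2042, 2, 1, 7, 45, 56), 1),
    (datetime(2042, 2, 1, 8, 51, 42), 0),
    (datetime(2042, 2, 1, 12, 3, 56), 1),
    (datetime(2042, 2, 1, 12, 7, 13), 0),
]


def time_fractions(points, start, end):
    """Fraction of the window [start, end) spent in each state.

    Hypothetical helper for illustration; not part of the traces API.
    """
    seconds = defaultdict(float)
    # Each state holds until the next measurement; the last one holds until `end`.
    boundaries = [t for t, _ in points[1:]] + [end]
    for (begin, state), finish in zip(points, boundaries):
        # Clip the interval to the requested window.
        lo, hi = max(begin, start), min(finish, end)
        if hi > lo:
            seconds[state] += (hi - lo).total_seconds()
    total = sum(seconds.values())
    return {state: spent / total for state, spent in seconds.items()}


fractions = time_fractions(
    measurements,
    start=datetime(2042, 2, 1, 6, 0, 0),  # 6:00am
    end=datetime(2042, 2, 1, 13, 0, 0),  # 1:00pm
)
print(round(fractions[1], 4))  # 0.1644
```

The light is on for 3946 seconds in the morning and 197 seconds around noon, 4143 seconds out of the 25200 in the window.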

Adding more data…

Now let’s get a little more complicated and look at the sensor readings from forty lights in a building. How many lights are on throughout the day? The merge function takes the forty individual TimeSeries and efficiently merges them into one TimeSeries where each value is a list of the states of all lights.

>>> trace_list = [... list of forty traces.TimeSeries ...]
>>> count = traces.TimeSeries.merge(trace_list, operation=sum)

Here we also pass operation=sum, which sums the list of states at each time to give a TimeSeries of the number of lights that are on.

How many lights are typically on during business hours, from 8am to 6pm?

>>> histogram = count.distribution(
...     start=datetime(2042, 2, 1, 8, 0, 0),  # 8:00am
...     end=datetime(2042, 2, 1, 12 + 6, 0, 0)  # 6:00pm
... )
>>> histogram.median()
17

The distribution function returns a Histogram that can be used to get summary metrics such as the mean or quantiles.
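To make the merge-then-summarize idea concrete, here is a small standard-library sketch with three lights instead of forty. The readings are made up for illustration, times are plain hours rather than datetimes, and the helpers `value_at`, `merge_sum`, and `weighted_median` are hypothetical names. The merge here is deliberately naive and is not the efficient algorithm traces uses, and the library's Histogram may define quantiles differently.

```python
from bisect import bisect_right

# Made-up on/off readings for three lights, as (hour, state) pairs.
lights = [
    [(8, 0), (9, 1), (17, 0)],
    [(8, 1), (12, 0)],
    [(8, 0), (10, 1), (11, 0), (14, 1)],
]


def value_at(series, when):
    """Most recent state at or before `when` (0 if there is none yet)."""
    times = [t for t, _ in series]
    index = bisect_right(times, when)
    return series[index - 1][1] if index else 0


def merge_sum(series_list):
    """Step function of how many lights are on, as (hour, count) pairs."""
    change_times = sorted({t for series in series_list for t, _ in series})
    merged = []
    for t in change_times:
        count = sum(value_at(series, t) for series in series_list)
        # Only record a point when the count actually changes.
        if not merged or merged[-1][1] != count:
            merged.append((t, count))
    return merged


def weighted_median(series, start, end):
    """Smallest value whose cumulative time share reaches one half."""
    durations = {}
    boundaries = [t for t, _ in series[1:]] + [end]
    for (begin, value), finish in zip(series, boundaries):
        lo, hi = max(begin, start), min(finish, end)
        if hi > lo:
            durations[value] = durations.get(value, 0) + (hi - lo)
    half = sum(durations.values()) / 2
    running = 0
    for value in sorted(durations):
        running += durations[value]
        if running >= half:
            return value


count = merge_sum(lights)
print(count)
print(weighted_median(count, start=8, end=18))  # 2
```

In this toy data, one light is on for 4 hours of the window, two for 5 hours, and three for 1 hour, so the time-weighted median is 2.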