A time series can be viewed as a function of time.
Such a function can be illustrated as a time-series graph, as follows:
Time-series analysis has two goals:
- Modelling time series: to gain insight into the mechanisms or underlying forces that generate the time series.
- Forecasting time series: to predict the future values of the time-series variables.
A time series can be characterised by the following four major components:
- Trend or long-term movements: the general direction in which a time series is moving over a long interval of time. The movement can be displayed by a trend curve (dashed line in the above figure).
- Cyclic movements or cyclic variations: long-term oscillations about a trend line, which may or may not be periodic. The cycles need not follow closely similar patterns and are not of fixed period. The duration of these fluctuations is usually more than 2 years, e.g. business or economic cycles.
- Seasonal movements or seasonal variations: systematic or calendar-related movements, e.g. the sudden increase in sales of chocolates and flowers before Valentine’s Day.
- Irregular or random movements: sporadic motion of a time series due to random or chance events, such as labor disputes or floods.
The trend, cyclic, seasonal, and irregular movements are represented by the variables T, C, S, I, respectively.
Time-series modelling is also referred to as the decomposition of a time series into these four basic movements. The time-series variable $Y$ can be modelled as either the product of the four variables (i.e., $Y = T \times C \times S \times I$) or their sum (i.e., $Y = T + C + S + I$). This choice is typically empirical.
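As a rough illustration of how the two models differ, the sketch below composes a series both ways from made-up trend and seasonal values. The numbers and the function name `compose` are hypothetical, and the cyclic and irregular components are left out for brevity. Under the multiplicative model the seasonal swing grows with the trend; under the additive model it stays the same size.

```python
def compose(trend, seasonal, model):
    """Combine trend and seasonal components under the chosen model.

    Cyclic and irregular components are omitted to keep the sketch short.
    """
    if model == "multiplicative":
        return [t * s for t, s in zip(trend, seasonal)]
    if model == "additive":
        return [t + s for t, s in zip(trend, seasonal)]
    raise ValueError("model must be 'multiplicative' or 'additive'")


trend = [100, 100, 200, 200]  # hypothetical trend values

# Seasonal factors around 1 for the product, seasonal offsets around 0 for the sum.
mult = compose(trend, [0.5, 1.5, 0.5, 1.5], "multiplicative")
add = compose(trend, [-50, 50, -50, 50], "additive")

print(mult)  # [50.0, 150.0, 100.0, 300.0] -> swing of 100, then 200
print(add)   # [50, 150, 150, 250]         -> swing of 100 both times
```

Plotting the data and checking whether the seasonal swing scales with the level of the series is one informal way to make the empirical choice.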
Detect Seasonal Patterns by Autocorrelation Analysis
Seasonal patterns can be identified by an autocorrelation analysis between two intervals of the sequence.
Let $y_1, y_2, \ldots, y_N$ be the time series.
Compute the correlation coefficient between the two subsequences $y_i$ and $y_{i+k}$ (for $i = 1, \ldots, N-k$), given a time lag $k$. The lag $k$ is generally defined in advance, for example 365 days.
- Positive coefficient: positive correlation. If one variable increases then the other also increases.
- Negative coefficient: negative correlation. If one variable increases then the other decreases.
- Zero coefficient: no correlation between the two subsequences.
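The lagged correlation described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the Pearson correlation coefficient is the one intended; the function name `lag_correlation` and the sample series are hypothetical.

```python
from math import sqrt


def lag_correlation(y, k):
    """Pearson correlation between y[i] and y[i + k] over all valid i."""
    if not 0 < k < len(y) - 1:
        raise ValueError("lag k must leave at least two pairs")
    a, b = y[:-k], y[k:]  # the two subsequences, offset by the lag k
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant subsequence")
    return cov / sqrt(var_a * var_b)


series = [1, 2, 3, 4] * 6  # hypothetical series that repeats every 4 steps

print(lag_correlation(series, 4))  # close to 1: the lag matches the period
print(lag_correlation(series, 2))  # negative: half a period out of step
```

A coefficient close to 1 at a candidate lag (such as 365 for daily data) suggests a seasonal pattern with that period.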
If we find such a correlation, we can use a seasonal index number to adjust the data for seasonal variation, that is, to remove seasonal fluctuations. We divide the lag period $k$ into small chunks $c_i$ of roughly equal size (e.g. when $k = 365$, each $c_i$ may have length 7 for chunks of each week through the year, or length of about 30 for each month of the year). We average the targeted variable $y$ over each chunk $c_i$ and divide that by the average of $y$ over the whole period of length $k$. This gives us a seasonal index value for each chunk $c_i$. The seasonal effect can be removed from the time series by dividing each $y_i$ by the seasonal index value of the chunk in which it occurs, since an index above 1 marks a chunk whose values run above the overall average.
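The seasonal index procedure can be sketched as follows. This is one possible reading rather than a definitive implementation: it assumes the series length is a multiple of $k$, pools values from all cycles that fall in the same chunk, and assumes a non-zero overall mean. The function names and the sample data are hypothetical.

```python
def seasonal_indices(y, k, chunk_len):
    """Return one seasonal index per chunk of the period k.

    Position i falls in chunk (i % k) // chunk_len, so values from every
    cycle are pooled. Assumes len(y) is a positive multiple of k.
    """
    if len(y) == 0 or len(y) % k != 0:
        raise ValueError("series length must be a positive multiple of k")
    n_chunks = -(-k // chunk_len)  # ceiling division; last chunk may be shorter
    totals = [0.0] * n_chunks
    counts = [0] * n_chunks
    for i, value in enumerate(y):
        c = (i % k) // chunk_len
        totals[c] += value
        counts[c] += 1
    overall_mean = sum(y) / len(y)
    # Index = chunk average divided by the overall average.
    return [(t / n) / overall_mean for t, n in zip(totals, counts)]


def deseasonalise(y, k, chunk_len):
    """Divide each value by the index of the chunk in which it occurs."""
    idx = seasonal_indices(y, k, chunk_len)
    return [value / idx[(i % k) // chunk_len] for i, value in enumerate(y)]


sales = [10, 10, 20, 20] * 3  # hypothetical data with period k = 4

print(seasonal_indices(sales, k=4, chunk_len=2))  # roughly [0.667, 1.333]
print(deseasonalise(sales, k=4, chunk_len=2))     # every value close to 15
```

In this toy series the second half of each cycle runs at twice the level of the first half; after adjustment every value sits at the overall average of 15.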
Detect Trends in the Data: Moving Average of Order n
Irregular or unwanted movements in the data can be reduced by replacing each value with its moving average, which makes trends more apparent.
From a time series $y_1, y_2, \ldots, y_N$, generate another time series consisting of the arithmetic means of $n$ consecutive observations:
$$\frac{y_1 + y_2 + \cdots + y_n}{n},\ \frac{y_2 + y_3 + \cdots + y_{n+1}}{n},\ \frac{y_3 + y_4 + \cdots + y_{n+2}}{n},\ \ldots$$
A moving average tends to reduce the amount of variation present in the data set. Thus the process of replacing the time series by its moving average reduces unwanted fluctuations (smoothing of the time series), as with the 10-day moving average in the diagram above. Note that the fluctuations appear smoothed but also delayed in time.
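A moving average of order $n$ can be sketched directly from its definition. The function name `moving_average` and the nine sample values are hypothetical and are not the values from the course example.

```python
def moving_average(y, n):
    """Arithmetic means of every n consecutive observations."""
    if not 1 <= n <= len(y):
        raise ValueError("order n must be between 1 and len(y)")
    return [sum(y[i:i + n]) / n for i in range(len(y) - n + 1)]


values = [1, 5, 3, 7, 5, 9, 7, 11, 9]  # hypothetical observations

print(moving_average(values, 3))  # [3.0, 5.0, 5.0, 7.0, 7.0, 9.0, 9.0]
```

The result has $N - n + 1$ entries, and the zig-zag in the raw values is flattened into a steadily rising sequence.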
Weighted moving average of order $n$:
A weighted moving average reduces the effect of extreme values and irregular variations by using a vector of weights, typically chosen with greater values for the central elements.
Given a sequence of weights $w_1, w_2, \ldots, w_n$, we can compute the weighted moving average of order $n$ as:
$$\frac{w_1 y_i + w_2 y_{i+1} + \cdots + w_n y_{i+n-1}}{w_1 + w_2 + \cdots + w_n}, \quad i = 1, \ldots, N - n + 1$$
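The weighted version differs only in that each window is multiplied by the weights and normalised by their sum. A minimal sketch, assuming that normalisation; the function name and sample values are hypothetical:

```python
def weighted_moving_average(y, weights):
    """Weighted means of every len(weights) consecutive observations."""
    n = len(weights)
    total = sum(weights)
    if n == 0 or n > len(y):
        raise ValueError("need between 1 and len(y) weights")
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return [
        sum(w * v for w, v in zip(weights, y[i:i + n])) / total
        for i in range(len(y) - n + 1)
    ]


values = [1, 5, 3, 7, 5, 9, 7, 11, 9]  # hypothetical observations

print(weighted_moving_average(values, (1, 4, 1)))
# [4.0, 4.0, 6.0, 6.0, 8.0, 8.0, 10.0]
```

With equal weights such as (1, 1, 1) the function reduces to the plain moving average of order 3.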
Example: Given a sequence of nine values, we can compute its moving average of order 3, and its weighted moving average of order 3 using the weights (1, 4, 1).
Here, the moving average is shown time-shifted in the sequence so that each mean lines up with the middle of the time points over which it averages. Similarly, the weighted moving average value is shown lined up with the central (and most influential) observational value that it averages. This shifting approach tends to reduce the impression of an apparent time-delay in trend detection when compared with the raw observational data, although to do this, an average shown at a point in time is actually using observations made in the future of that time point.
Moving averages
The convention of centrally weighting the weight vector, as shown here, is taken from the text. It gives each smoothed value a look-ahead component, weighting the current value highly but also taking account of values both before and after it in the sequence. Alternatively, it is also common to lag weighted moving averages. Under this method, you would instead use a vector of weights decreasing backwards from the current point. For the example above you could use weights such as (1, 1, 4), applying the weight 4 to the current value in the series and the weights 1 and 1 to the previous two values.
For comparison, a lagging (1, 1, 4) weighted moving average in the table above would give the row: (null, null, 3, 3, 3, 4, 7.5, 7, 4).
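The two alignment conventions can be compared with a small sketch that pads the smoothed row with `None` so that it lines up with the original observations. The function name `aligned_wma`, its `align` parameter, and the sample values are hypothetical, so the output differs from the row quoted above.

```python
def aligned_wma(y, weights, align):
    """Weighted moving average padded with None to the length of y.

    align="centred": each mean sits under the middle of its window
                     (assumes an odd number of weights).
    align="lagging": each mean sits under the last value of its window.
    """
    n = len(weights)
    total = sum(weights)
    if n == 0 or n > len(y) or total == 0:
        raise ValueError("need 1..len(y) weights with a non-zero sum")
    means = [
        sum(w * v for w, v in zip(weights, y[i:i + n])) / total
        for i in range(len(y) - n + 1)
    ]
    if align == "centred":
        if n % 2 == 0:
            raise ValueError("centred alignment needs an odd number of weights")
        front = (n - 1) // 2
    elif align == "lagging":
        front = n - 1
    else:
        raise ValueError("align must be 'centred' or 'lagging'")
    back = (n - 1) - front
    return [None] * front + means + [None] * back


values = [1, 5, 3, 7, 5, 9, 7, 11, 9]  # hypothetical observations

print(aligned_wma(values, (1, 4, 1), "centred"))
# [None, 4.0, 4.0, 6.0, 6.0, 8.0, 8.0, 10.0, None]
print(aligned_wma(values, (1, 1, 4), "lagging"))
# [None, None, 3.0, 6.0, 5.0, 8.0, 7.0, 10.0, 9.0]
```

The centred row needs one future observation for each smoothed value, which is why its last entry is empty; the lagging row uses only current and past values, so it can be computed as soon as each observation arrives.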
Action: Watch this video explaining simple moving average and comparing with weighted moving average. Below the video you can find the spreadsheet of data used, with which you might like to experiment with your own variations.
Video example of Moving averages in practice
Spreadsheet of data as used in the video for moving averages
Action: For more information on this approach to time-series analysis, the Australian Bureau of Statistics has a good, brief tutorial: https://www.abs.gov.au/websitedbs/d3310114.nsf/home/time+series+analysis:+the+basics