Predicting a Number From a Time Series
carl wrote:
I have a time series like this:
[....
Is there a a good way to predict what the next number is the time series
will be and what the standard deviation is ?
Hopefully one of the statistics experts will comment. But looking at
some of the responses so far, I want to offer some caveats for your
consideration.
First, when trying to extrapolate data, it is wise to start by graphing
the data and looking at its behavior subjectively, not by blindly
manipulating the data numerically as if one approach fits all. (It
doesn't.) It is also wise to factor in any information that you might
know about the nature of the data; for example, if the data represents
month-to-month sales variation, you might know that in that industry,
there is a cyclical or seasonal tendency to the data.
Use the Chart Wizard to graph this data and create a linear trendline.
I mention a linear trendline only because that is what the FORECAST()
function does. I see an oscillating pattern around the trendline. One
interpretation might be that there is sharp dampening effect over time
which converges to the trendline. If that is your interpretation based
on your knowledge of the nature of the data (e.g. an exponential
decay), you might reasonably assume that the linear trendline does
indeed predict the extrapolation of the dampened data, and FORECAST()
is the proper tool to use. On the other hand, the oscillating behavior
might convince you that the linear trendline is inadequate for
extrapolating the data because there really is no dampening effect, but
merely a random (or cyclical) pattern of the data, perhaps centered
around the linear trendline; then again, perhaps the linear trendline
is an illusion. Again, that would depend on your knowledge of the
intrinsic nature of the data.
Second, be careful with how you might use the std dev of this data. If
you are using it simply as a measure of the deviation of the data
around the mean, that might be okay. It is a common statistic for that
purpose, albeit not the only one and not always the most appropriate
one to use. But you might have heard things like 68% of the predicted
data will lie within 1 std dev of the mean. That is valid if the data
is "normally distributed". But that does not seem to be true in this
case, unless once again you have knowledge of the intrinsic nature of
the data that suggests that it should be normally distributed -- in
which case, you might conclude this sampling simply does not accurately
represent the behavior of the population.
Finally, there is the matter of how the std dev (and the average) is
computed.
First, you need to decide if you want to treat this data truly as a
"time series" or as a collection of periodic data. For example, the
population every 10 years is a collection of periodic data. You might
compute the arithmetic average and perhaps the arithmetic std dev
(STDEV). But the rate of growth of the population every 10 years is a
"time series". You might compute the geometric average and perhaps the
geometric std dev.
If you want to consider this data to be a "time series", the geometric
mean computes the average percentage change per period. The geometric
mean can be computed by GEOMEAN(1+A1:A20)-1, entered as an array
formula (ctrl-shift-Enter). Similarly, I believe the geometric std dev
should be used. That is computed by EXP(STDEV(LN(1+A1:A20)))-1,
entered as an array formula. FYI, the geometric mean could also be
computed using the array formula EXP(AVERAGE(LN(1+A1:A20)))-1.
This is probably not the cookie-cutter solution that you were hoping
for. But I hope it gives you some things to think about before
crunching the numbers.
----- complete previous posting ----
carl wrote:
I have a time series like this:
.85%
5.42%
7.90%
8.28%
3.83%
3.47%
6.03%
4.19%
6.71%
8.74%
6.03%
7.71%
8.75%
7.01%
2.87%
5.23%
4.89%
6.33%
6.50%
6.22%
Is there a a good way to predict what the next number is the time series
will be and what the standard deviation is ?
Thank you in advance.
|