PDL-Stats statistics modules in Perl Data Language

PDL::Stats::TS

  • NAME
  • DESCRIPTION
  • SYNOPSIS
  • FUNCTIONS
  • METHODS
  • REFERENCES

    NAME

    PDL::Stats::TS -- basic time series functions

    DESCRIPTION

    The terms FUNCTIONS and METHODS are arbitrarily used to refer to methods that are threadable and methods that are NOT threadable, respectively. Plots require PDL::Graphics::PGPLOT.

    ***EXPERIMENTAL!*** In particular, bad value support is spotty. USE WITH DISCRETION!

    SYNOPSIS

        use PDL::LiteF;
        use PDL::NiceSlice;
        use PDL::Stats::TS;
    
        my $r = $data->acf(5);

    FUNCTIONS

    acf

      Signature: (x(t); int h(); [o]r(h+1))

    Autocorrelation function for up to lag h. If h is not specified it's set to t-1 by default.

    acf does not process bad values.

    usage:

        perldl> $a = sequence 10
    
        # lags 0 .. 5
    
        perldl> p $a->acf(5)
        [1 0.7 0.41212121 0.14848485 -0.078787879 -0.25757576]

    acvf

      Signature: (x(t); int h(); [o]v(h+1))

    Autocovariance function for up to lag h. If h is not specified it's set to t-1 by default.

    acvf does not process bad values.

    usage:

        perldl> $a = sequence 10
    
        # lags 0 .. 5
    
        perldl> p $a->acvf(5)
        [82.5 57.75 34 12.25 -6.5 -21.25]
    
        # autocorrelation
        
        perldl> p $a->acvf(5) / $a->acvf(0)
        [1 0.7 0.41212121 0.14848485 -0.078787879 -0.25757576]
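
    The numbers above are consistent with an autocovariance computed as a plain sum of cross-products of deviations from the mean (not divided by the series length), and an autocorrelation obtained by scaling with the lag-0 value. The following pure-Perl sketch reproduces them without PDL; the helper names acvf_sketch and acf_sketch are hypothetical and are not part of the module.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Unnormalized autocovariance for lags 0 .. $h (illustrative helper, not the PDL routine).
sub acvf_sketch {
    my ($x, $h) = @_;
    my $n = @$x;
    $h = $n - 1 unless defined $h;          # default lag t-1, as documented
    my $mean = sum(@$x) / $n;
    my @v;
    for my $lag (0 .. $h) {
        my $s = 0;
        for my $t (0 .. $n - 1 - $lag) {
            $s += ($x->[$t] - $mean) * ($x->[$t + $lag] - $mean);
        }
        push @v, $s;
    }
    return \@v;
}

# Autocorrelation: autocovariance scaled by its lag-0 value.
sub acf_sketch {
    my ($x, $h) = @_;
    my $v = acvf_sketch($x, $h);
    return [ map { $_ / $v->[0] } @$v ];
}

my @series = (0 .. 9);                      # same data as sequence 10
my $v = acvf_sketch(\@series, 5);
my $r = acf_sketch(\@series, 5);
print "@$v\n";                              # 82.5 57.75 34 12.25 -6.5 -21.25
print join(' ', map { sprintf '%.4f', $_ } @$r), "\n";
```

    Because the scaling factor cancels in the ratio, acf is the same whether or not the autocovariance is divided by the series length.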

    diff

      Signature: (x(t); [o]dx(t))

    Differencing. DX(t) = X(t) - X(t-1), DX(0) = X(0). Can be done inplace.

    diff does not process bad values. It will set the bad-value flag of all output piddles if the flag is set for any of the input piddles.

    inte

      Signature: (x(n); [o]ix(n))

    Integration. Opposite of differencing. IX(t) = X(t) + IX(t-1), IX(0) = X(0). Can be done inplace.

    inte does not process bad values. It will set the bad-value flag of all output piddles if the flag is set for any of the input piddles.
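
    As a rough illustration of how diff and inte relate, the pure-Perl sketch below differences a short series and then rebuilds it. It assumes that inte is the running sum that undoes diff, which is what "opposite of differencing" implies; diff_sketch and inte_sketch are hypothetical helpers, not the PDL functions.

```perl
use strict;
use warnings;

# DX(t) = X(t) - X(t-1), DX(0) = X(0)
sub diff_sketch {
    my @x = @_;
    my @dx;
    for my $t (0 .. $#x) {
        push @dx, $t == 0 ? $x[0] : $x[$t] - $x[$t - 1];
    }
    return @dx;
}

# Running sum (assumed form of the inverse): each output adds X(t) to the previous output.
sub inte_sketch {
    my @x = @_;
    my @ix;
    my $acc = 0;
    for my $value (@x) {
        $acc += $value;
        push @ix, $acc;
    }
    return @ix;
}

my @x    = (3, 5, 4, 8, 8);
my @dx   = diff_sketch(@x);      # 3 2 -1 4 0
my @back = inte_sketch(@dx);     # 3 5 4 8 8, the original series
print "@dx\n@back\n";
```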

    dseason

      Signature: (x(t); int d(); [o]xd(t))

    Deseasonalize data using a moving average filter whose window is the size of period d.

    dseason does handle bad values. It will set the bad-value flag of all output piddles if the flag is set for any of the input piddles.

    fill_ma

      Signature: (x(t); int q(); [o]xf(t))

    Fill missing values with a moving average. xf(t) = sum(x(t-q .. t-1, t+1 .. t+q)) / (2q).

    fill_ma does handle bad values. The bad flag of the output pdl is cleared unless the specified window size q is too small and bad values remain.

      my $x_filled = $x->fill_ma( $q );
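
    The idea can be sketched in pure Perl, using undef to stand in for a bad value. This is only an illustration under assumptions: the hypothetical helper fill_ma_sketch averages whichever neighbours within q positions are available, which equals the formula above when all 2q neighbours exist, and it says nothing certain about how the module treats series edges or adjacent gaps.

```perl
use strict;
use warnings;

# Replace undef entries with the mean of the defined neighbours within $q positions.
sub fill_ma_sketch {
    my ($x, $q) = @_;
    my @out = @$x;
    for my $t (0 .. $#$x) {
        next if defined $x->[$t];
        # indices t-q .. t+q that fall inside the series, excluding t itself
        my @idx = grep { $_ >= 0 && $_ <= $#$x && $_ != $t } ($t - $q) .. ($t + $q);
        my @nb  = grep { defined $_ } map { $x->[$_] } @idx;
        next unless @nb;                   # window too small: value stays missing
        my $sum = 0;
        $sum += $_ for @nb;
        $out[$t] = $sum / @nb;
    }
    return \@out;
}

my $filled = fill_ma_sketch([1, 2, undef, 4, 5], 2);   # gap becomes (1+2+4+5)/4 = 3
print join(' ', map { defined $_ ? $_ : 'BAD' } @$filled), "\n";
```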

    filter_exp

      Signature: (x(t); a(); [o]xf(t))

    Filter, exponential smoothing. xf(t) = a * x(t) + (1-a) * xf(t-1)

    filter_exp does not process bad values. It will set the bad-value flag of all output piddles if the flag is set for any of the input piddles.
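
    The recursion can be traced by hand with a small pure-Perl sketch. The starting value xf(0) = x(0) is an assumption made here for illustration, since the text above does not state how the first element is initialized; filter_exp_sketch is a hypothetical helper.

```perl
use strict;
use warnings;

# xf(t) = a * x(t) + (1 - a) * xf(t-1)
sub filter_exp_sketch {
    my ($x, $alpha) = @_;
    return [] unless @$x;
    my @xf = ($x->[0]);                    # assumption: xf(0) = x(0)
    for my $t (1 .. $#$x) {
        push @xf, $alpha * $x->[$t] + (1 - $alpha) * $xf[-1];
    }
    return \@xf;
}

my $xf = filter_exp_sketch([0, 10, 10, 10], 0.5);
print "@$xf\n";                            # 0 5 7.5 8.75
```

    A smaller a gives more weight to the previous smoothed value, so the output reacts more slowly to the jump from 0 to 10.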

    filter_ma

      Signature: (x(t); int q(); [o]xf(t))

    Filter, moving average. xf(t) = sum(x(t-q .. t+q)) / (2q + 1)

    filter_ma does not process bad values. It will set the bad-value flag of all output piddles if the flag is set for any of the input piddles.
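
    A minimal pure-Perl sketch of the centered window follows. It computes only the interior points t = q .. n-1-q, where the full window of 2q + 1 values exists; the documentation above does not say how the module treats the first and last q points, so they are left out here. filter_ma_sketch is a hypothetical helper.

```perl
use strict;
use warnings;

# xf(t) = sum(x(t-q .. t+q)) / (2q + 1), interior points only
sub filter_ma_sketch {
    my ($x, $q) = @_;
    my @xf;
    for my $t ($q .. $#$x - $q) {
        my $sum = 0;
        $sum += $x->[$_] for ($t - $q) .. ($t + $q);
        push @xf, $sum / (2 * $q + 1);
    }
    return \@xf;
}

my $ma = filter_ma_sketch([1, 2, 3, 4, 10], 1);
print join(' ', map { sprintf '%.3f', $_ } @$ma), "\n";   # 2.000 3.000 5.667
```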

    mae

      Signature: (a(n); b(n); float+ [o]c())

    Mean absolute error. MAE = 1/n * sum( abs(y - y_pred) )

    Usage:

        $mae = $y->mae( $y_pred );

    mae does handle bad values. It will set the bad-value flag of all output piddles if the flag is set for any of the input piddles.

    mape

      Signature: (a(n); b(n); float+ [o]c())

    Mean absolute percent error. MAPE = 1/n * sum(abs((y - y_pred) / y))

    Usage:

        $mape = $y->mape( $y_pred );

    mape does handle bad values. It will set the bad-value flag of all output piddles if the flag is set for any of the input piddles.

    wmape

      Signature: (a(n); b(n); float+ [o]c())

    Weighted mean absolute percent error. WMAPE = avg(abs(error)) / avg(abs(data)). Much more robust than mape against division-by-zero errors (cf. Schütz & Kolassa, 2006).

    Usage:

        $wmape = $y->wmape( $y_pred );

    wmape does handle bad values. It will set the bad-value flag of all output piddles if the flag is set for any of the input piddles.
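
    To compare the three error measures side by side, here is a pure-Perl sketch of the formulas given above. The helper names are hypothetical, and bad-value handling is not modelled. The second data set contains a zero observation, which breaks the MAPE formula but not the WMAPE formula.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Sum of absolute errors between observed $y and predicted $p (array refs).
sub abs_err_sum {
    my ($y, $p) = @_;
    return sum(map { abs($y->[$_] - $p->[$_]) } 0 .. $#$y);
}

sub mae_sketch {
    my ($y, $p) = @_;
    my $n = @$y;
    return abs_err_sum($y, $p) / $n;
}

sub mape_sketch {
    my ($y, $p) = @_;
    my $n = @$y;
    return sum(map { abs(($y->[$_] - $p->[$_]) / $y->[$_]) } 0 .. $#$y) / $n;
}

# avg(abs(error)) / avg(abs(data)); the 1/n factors cancel.
sub wmape_sketch {
    my ($y, $p) = @_;
    return abs_err_sum($y, $p) / sum(map { abs($_) } @$y);
}

my @y      = (10, 20, 40);
my @y_pred = (12, 18, 44);
my $mae    = mae_sketch(\@y, \@y_pred);     # 8/3
my $mape   = mape_sketch(\@y, \@y_pred);    # (0.2 + 0.1 + 0.1) / 3
my $wmape  = wmape_sketch(\@y, \@y_pred);   # 8/70

# A zero observation: plain Perl dies on the division inside mape_sketch,
# while wmape_sketch only divides by the total of abs(data).
my @y0       = (0, 20, 40);
my @y0_pred  = (2, 18, 44);
my $wmape0   = wmape_sketch(\@y0, \@y0_pred);              # 8/60
my $mape0_ok = eval { mape_sketch(\@y0, \@y0_pred); 1 };   # undef: division by zero
printf "%.4f %.4f %.4f %.4f\n", $mae, $mape, $wmape, $wmape0;
```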

    portmanteau

      Signature: (r(h); longlong t(); [o]Q())

    Portmanteau significance test (Ljung-Box) for autocorrelations.

    Usage:

        perldl> $a = sequence 10
    
        # acf for lags 0-5
        # lag 0 excluded from portmanteau
        
        perldl> p $chisq = $a->acf(5)->portmanteau( $a->nelem )
        11.1753902662994
       
        # get p-value from chisq distr
    
        perldl> use PDL::GSL::CDF
        perldl> p 1 - gsl_cdf_chisq_P( $chisq, 5 )
        0.0480112934306748

    portmanteau does not process bad values. It will set the bad-value flag of all output piddles if the flag is set for any of the input piddles.
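
    For reference, the Ljung-Box statistic is Q = n (n + 2) * sum over k = 1 .. h of r(k)^2 / (n - k), where n is the series length and r(k) the autocorrelation at lag k. The pure-Perl sketch below applies that formula to the autocovariances listed under acvf and arrives at the chi-square value shown in the usage above; ljung_box_sketch is a hypothetical helper, not the module's code.

```perl
use strict;
use warnings;

# $r: autocorrelations for lags 0 .. h (array ref), $n: length of the series
sub ljung_box_sketch {
    my ($r, $n) = @_;
    my $sum = 0;
    for my $k (1 .. $#$r) {                # lag 0 is excluded
        $sum += $r->[$k] ** 2 / ($n - $k);
    }
    return $n * ($n + 2) * $sum;
}

# autocovariances of sequence 10 for lags 0 .. 5, as listed under acvf
my @acvf  = (82.5, 57.75, 34, 12.25, -6.5, -21.25);
my @acf   = map { $_ / $acvf[0] } @acvf;
my $chisq = ljung_box_sketch(\@acf, 10);
printf "%.4f\n", $chisq;                   # 11.1754
```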

    pred_ar

      Signature: (x(d); b(p|p+1); int t(); [o]pred(t))

    Calculates predicted values up to period t (extends the current series up to period t) for an autoregressive series, with or without a constant. If there is a constant, it is the last element in b, as would be returned by ols or ols_t.

    pred_ar does not process bad values.

    Default options (case insensitive):

        CONST  => 1,    # last element of b is the constant

    Usage:

        perldl> $x = sequence 2
    
          # last element is constant
        perldl> $b = pdl(.8, -.2, .3)
    
        perldl> p $x->pred_ar($b, 7)
        [0       1     1.1    0.74   0.492  0.3656 0.31408]
     
          # no constant
        perldl> p $x->pred_ar($b(0:1), 7, {const=>0})
        [0       1     0.8    0.44   0.192  0.0656 0.01408]
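
    Both outputs above can be reproduced with a short pure-Perl sketch. It runs the recursion pred(t) = b(0) * pred(t-1) + b(1) * pred(t-2) + ... on the coefficients alone and then adds the constant to each newly predicted value. This reading is inferred only from the documented numbers, not from the module source, and pred_ar_sketch is a hypothetical helper.

```perl
use strict;
use warnings;

# $x: observed series, $coef: AR coefficients (lag 1 first, constant NOT included),
# $t: total length wanted, $const: constant added to predicted values (0 for none)
sub pred_ar_sketch {
    my ($x, $coef, $t, $const) = @_;
    my @ar = @$x;
    while (@ar < $t) {
        my $next = 0;
        $next += $coef->[$_] * $ar[-1 - $_] for 0 .. $#$coef;
        push @ar, $next;
    }
    my @pred = @ar;
    # assumption read off the documented output: the constant is added after the recursion
    $pred[$_] += $const for scalar(@$x) .. $#pred;
    return \@pred;
}

my $with_const = pred_ar_sketch([0, 1], [0.8, -0.2], 7, 0.3);
my $no_const   = pred_ar_sketch([0, 1], [0.8, -0.2], 7, 0);
print join(' ', map { sprintf '%.5f', $_ } @$with_const), "\n";
print join(' ', map { sprintf '%.5f', $_ } @$no_const), "\n";
```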

    season_m

    Given the length of a season, returns the seasonal mean and variance for each period (only the seasonal mean in scalar context).

    Default options (case insensitive):

        START_POSITION => 0,     # series starts at this position in season
        MISSING        => -999,  # internal mark for missing points in season
        PLOT  => 1,              # boolean
          # see PDL::Graphics::PGPLOT::Window for next options
        WIN   => undef,          # pass pgwin object for more plotting control
        DEV   => '/xs',          # open and close dev for plotting if no WIN
                                 # defaults to '/png' in Windows
        COLOR => 1,

    See PDL::Graphics::PGPLOT for detailed graphing options.

        my ($m, $ms) = $data->season_m( 24, { START_POSITION=>2 } );

    plot_dseason

    Plots deseasonalized data and original data points. Opens and closes default window for plotting unless a pgwin object is passed in options. Returns deseasonalized data.

    Default options (case insensitive):

        WIN   => undef,
        DEV   => '/xs',    # open and close dev for plotting if no WIN
                           # defaults to '/png' in Windows
        COLOR => 1,        # data point color

    See PDL::Graphics::PGPLOT for detailed graphing options.

    METHODS

    plot_acf

    Plots and returns autocorrelations for a time series.

    Default options (case insensitive):

        SIG  => 0.05,      # can specify .10, .05, .01, or .001
        DEV  => '/xs',     # open and close dev for plotting
                           # defaults to '/png' in Windows

    Usage:

        perldl> $a = sequence 10
        
        perldl> p $r = $a->plot_acf(5)
        [1 0.7 0.41212121 0.14848485 -0.078787879 -0.25757576]

    REFERENCES

    Brockwell, P.J., & Davis, R.A. (2002). Introduction to Time Series and Forecasting (2nd ed.). New York, NY: Springer.

    Schütz, W., & Kolassa, S. (2006). Foresight: advantages of the MAD/Mean ratio over the MAPE. Retrieved Jan 28, 2010, from http://www.saf-ag.com/226+M5965d28cd19.html