Chi-square goodness-of-fit test

Perform a chi-square goodness-of-fit test.

Usage

var chi2gof = require( '@stdlib/math/stats/chi2gof' );

chi2gof( x, y[, ...params][, opts] )

For an array or typed array of integer counts x, a chi-square goodness-of-fit test is computed for the null hypothesis that the values of x come from the discrete distribution specified by y. y can be an array of expected frequencies, an array of population probabilities that sum to one, or a string naming the discrete distribution to test against. In the last case, the parameters of the distribution must be supplied as additional arguments after y. The function returns an object holding the calculated test statistic, the p-value of the test, and the test decision.

var out;
var x;
var y;

x = [ 89, 37, 30, 28, 2 ];
y = [ 0.40, 0.20, 0.20, 0.15, 0.05 ];

// Supplying probabilities...
out = chi2gof( x, y );
/* returns
    {
        rejected: true,
        alpha: 0.05,
        pValue: ~0.0406,
        df: 4,
        statistic: 9.990143369175627,
        ...
    }
*/

// Supplying expected counts...
x = [ 30, 20, 23, 27 ];
y = [ 25, 25, 25, 25 ];

out = chi2gof( x, y );
/* returns
    {
        rejected: false,
        alpha: 0.05,
        pValue: 0.5087002695252655,
        df: 3,
        statistic: 2.3200000000000003,
        ...
    }
*/
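
To make the reported statistic concrete, the following is a minimal sketch of Pearson's chi-square statistic computed by hand for the first example. The helper name chiSquareStatistic is hypothetical and not part of the package, and the sketch assumes that y holds probabilities; when y holds expected counts, those counts would be used directly in place of total * probs[ i ].

```javascript
// Hypothetical helper (not part of the package): Pearson's chi-square statistic
// for observed counts against category probabilities.
function chiSquareStatistic( observed, probs ) {
    var expected;
    var total = 0;
    var stat = 0.0;
    var i;

    // Total number of observations:
    for ( i = 0; i < observed.length; i++ ) {
        total += observed[ i ];
    }
    // Sum of (observed - expected)^2 / expected over all categories:
    for ( i = 0; i < observed.length; i++ ) {
        expected = total * probs[ i ];
        stat += Math.pow( observed[ i ] - expected, 2 ) / expected;
    }
    return stat;
}

var stat = chiSquareStatistic( [ 89, 37, 30, 28, 2 ], [ 0.40, 0.20, 0.20, 0.15, 0.05 ] );
// stat is approximately 9.9901, in line with the `statistic` field of the first example
```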

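The returned object has a .print() method which, when invoked, returns a formatted string summarizing the results of the hypothesis test.

// Rerun the first example so that the summary matches the output shown here:
out = chi2gof( [ 89, 37, 30, 28, 2 ], [ 0.40, 0.20, 0.20, 0.15, 0.05 ] );
console.log( out.print() );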
/* =>
    Chi-square goodness-of-fit test

    Null hypothesis: population probabilities are equal to those in p

        pValue: 0.0406
        statistic: 9.9901
        degrees of freedom: 4

    Test Decision: Reject null in favor of alternative at 5% significance level
*/

The function accepts the following options:

  • alpha: number in the interval [0,1] giving the significance level of the hypothesis test. Default: 0.05.
  • ddof: "delta degrees of freedom" adjustment. Has to be a nonnegative integer. Default: 0.
  • simulate: boolean indicating whether to compute p-values by Monte Carlo simulation. Default: false.
  • iterations: positive integer specifying the number of Monte Carlo iterations. Default: 500.

By default, the test is carried out at a significance level of 0.05. To choose a custom significance level, set the alpha option.

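var x = [ 89, 37, 30, 28, 2 ];
var p = [ 0.40, 0.20, 0.20, 0.15, 0.05 ];

var out = chi2gof( x, p );
var table = out.print();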
/* e.g., returns

    Chi-square goodness-of-fit test

    Null hypothesis: population probabilities are equal to those in p

        pValue: 0.0406
        statistic: 9.9901
        degrees of freedom: 4

    Test Decision: Reject null in favor of alternative at 5% significance level
*/

out = chi2gof( x, p, {
    'alpha': 0.01
});
table = out.print();
/* e.g., returns

    Chi-square goodness-of-fit test

    Null hypothesis: population probabilities are equal to those in p

        pValue: 0.0406
        statistic: 9.9901
        degrees of freedom: 4

    Test Decision: Fail to reject null in favor of alternative at 1% significance level
*/
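
The effect of alpha on the test decision can be sketched as a simple comparison. The helper isRejected is hypothetical, and the inclusive comparison (pValue <= alpha) is an assumption; the text here does not say how the boundary case is handled.

```javascript
// Hypothetical helper: reject the null hypothesis when the p-value does not
// exceed the significance level (the inclusive boundary is an assumption).
function isRejected( pValue, alpha ) {
    return pValue <= alpha;
}

var pValue = 0.0406; // rounded p-value from the printed output

var at5 = isRejected( pValue, 0.05 ); // true: reject at the 5% level
var at1 = isRejected( pValue, 0.01 ); // false: fail to reject at the 1% level
```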

By default, the p-value is computed using a chi-square distribution with k - 1 degrees of freedom, where k is the number of categories in x. If distribution parameters were estimated from the data in order to calculate y, the degrees of freedom have to be corrected. Set the ddof option to use k - ddof - 1 degrees of freedom.

var x = [ 89, 37, 30, 28, 2 ];
var p = [ 0.40, 0.20, 0.20, 0.15, 0.05 ];

var out = chi2gof( x, p, {
    'ddof': 1
});
// returns { 'pValue': ~0.0186, 'statistic': ~9.9901, 'df': 3, ... }
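
To illustrate how ddof changes the p-value, here is a self-contained sketch that evaluates the chi-square upper tail probability for k - ddof - 1 degrees of freedom. All function names are hypothetical; the series expansion of the regularized incomplete gamma function is one standard way to do this and is not claimed to be what the package uses internally.

```javascript
// Gamma function restricted to positive integers and half-integers,
// which covers a = df / 2 for any positive integer df.
function gammaHalfInteger( a ) {
    var isInt = ( a % 1 === 0 );
    var g = ( isInt ) ? 1.0 : Math.sqrt( Math.PI ); // Gamma(1) or Gamma(1/2)
    var s = ( isInt ) ? 1.0 : 0.5;
    while ( s < a ) {
        g *= s; // Gamma(s+1) = s * Gamma(s)
        s += 1.0;
    }
    return g;
}

// Regularized lower incomplete gamma function P(a, x) via its power series.
function lowerGammaP( a, x ) {
    var term;
    var sum;
    var n;
    if ( x <= 0.0 ) {
        return 0.0;
    }
    term = 1.0 / a;
    sum = term;
    for ( n = 1; n < 500; n++ ) {
        term *= x / ( a + n );
        sum += term;
        if ( term < sum * 1e-15 ) {
            break;
        }
    }
    return sum * Math.exp( -x + ( a * Math.log( x ) ) ) / gammaHalfInteger( a );
}

// Upper tail probability of the chi-square distribution with k - ddof - 1
// degrees of freedom, where k is the number of categories.
function chiSquarePValue( statistic, k, ddof ) {
    var df = k - ddof - 1;
    return 1.0 - lowerGammaP( df / 2.0, statistic / 2.0 );
}

var stat = 9.990143369175627;

var pDefault = chiSquarePValue( stat, 5, 0 );  // df = 4, roughly 0.0406
var pAdjusted = chiSquarePValue( stat, 5, 1 ); // df = 3, roughly 0.0186
```

With one fewer degree of freedom, the same statistic yields a smaller p-value, which is why the correction matters when parameters were estimated from the data.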

Instead of relying on the chi-square approximation when calculating the p-value, Monte Carlo simulation can be used. To do so, set the simulate option. The simulation is carried out by resampling from the discrete distribution given by y. By default, 500 iterations are used for the simulation. To set a custom number of iterations, use the iterations option.

var x = [ 89, 37, 30, 28, 2 ];
var p = [ 0.40, 0.20, 0.20, 0.15, 0.05 ];

var out = chi2gof( x, p, {
    'simulate': true,
    'iterations': 1000
});
// returns {...}
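
The idea behind the simulated p-value can be sketched as follows: draw samples of the same size from the hypothesized distribution, recompute the statistic each time, and count how often it is at least as large as the observed one. All helper names are hypothetical, Math.random stands in for whatever generator the package uses, and the (count + 1) / (iterations + 1) estimator is an assumption rather than the package's documented formula.

```javascript
// Pearson statistic for counts against category probabilities, given n observations.
function pearsonStatistic( counts, probs, n ) {
    var expected;
    var stat = 0.0;
    var i;
    for ( i = 0; i < counts.length; i++ ) {
        expected = n * probs[ i ];
        stat += Math.pow( counts[ i ] - expected, 2 ) / expected;
    }
    return stat;
}

// Draw n observations from the categorical distribution `probs` and tally them.
function sampleCounts( probs, n ) {
    var counts = [];
    var u;
    var c;
    var i;
    var j;
    for ( j = 0; j < probs.length; j++ ) {
        counts.push( 0 );
    }
    for ( i = 0; i < n; i++ ) {
        u = Math.random();
        c = 0.0;
        // Walk the cumulative probabilities; the last category catches the remainder.
        for ( j = 0; j < probs.length - 1; j++ ) {
            c += probs[ j ];
            if ( u < c ) {
                break;
            }
        }
        counts[ j ] += 1;
    }
    return counts;
}

// Monte Carlo estimate of the p-value (estimator choice is an assumption).
function simulatedPValue( observed, probs, iterations ) {
    var obsStat;
    var exceed = 0;
    var n = 0;
    var i;
    for ( i = 0; i < observed.length; i++ ) {
        n += observed[ i ];
    }
    obsStat = pearsonStatistic( observed, probs, n );
    for ( i = 0; i < iterations; i++ ) {
        if ( pearsonStatistic( sampleCounts( probs, n ), probs, n ) >= obsStat ) {
            exceed += 1;
        }
    }
    return ( exceed + 1 ) / ( iterations + 1 );
}

var pSim = simulatedPValue( [ 89, 37, 30, 28, 2 ], [ 0.40, 0.20, 0.20, 0.15, 0.05 ], 1000 );
// pSim varies from run to run because the simulation is random
```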

Notes

  • The chi-square approximation may be incorrect if the observed or expected frequencies in each category are too small. It is common to require frequencies greater than five.

Examples

var poisson = require( '@stdlib/random/base/poisson' );
var chi2gof = require( '@stdlib/math/stats/chi2gof' );

var lambda;
var table;
var rpois;
var freqs;
var len;
var out;
var val;
var i;
var x;

// Draw 400 samples from a Poisson( 3.0 ) distribution:
lambda = 3.0;
rpois = poisson.factory( lambda );

len = 400;
x = new Array( len );
for ( i = 0; i < len; i++ ) {
    x[ i ] = rpois();
}

// Generate frequency table:
freqs = [];
for ( i = 0; i < len; i++ ) {
    val = x[ i ];
    if ( freqs[ val ] === void 0 ) {
        freqs[ val ] = 1;
    } else {
        freqs[ val ] += 1;
    }
}

// Fill holes in array:
for ( i = 0; i < freqs.length; i++ ) {
    if ( freqs[ i ] === void 0 ) {
        freqs[ i ] = 0;
    }
}

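// Test whether the observed frequencies are consistent with a Poisson( 3.0 ) distribution:
out = chi2gof( freqs, 'poisson', lambda );

// Print a formatted summary of the results:
table = out.print();
console.log( table );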