Anscombe's Quartet

Anscombe's quartet.

Anscombe's quartet is a set of 4 datasets that all have nearly identical simple statistical properties but look very different when graphed. Anscombe created the datasets to demonstrate why graphical data exploration should precede statistical data analysis and to show the effect of outliers on statistical properties.

Usage

var data = require( '@stdlib/datasets/anscombes-quartet' );

data()

Returns Anscombe's quartet, which consists of 4 individual datasets, each of which is an array of [x,y] tuples.

var d = data();
/* returns
    [
        [
            [ 10, 8.04 ],
            [ 8, 6.95 ],
            [ 13, 7.58 ],
            [ 9, 8.81 ],
            [ 11, 8.33 ],
            [ 14, 9.96 ],
            [ 6, 7.24 ],
            [ 4, 4.26 ],
            [ 12, 10.84 ],
            [ 7, 4.82 ],
            [ 5, 5.68 ]
        ],
        [
            [ 10, 9.14 ],
            [ 8, 8.14 ],
            [ 13, 8.74 ],
            [ 9, 8.77 ],
            [ 11, 9.26 ],
            [ 14, 8.1 ],
            [ 6, 6.13 ],
            [ 4, 3.1 ],
            [ 12, 9.13 ],
            [ 7, 7.26 ],
            [ 5, 4.74 ]
        ],
        [
            [ 10, 7.46 ],
            [ 8, 6.77 ],
            [ 13, 12.74 ],
            [ 9, 7.11 ],
            [ 11, 7.81 ],
            [ 14, 8.84 ],
            [ 6, 6.08 ],
            [ 4, 5.39 ],
            [ 12, 8.15 ],
            [ 7, 6.42 ],
            [ 5, 5.73 ]
        ],
        [
            [ 8, 6.58 ],
            [ 8, 5.76 ],
            [ 8, 7.71 ],
            [ 8, 8.84 ],
            [ 8, 8.47 ],
            [ 8, 7.04 ],
            [ 8, 5.25 ],
            [ 19, 12.5 ],
            [ 8, 5.56 ],
            [ 8, 7.91 ],
            [ 8, 6.89 ]
        ]
    ]
*/

Examples

var data = require( '@stdlib/datasets/anscombes-quartet' );

console.log( data() );
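
To see the "nearly identical" statistical properties numerically, one could compute a few summary statistics for each dataset. The following is a minimal sketch, not part of the package: `summarize` is a hypothetical helper, and the first dataset is copied inline from the output above so that the snippet runs without any dependencies.

```javascript
// First dataset of the quartet, copied from the output above. With the package
// installed, the same array would be `data()[ 0 ]`.
var first = [
    [ 10, 8.04 ], [ 8, 6.95 ], [ 13, 7.58 ], [ 9, 8.81 ],
    [ 11, 8.33 ], [ 14, 9.96 ], [ 6, 7.24 ], [ 4, 4.26 ],
    [ 12, 10.84 ], [ 7, 4.82 ], [ 5, 5.68 ]
];

// Hypothetical helper (not part of the package): computes the means, sample
// variances, and Pearson correlation for an array of [x,y] tuples.
function summarize( tuples ) {
    var n = tuples.length;
    var mx = 0.0;
    var my = 0.0;
    var sxx = 0.0;
    var syy = 0.0;
    var sxy = 0.0;
    var dx;
    var dy;
    var i;

    // First pass: compute the means.
    for ( i = 0; i < n; i++ ) {
        mx += tuples[ i ][ 0 ];
        my += tuples[ i ][ 1 ];
    }
    mx /= n;
    my /= n;

    // Second pass: accumulate sums of squared deviations and cross-products.
    for ( i = 0; i < n; i++ ) {
        dx = tuples[ i ][ 0 ] - mx;
        dy = tuples[ i ][ 1 ] - my;
        sxx += dx * dx;
        syy += dy * dy;
        sxy += dx * dy;
    }
    return {
        'meanX': mx,
        'meanY': my,
        'varX': sxx / ( n-1 ),
        'varY': syy / ( n-1 ),
        'corr': sxy / Math.sqrt( sxx * syy )
    };
}

var stats = summarize( first );
console.log( stats );
```

Applying the same helper to each element of the array returned by `data()` should yield closely matching values, even though the datasets look very different when plotted.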

CLI

Usage

Usage: anscombes-quartet [options]

Options:

  -h,    --help                Print this message.
  -V,    --version             Print the package version.

Notes

  • Data is written to stdout as comma-separated values (CSV), where the first line is a header line.

Examples

$ anscombes-quartet
x1,y1,x2,y2,x3,y3,x4,y4
10,8.04,10,9.14,10,7.46,8,6.58
8,6.95,8,8.14,8,6.77,8,5.76
13,7.58,13,8.74,13,12.74,8,7.71
9,8.81,9,8.77,9,7.11,8,8.84
11,8.33,11,9.26,11,7.81,8,8.47
14,9.96,14,8.1,14,8.84,8,7.04
6,7.24,6,6.13,6,6.08,8,5.25
4,4.26,4,3.1,4,5.39,19,12.5
12,10.84,12,9.13,12,8.15,8,5.56
7,4.82,7,7.26,7,6.42,8,7.91
5,5.68,5,4.74,5,5.73,8,6.89
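
If the CSV output is captured as text (for example, redirected to a file and read back), it can be turned into the same nested-array shape that `data()` returns. The following is a rough sketch under that assumption: `parseQuartet` is a hypothetical helper, not part of the package, and it relies on the column order `x1,y1,...,x4,y4` shown in the header line above.

```javascript
// Hypothetical helper (not part of the package): converts the CLI's CSV output
// into four arrays of [x,y] tuples, assuming the column order x1,y1,...,x4,y4.
function parseQuartet( csv ) {
    var lines = csv.trim().split( /\r?\n/ );
    var out = [ [], [], [], [] ];
    var fields;
    var i;
    var j;

    // Start at index 1 to skip the header line.
    for ( i = 1; i < lines.length; i++ ) {
        fields = lines[ i ].split( ',' ).map( Number );
        for ( j = 0; j < 4; j++ ) {
            out[ j ].push( [ fields[ 2*j ], fields[ (2*j)+1 ] ] );
        }
    }
    return out;
}

// Header plus the first two data rows of the output shown above:
var sample = 'x1,y1,x2,y2,x3,y3,x4,y4\n' +
    '10,8.04,10,9.14,10,7.46,8,6.58\n' +
    '8,6.95,8,8.14,8,6.77,8,5.76\n';

var parsed = parseQuartet( sample );
console.log( parsed[ 3 ] );
// => [ [ 8, 6.58 ], [ 8, 5.76 ] ]
```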

References

  • Anscombe, Francis J. 1973. "Graphs in Statistical Analysis." The American Statistician 27 (1). [American Statistical Association, Taylor & Francis, Ltd.]: 17–21. http://www.jstor.org/stable/2682899.

License

The data files (databases) are licensed under an Open Data Commons Public Domain Dedication & License 1.0 and their contents are licensed under Creative Commons Zero v1.0 Universal. The software is licensed under Apache License, Version 2.0.
