Anscombe's Quartet
Anscombe's quartet is a set of 4 datasets that have nearly identical simple statistical properties but look very different when graphed. Anscombe created the datasets to demonstrate why graphical data exploration should precede statistical data analysis and to show the effect of outliers on statistical properties.
Usage
var data = require( '@stdlib/datasets/anscombes-quartet' );
data()
Returns Anscombe's quartet, which comprises 4 individual datasets, where each dataset is an array of [x,y] tuples.
var d = data();
/* returns
[
[
[ 10, 8.04 ],
[ 8, 6.95 ],
[ 13, 7.58 ],
[ 9, 8.81 ],
[ 11, 8.33 ],
[ 14, 9.96 ],
[ 6, 7.24 ],
[ 4, 4.26 ],
[ 12, 10.84 ],
[ 7, 4.82 ],
[ 5, 5.68 ]
],
[
[ 10, 9.14 ],
[ 8, 8.14 ],
[ 13, 8.74 ],
[ 9, 8.77 ],
[ 11, 9.26 ],
[ 14, 8.1 ],
[ 6, 6.13 ],
[ 4, 3.1 ],
[ 12, 9.13 ],
[ 7, 7.26 ],
[ 5, 4.74 ]
],
[
[ 10, 7.46 ],
[ 8, 6.77 ],
[ 13, 12.74 ],
[ 9, 7.11 ],
[ 11, 7.81 ],
[ 14, 8.84 ],
[ 6, 6.08 ],
[ 4, 5.39 ],
[ 12, 8.15 ],
[ 7, 6.42 ],
[ 5, 5.73 ]
],
[
[ 8, 6.58 ],
[ 8, 5.76 ],
[ 8, 7.71 ],
[ 8, 8.84 ],
[ 8, 8.47 ],
[ 8, 7.04 ],
[ 8, 5.25 ],
[ 19, 12.5 ],
[ 8, 5.56 ],
[ 8, 7.91 ],
[ 8, 6.89 ]
]
]
*/
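Because each dataset is an array of [x,y] tuples, plotting or analysis code often needs the x and y values as separate columns. The following is a minimal sketch of one way to do that; `unzip` is a hypothetical helper, not part of this package, and the sample is a short excerpt inlined so the snippet runs on its own.

```javascript
// Hypothetical helper (not part of the package): split [x,y] tuples into two columns.
function unzip( pairs ) {
    var x = [];
    var y = [];
    var i;
    for ( i = 0; i < pairs.length; i++ ) {
        x.push( pairs[ i ][ 0 ] );
        y.push( pairs[ i ][ 1 ] );
    }
    return {
        'x': x,
        'y': y
    };
}

// First three tuples of the fourth dataset, copied from the return value shown above:
var sample = [ [ 8, 6.58 ], [ 8, 5.76 ], [ 8, 7.71 ] ];

var cols = unzip( sample );
console.log( cols.x );
console.log( cols.y );
```

With the package installed, the same helper could presumably be applied to any element of the array returned by `data()`, for example `unzip( data()[ 3 ] )`.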
Examples
var data = require( '@stdlib/datasets/anscombes-quartet' );
console.log( data() );
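To illustrate the claim that the four datasets share nearly identical summary statistics, the sketch below computes the mean, sample variance, and Pearson correlation for each dataset. `summarize` is a hypothetical helper written for this example, and the data is inlined (copied from the return value documented above) only so that the snippet runs without the package installed.

```javascript
// Inlined copy of the quartet so that the sketch runs standalone.
// With the package installed, the equivalent would be:
//   var quartet = require( '@stdlib/datasets/anscombes-quartet' )();
var quartet = [
    [ [10,8.04], [8,6.95], [13,7.58], [9,8.81], [11,8.33], [14,9.96], [6,7.24], [4,4.26], [12,10.84], [7,4.82], [5,5.68] ],
    [ [10,9.14], [8,8.14], [13,8.74], [9,8.77], [11,9.26], [14,8.1], [6,6.13], [4,3.1], [12,9.13], [7,7.26], [5,4.74] ],
    [ [10,7.46], [8,6.77], [13,12.74], [9,7.11], [11,7.81], [14,8.84], [6,6.08], [4,5.39], [12,8.15], [7,6.42], [5,5.73] ],
    [ [8,6.58], [8,5.76], [8,7.71], [8,8.84], [8,8.47], [8,7.04], [8,5.25], [19,12.5], [8,5.56], [8,7.91], [8,6.89] ]
];

// Hypothetical helper: means, sample variances, and Pearson correlation of [x,y] tuples.
function summarize( pairs ) {
    var n = pairs.length;
    var mx = 0;
    var my = 0;
    var sxx = 0;
    var syy = 0;
    var sxy = 0;
    var dx;
    var dy;
    var i;

    // First pass: compute the means.
    for ( i = 0; i < n; i++ ) {
        mx += pairs[ i ][ 0 ];
        my += pairs[ i ][ 1 ];
    }
    mx /= n;
    my /= n;

    // Second pass: accumulate sums of squared deviations and cross-products.
    for ( i = 0; i < n; i++ ) {
        dx = pairs[ i ][ 0 ] - mx;
        dy = pairs[ i ][ 1 ] - my;
        sxx += dx * dx;
        syy += dy * dy;
        sxy += dx * dy;
    }
    return {
        'meanX': mx,
        'meanY': my,
        'varX': sxx / ( n-1 ),
        'varY': syy / ( n-1 ),
        'corr': sxy / Math.sqrt( sxx * syy )
    };
}

var stats = quartet.map( summarize );
stats.forEach( function print( s, idx ) {
    console.log( 'dataset ' + ( idx+1 ) + ': ' + JSON.stringify( s ) );
});
```

For every dataset, the x mean is 9 and the sample variance of x is 11, while the y mean is about 7.5 and the correlation is about 0.816, even though the fourth dataset, for instance, has ten identical x values and a single outlier.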
CLI
Usage
Usage: anscombes-quartet [options]

Options:

  -h, --help       Print this message.
  -V, --version    Print the package version.
Notes
- Data is written to stdout as comma-separated values (CSV), where the first line is a header line.
Examples
$ anscombes-quartet
x1,y1,x2,y2,x3,y3,x4,y4
10,8.04,10,9.14,10,7.46,8,6.58
8,6.95,8,8.14,8,6.77,8,5.76
13,7.58,13,8.74,13,12.74,8,7.71
9,8.81,9,8.77,9,7.11,8,8.84
11,8.33,11,9.26,11,7.81,8,8.47
14,9.96,14,8.1,14,8.84,8,7.04
6,7.24,6,6.13,6,6.08,8,5.25
4,4.26,4,3.1,4,5.39,19,12.5
12,10.84,12,9.13,12,8.15,8,5.56
7,4.82,7,7.26,7,6.42,8,7.91
5,5.68,5,4.74,5,5.73,8,6.89
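The CSV layout interleaves the four datasets column-wise (x1,y1 through x4,y4). As a rough sketch of how that output could be mapped back to the nested-array structure returned by the JavaScript API, the following uses a hypothetical `parseQuartetCSV` helper (not part of the package) and assumes plain, unquoted numeric fields as shown above. Only the header and the first two rows are inlined here.

```javascript
// Hypothetical helper: convert the CLI's CSV text into four arrays of [x,y] tuples.
// Assumes a single header line and eight unquoted numeric fields per row.
function parseQuartetCSV( csv ) {
    var lines = csv.trim().split( /\r?\n/ );
    var out = [ [], [], [], [] ];
    var fields;
    var i;
    var j;

    // Start at index 1 to skip the header line:
    for ( i = 1; i < lines.length; i++ ) {
        fields = lines[ i ].split( ',' );
        for ( j = 0; j < 4; j++ ) {
            out[ j ].push([
                parseFloat( fields[ 2*j ] ),
                parseFloat( fields[ (2*j)+1 ] )
            ]);
        }
    }
    return out;
}

// Header plus the first two data rows from the CLI example:
var csv = [
    'x1,y1,x2,y2,x3,y3,x4,y4',
    '10,8.04,10,9.14,10,7.46,8,6.58',
    '8,6.95,8,8.14,8,6.77,8,5.76'
].join( '\n' );

var parsed = parseQuartetCSV( csv );
console.log( JSON.stringify( parsed ) );
```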
References
- Anscombe, Francis J. 1973. "Graphs in Statistical Analysis." The American Statistician 27 (1). [American Statistical Association, Taylor & Francis, Ltd.]: 17–21. http://www.jstor.org/stable/2682899.
License
The data files (databases) are licensed under an Open Data Commons Public Domain Dedication & License 1.0 and their contents are licensed under Creative Commons Zero v1.0 Universal. The software is licensed under Apache License, Version 2.0.