Artificial data created by Anscombe to illustrate the necessity for studying residuals. Each of the four data sets has the same fitted regression line, y=3+0.5x, and the same summary ANOVA table (with the same regression sum of squares, total sum of squares, and value for R2).
x | y | x | y | x | y | x | y |
---|
10 | 8.04 | 10 | 9.14 | 10 | 7.46 | 8 | 6.58 |
8 | 6.95 | 8 | 8.14 | 8 | 6.77 | 8 | 5.76 |
13 | 7.58 | 13 | 8.74 | 13 | 12.74 | 8 | 7.71 |
9 | 8.81 | 9 | 8.77 | 9 | 7.11 | 8 | 8.84 |
11 | 8.33 | 11 | 9.26 | 11 | 7.81 | 8 | 8.47 |
14 | 9.96 | 14 | 8.10 | 14 | 8.84 | 8 | 7.04 |
6 | 7.24 | 6 | 6.13 | 6 | 6.08 | 8 | 5.25 |
4 | 4.26 | 4 | 3.10 | 4 | 5.39 | 19 | 12.50 |
12 | 10.84 | 12 | 9.13 | 12 | 8.15 | 8 | 5.56 |
7 | 4.82 | 7 | 7.26 | 7 | 6.42 | 8 | 7.91 |
5 | 5.68 | 5 | 4.74 | 5 | 5.73 | 8 | 6.89 |