A technique, originally developed by R. A. Fisher, whereby the total variation in a vector of numbers $y_1, \dots, y_n$, expressed as the sum of squares about the mean, $\sum_i (y_i - \bar{y})^2$, is split up into component sums of squares ascribable to the effects of various classifying factors that may index the data. Thus if the data consist of a two-way $m \times n$ array, classified by factors A and B and indexed by $i$ ($i = 1, \dots, m$) and $j$ ($j = 1, \dots, n$), then the analysis of variance gives the identity
$$\sum_{i=1}^{m}\sum_{j=1}^{n}(y_{ij}-\bar{y}_{..})^2 = n\sum_{i=1}^{m}(\bar{y}_{i.}-\bar{y}_{..})^2 + m\sum_{j=1}^{n}(\bar{y}_{.j}-\bar{y}_{..})^2 + \sum_{i=1}^{m}\sum_{j=1}^{n}(y_{ij}-\bar{y}_{i.}-\bar{y}_{.j}+\bar{y}_{..})^2,$$
where $\bar{y}_{i.}$, $\bar{y}_{.j}$, and $\bar{y}_{..}$ denote the row, column, and overall means; the three terms on the right are the sums of squares for factor A, for factor B, and for residual, respectively.
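The identity above can be checked numerically; the following sketch (hypothetical data, numpy only, all variable names are illustrative) computes each component sum of squares for a small two-way array and verifies that they add up to the total:

```python
import numpy as np

# Hypothetical 3x4 two-way array y_ij: rows = levels of factor A,
# columns = levels of factor B.
rng = np.random.default_rng(0)
y = rng.normal(size=(3, 4))
m, n = y.shape

grand = y.mean()          # overall mean, y-bar..
row = y.mean(axis=1)      # row means, y-bar i.
col = y.mean(axis=0)      # column means, y-bar .j

ss_total = ((y - grand) ** 2).sum()
ss_a = n * ((row - grand) ** 2).sum()   # sum of squares for factor A
ss_b = m * ((col - grand) ** 2).sum()   # sum of squares for factor B
resid = y - row[:, None] - col[None, :] + grand
ss_resid = (resid ** 2).sum()           # residual sum of squares

# The analysis-of-variance identity: total SS = SS(A) + SS(B) + SS(residual)
assert np.isclose(ss_total, ss_a + ss_b + ss_resid)
```

The assertion holds for any array, not just this one: it is an algebraic identity, not a property of the data.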
Geometrically, the analysis of variance becomes the successive projections of the vector y, considered as a point in n-dimensional space, onto orthogonal hyperplanes within that space. The dimensions of the hyperplanes give the degrees of freedom for each term; in the above example these are $m-1$ for A, $n-1$ for B, and $(m-1)(n-1)$ for the residual, which together with 1 for the overall mean sum to the total dimension $mn$.
A statistical model applied to the data allows mean squares, equal to (sum of squares)/(degrees of freedom), to be compared with an error mean square that measures the background ‘noise’. Large mean squares indicate large effects of the factors concerned. The above processes can be much elaborated (see experimental design, regression analysis).
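Continuing the two-way example, the comparison of mean squares can be sketched as follows (hypothetical data with a deliberately injected row effect; the variable names are illustrative, and the residual mean square plays the role of the error mean square):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 4, 5
# Inject a genuine row (factor A) effect so its mean square stands out
# against the background noise.
y = rng.normal(size=(m, n)) + 2.0 * np.arange(m)[:, None]

grand, row, col = y.mean(), y.mean(axis=1), y.mean(axis=0)
ss_a = n * ((row - grand) ** 2).sum()
ss_b = m * ((col - grand) ** 2).sum()
ss_e = ((y - row[:, None] - col[None, :] + grand) ** 2).sum()

# Mean square = (sum of squares) / (degrees of freedom)
ms_a = ss_a / (m - 1)
ms_b = ss_b / (n - 1)
ms_e = ss_e / ((m - 1) * (n - 1))   # error mean square: background noise

# Ratios of factor mean squares to the error mean square (F ratios);
# a large ratio indicates a large effect of the factor concerned.
f_a = ms_a / ms_e
f_b = ms_b / ms_e
```

Here `f_a` is large because a real row effect was built into the data, while `f_b` stays near 1, reflecting the absence of any column effect.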