The correlation between two random variables is a measure of the extent to which a change in one tends to correspond to a change in the other. The correlation is high or low depending on whether the relationship between the two is close or not. If a change in one corresponds to a change in the other in the same direction, there is positive correlation; if the changes are in opposite directions, there is negative correlation. Independent random variables have zero correlation. One measure of correlation between the random variables X and Y is the correlation coefficient ρ defined by
$$\rho = \frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var}(X)\,\operatorname{Var}(Y)}}$$
(see covariance, variance). This satisfies −1 ≤ ρ ≤ 1. If X and Y are linearly related, then ρ = −1 or +1.
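The claim about linearly related variables can be checked with a short worked derivation. This is a sketch under stated assumptions: ρ is taken to be the covariance divided by the square root of the product of the variances, the linear relation is written as Y = aX + b with a ≠ 0, and Var(X) > 0. The constants a and b are introduced here only for illustration.

```latex
\begin{aligned}
\operatorname{Cov}(X, Y) &= \operatorname{Cov}(X, aX + b) = a\,\operatorname{Var}(X), \\
\operatorname{Var}(Y) &= \operatorname{Var}(aX + b) = a^{2}\,\operatorname{Var}(X), \\
\rho &= \frac{a\,\operatorname{Var}(X)}{\sqrt{\operatorname{Var}(X)\cdot a^{2}\,\operatorname{Var}(X)}}
      = \frac{a}{|a|}
      = \begin{cases} +1 & \text{if } a > 0, \\ -1 & \text{if } a < 0. \end{cases}
\end{aligned}
```

So the sign of ρ matches the sign of the slope: an increasing linear relation gives +1 and a decreasing one gives −1.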
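For a sample of n paired observations (x₁, y₁), (x₂, y₂), …, (xₙ, yₙ), the (sample) correlation coefficient is equal to
$$\frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^{2} \, \sum_{i=1}^{n} (y_i - \bar{y})^{2}}}$$
where x̄ and ȳ are the sample means of the xᵢ and the yᵢ.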
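As an illustration, the sample correlation coefficient can be computed in a few lines of Python. This is a minimal sketch, assuming the usual product-moment form: the sum of products of deviations from the sample means, divided by the square root of the product of the two sums of squared deviations. The function name `sample_correlation` and the example data are hypothetical, chosen only for this illustration.

```python
import math


def sample_correlation(xs, ys):
    """Sample correlation coefficient of paired observations (xs[i], ys[i])."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    n = len(xs)
    x_bar = sum(xs) / n  # sample mean of the x values
    y_bar = sum(ys) / n  # sample mean of the y values
    # Sum of products of deviations, and the two sums of squared deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined when a sample has no variation")
    return s_xy / math.sqrt(s_xx * s_yy)


# Illustrative data only (not taken from any real study).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
print(sample_correlation(xs, [2 * x + 1 for x in xs]))     # exactly linear, increasing
print(sample_correlation(xs, [-3 * x + 7 for x in xs]))    # exactly linear, decreasing
print(sample_correlation(xs, [2.0, 1.0, 4.0, 3.0, 5.0]))   # positive but imperfect relation
```

The first two calls use exactly linear data, so the results should be +1 and −1 up to floating-point rounding; the third gives a value strictly between 0 and 1. The explicit check for zero variation avoids a division by zero when all values in one sample are equal.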
Note that the existence of some correlation between two variables need not imply causation.