Methods, making heavy use of computers, that are very useful in the analysis of large arrays of observations of correlated variables, such as satellite images of the Earth. Suppose we wish to calculate θ, the expected value of a function h of r correlated random variables. Let X be the r×1 vector of random variables. A standard Monte Carlo method would generate a sequence of numerical values, X1, X2,…, Xn, from which θ could be estimated by θ̂, given by

$$\hat{\theta} = \frac{1}{n}\sum_{j=1}^{n} h(\mathbf{X}_j).$$

However, it can be difficult to generate a vector Xj when the joint distribution of the r variables involves a complex correlation structure; for example, when the variables refer to the values displayed in an array of neighbouring pixels in a satellite image of the Earth's surface.
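As a concrete illustration, the following is a minimal Python sketch (assuming NumPy) of this standard Monte Carlo estimator, using a bivariate normal distribution for which direct sampling happens to be easy; the function h and all numerical values are hypothetical choices, not from the source.

```python
import numpy as np

rng = np.random.default_rng(0)

def h(x):
    # Arbitrary illustrative function of the r = 2 correlated variables.
    return x[0] * x[1]

mean = np.zeros(2)
cov = np.array([[1.0, 0.8],
                [0.8, 1.0]])   # correlated components

n = 100_000
draws = rng.multivariate_normal(mean, cov, size=n)   # X_1, ..., X_n
theta_hat = np.mean([h(x) for x in draws])           # (1/n) * sum_j h(X_j)
print(theta_hat)   # close to E[X1 X2] = 0.8 for this particular target
```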
Let Xjk be the kth random variable in Xj. MCMC methods aim to generate r Markov chains (see Markov process), where the kth chain is the sequence of values x1k, x2k,…. An arbitrary set of values is chosen for X1, subsequent vectors being generated from their predecessors by Monte Carlo methods. In calculating θ̂ it is necessary to ignore the early vectors in the sequence, since these will be affected by the initial choice of values; the portion of the sequence that is discarded is called the burn-in period.
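A small sketch of the burn-in convention, with hypothetical names: given the stored output of a chain, the first burn_in vectors are discarded before averaging.

```python
import numpy as np

def mcmc_estimate(h, samples, burn_in):
    """Estimate theta = E[h(X)] from MCMC output, discarding burn-in.

    samples: sequence of vectors X_1, X_2, ... produced by the chain;
    burn_in: number of initial vectors to ignore, since they still
             reflect the arbitrary starting values.
    """
    kept = samples[burn_in:]
    return np.mean([h(x) for x in kept])
```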
A popular method for obtaining the required Markov chains, which need to be stationary (see Markov process), is the Metropolis–Hastings algorithm. Let the stationary probability for the random variable Xj being in state m be $p_m$ and let Q be a known matrix with non-negative elements $q_{jk}$. The transition probability matrix governing the sequence X1, X2,… is defined by

$$p_{jk} = \begin{cases} q_{jk}\,a_{jk}, & k \neq j,\\ c_j, & k = j, \end{cases}$$

where $a_{jk}$ is given by

$$a_{jk} = \min\left\{1,\ \frac{p_k\,q_{kj}}{p_j\,q_{jk}}\right\}$$

and each $c_j$ is chosen so that $\sum_k p_{jk} = 1$.
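The following Python sketch runs this Metropolis–Hastings recipe on a small finite state space; the target probabilities p and the proposal matrix Q are arbitrary illustrative choices, not from the source.

```python
import numpy as np

rng = np.random.default_rng(1)

# Target stationary probabilities p_m over 4 states (arbitrary values).
p = np.array([0.1, 0.2, 0.3, 0.4])

# Proposal matrix Q with non-negative elements: a symmetric random-walk
# proposal that never proposes staying in the current state.
Q = np.full((4, 4), 1/3)
np.fill_diagonal(Q, 0.0)

def mh_step(j):
    """One Metropolis-Hastings transition from state j."""
    k = rng.choice(4, p=Q[j])                            # propose k with prob q_jk
    a = min(1.0, (p[k] * Q[k, j]) / (p[j] * Q[j, k]))    # acceptance prob a_jk
    return k if rng.random() < a else j                  # reject: stay at j (the c_j term)

# Run the chain; the empirical distribution approaches p.
state, counts = 0, np.zeros(4)
for t in range(200_000):
    state = mh_step(state)
    if t >= 10_000:                                      # discard burn-in
        counts[state] += 1
print(counts / counts.sum())                             # roughly [0.1, 0.2, 0.3, 0.4]
```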
The most widely used version of the Metropolis–Hastings algorithm incorporates the Gibbs sampler. The aim is to generate a random vector X with elements satisfying a specified relationship. Denote the initial values in X by x1(0), x2(0),…. A single element of the vector (element j, say) is chosen at random, and a potential new value, x, is generated by random selection from the conditional distribution of Xj given the values of the remaining variables, {Xk, k≠j}. If the new value is in accord with the specified relationship, then it replaces the previous value, so that xj(1)=x; otherwise the previous value is retained: xj(1)=xj(0). This process is repeated until approximate equilibrium has been reached.
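A minimal random-scan Gibbs sketch for a bivariate normal target, a standard textbook case chosen here only because its conditional distributions are known in closed form; in this setting a draw from the exact conditional is always retained. All names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
rho = 0.8   # correlation of the illustrative bivariate normal target

def gibbs(n_iter, burn_in=1_000):
    """Random-scan Gibbs: pick an element j at random and redraw it
    from its conditional given the other; here
    X_j | X_k = x  ~  N(rho * x, 1 - rho**2)."""
    x = np.zeros(2)                       # arbitrary starting values x(0)
    kept = []
    for t in range(n_iter):
        j = rng.integers(2)               # choose element j at random
        x[j] = rng.normal(rho * x[1 - j], np.sqrt(1 - rho**2))
        if t >= burn_in:
            kept.append(x.copy())
    return np.array(kept)

samples = gibbs(50_000)
print(np.corrcoef(samples.T)[0, 1])       # approaches rho = 0.8 at equilibrium
```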
The procedure was originally proposed in the context of statistical mechanics by Metropolis and others in 1953, and was introduced into statistics by Hastings in 1970.