A procedure for choosing between competing models that balances model complexity against the quality of the model's fit to the given data.
For a multiple regression model, one approach makes use of the Mallows $C_p$ statistic, introduced by Mallows in 1964. With $n$ observations and $k$ explanatory variables (see regression), define $s^2$ as the estimate of the experimental error variance obtained from the full model (all $k$ variables). Then, for a model using just $p$ of the $k$ variables,

$$C_p = \frac{1}{s^2}\sum_{j=1}^{n}\left(y_j - \hat{y}_j\right)^2 - (n - 2p),$$

where $y_j$ is an observation and $\hat{y}_j$ is the corresponding fitted value. A model that fits well should have a $C_p$ value close to $p$. An acceptable fit is provided by a model for which

$$C_p \le p + a\left\{F_{a,b}(\alpha) - 1\right\},$$

where $a = k - p + 1$, $b = n - k - 1$, and $F_{a,b}(\alpha)$ is the value exceeded by chance on $100\alpha\%$ of occasions by a random variable having an F-distribution with $a$ and $b$ degrees of freedom. Typically, $\alpha = 0.05$ or $0.01$.
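As an illustration, the following is a minimal sketch, not part of the entry itself: it evaluates $C_p$ over every subset of the explanatory variables and applies the F-based acceptability rule quoted above. The arrays X ($n \times k$) and y and the function names are illustrative assumptions; an intercept is included in every fit, and $p$ counts the variables in the subset, following the entry's formulas as stated.

```python
import itertools

import numpy as np
from scipy.stats import f as f_dist


def rss(X, y):
    """Residual sum of squares of a least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)


def cp_table(X, y, alpha=0.05):
    """Mallows' Cp and the F-based acceptability flag for every subset."""
    n, k = X.shape
    b = n - k - 1
    s2 = rss(X, y) / b                       # s^2 from the full k-variable model
    rows = []
    for p in range(1, k + 1):
        a = k - p + 1
        f_crit = f_dist.ppf(1 - alpha, a, b)  # F_{a,b}(alpha)
        for cols in itertools.combinations(range(k), p):
            cp = rss(X[:, list(cols)], y) / s2 - (n - 2 * p)
            # Acceptable fit: Cp <= p + a*(F_{a,b}(alpha) - 1)
            acceptable = cp <= p + a * (f_crit - 1)
            rows.append((cols, cp, acceptable))
    return rows
```

Subsets flagged acceptable are those whose lack of fit, relative to the full model, is not significant at level $\alpha$.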
A more generally applicable alternative is based on AIC (Akaike's information criterion), proposed by Akaike in 1969. For categorical data this amounts to choosing the model that minimizes $G^2 - 2\nu$, where $G^2$ is the likelihood-ratio goodness-of-fit statistic and $\nu$ is the number of degrees of freedom associated with the model; since simpler models have larger $\nu$, the subtracted term rewards parsimony. If the Bayesian information criterion (BIC) (also called the Schwarz criterion) is used, then the quantity minimized is $G^2 - \nu \ln n$, where $\ln$ denotes the natural logarithm and $n$ is the sample size. Since $\ln n > 2$ for $n \ge 8$, the penalty is more severe, and this usually results in the selection of a simpler model. A third alternative of this type is the Hannan-Quinn criterion, for which the quantity to be minimized is $G^2 - 2\nu \ln(\ln n)$.
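A minimal sketch of this comparison, assuming each candidate model has already been fitted and summarized by its $G^2$ statistic and residual degrees of freedom $\nu$ (the candidate dictionary, its values, and the function name are illustrative assumptions):

```python
import math


def best_model(candidates, n, criterion="AIC"):
    """candidates: {name: (G2, nu)}; n: sample size.

    Returns the candidate name minimizing the chosen penalized statistic."""
    def score(G2, nu):
        if criterion == "AIC":
            return G2 - 2 * nu
        if criterion == "BIC":                        # Schwarz criterion
            return G2 - nu * math.log(n)
        if criterion == "HQ":                         # Hannan-Quinn criterion
            return G2 - 2 * nu * math.log(math.log(n))
        raise ValueError(f"unknown criterion: {criterion}")
    return min(candidates, key=lambda name: score(*candidates[name]))


# Hypothetical two-way table: the saturated model fits perfectly (G2 = 0 on
# nu = 0 df); the independence model has G2 = 3.2 on nu = 4 df.
models = {"saturated": (0.0, 0), "independence": (3.2, 4)}
print(best_model(models, n=100, criterion="AIC"))  # -> independence
print(best_model(models, n=100, criterion="BIC"))  # -> independence
```

Here the saturated model scores $0$ under every criterion, while the independence model scores $3.2 - 8 = -4.8$ under AIC and $3.2 - 4\ln 100 \approx -15.2$ under BIC, so the simpler model is selected in both cases.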
Whatever procedure is used for model selection, it is usually the case that the model fits less well (as measured by $R^2$, the coefficient of determination, see ANOVA) when it is applied to new data. The reduction in fit is described as shrinkage. See also stepwise procedure.
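A minimal sketch of shrinkage on synthetic data (the data-generating model, sizes, and names are all illustrative assumptions): coefficients fitted to one sample typically give a lower $R^2$ when applied to a fresh sample from the same process.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 40, 10
X = rng.normal(size=(n, k))
y = X[:, 0] + rng.normal(size=n)            # only the first variable matters

# Fit the full model (intercept plus all k variables) on the original data.
A = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)


def r_squared(X, y, beta):
    """R^2 of predictions made with the already-fitted coefficients."""
    pred = np.column_stack([np.ones(len(y)), X]) @ beta
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1 - ss_res / ss_tot


# Fresh data from the same process: the R^2 typically shrinks.
X_new = rng.normal(size=(n, k))
y_new = X_new[:, 0] + rng.normal(size=n)
print(r_squared(X, y, beta), r_squared(X_new, y_new, beta))
```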