“cluster analysis”: concept, definition, translation, and references - Science Reference

cluster analysis

Mathematics

The task of assigning objects to recognizable groups called clusters, according to various measurements. These clusters commonly show correlation between different attributes. The notion of a cluster cannot be defined precisely, and many different algorithms are hence used in cluster analysis.

Statistics

A method for identifying data items that closely resemble one another, assembling them into clusters. A number of characteristics are measured for each of several items (which might be, for example, people, plants, machines, etc.). The process of formation of the clusters is often represented using a dendrogram. The most commonly used methods are the agglomerative clustering methods.

Computer

Any statistical technique for grouping a set of units into clusters of similar units on the basis of observed qualitative and/or quantitative measurements, usually on several variables. Cluster analysis aims to satisfy two conditions simultaneously: units in the same cluster should be similar, and units in different clusters should be dissimilar. It is not usually possible to satisfy both conditions fully, and no single method can be recommended as best for all sets of data. Another desirable property of clusters is that some variables should be constant for all units within a cluster, which makes it possible to provide a simple scheme for identifying units in terms of clusters.
Most cluster analysis methods require a similarity or distance measure to be defined between each pair of units, so that the units similar to a given unit may be identified. Similarity measures have been proposed for both quantitative (continuous) variables and qualitative (discrete) variables, using a weighted mean of similarity scores over all variables considered. The term distance comes from a geometric representation of data as points in multidimensional space: small distances correspond to large similarities.
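As a rough illustration of the weighted-mean idea, the Python sketch below scores each variable between 0 and 1 and averages the scores. The scoring rules (range-scaled absolute difference for quantitative variables, exact match for qualitative ones), the function name `similarity`, and the sample units are assumptions made for illustration; they are one common choice, not the only one.

```python
def similarity(a, b, ranges, weights=None):
    """Weighted mean of per-variable similarity scores between two units.

    ranges[k] is the observed range of variable k if it is quantitative,
    or None if the variable is qualitative (assumed convention).
    """
    if weights is None:
        weights = [1.0] * len(a)  # equal weights unless stated otherwise
    total = 0.0
    for x, y, r, w in zip(a, b, ranges, weights):
        if r is None:
            # Qualitative variable: score 1 on a match, 0 otherwise.
            score = 1.0 if x == y else 0.0
        else:
            # Quantitative variable: 1 minus the range-scaled difference.
            score = 1.0 - abs(x - y) / r
        total += w * score
    return total / sum(weights)


# Hypothetical units measured on two quantitative and one qualitative variable.
units = {
    "A": (1.0, 10.0, "red"),
    "B": (2.0, 20.0, "red"),
    "C": (5.0, 30.0, "blue"),
}
ranges = [4.0, 20.0, None]  # ranges of the two quantitative variables

s_ab = similarity(units["A"], units["B"], ranges)
s_ac = similarity(units["A"], units["C"], ranges)
d_ab = 1.0 - s_ab  # a distance derived from the similarity
print(s_ab, s_ac, d_ab)
```

Converting a similarity `s` to a distance as `1 - s` reflects the statement that small distances correspond to large similarities.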
Hierarchical cluster analysis methods form clusters in sequence, either by amalgamation of units into clusters and clusters into larger clusters, or by subdivision of clusters into smaller clusters and single units. Whichever direction is chosen, the results can be represented by a dendrogram or family tree in which the units at one level are nested within units at all higher levels.
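The amalgamation direction can be sketched in a few lines of Python. This is a minimal, unoptimized sketch that assumes Euclidean distance and single linkage (the distance between two clusters is the smallest distance between their members); the function name `agglomerate` and the sample points are hypothetical.

```python
import math


def agglomerate(points):
    """Single-linkage agglomerative clustering.

    Returns the merges in order as (cluster_a, cluster_b, distance).
    Original units have ids 0..n-1; each merge creates the next id.
    """
    clusters = {i: [i] for i in range(len(points))}
    merges = []
    next_id = len(points)
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        # Find the closest pair of current clusters.
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                d = min(
                    math.dist(points[p], points[q])
                    for p in clusters[a]
                    for q in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        # Amalgamate the pair into a new, larger cluster.
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, d))
        next_id += 1
    return merges


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 2.0)]
merges = agglomerate(points)
for a, b, d in merges:
    print(f"merge {a} and {b} at distance {d:.2f}")
```

Read in order, the merges give the levels of the dendrogram: each later cluster contains the earlier clusters it was built from, which is the nesting described above.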
Nonhierarchical cluster analysis methods allocate units to a fixed number of clusters so as to optimize some criterion representing a desired property of clusters. Such methods may be iterative, involving transfer of units between clusters until no further improvement can be achieved. The solution for a given number of clusters need bear little relation to the solution for a larger or smaller number.
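One widely used iterative method of this kind is k-means, sketched below in Python as an example rather than as the method the entry has in mind. The criterion assumed here is the within-cluster sum of squared distances, the initial centres are simply the first `k` points, and the names `kmeans` and `within_ss` are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(points, k, max_iter=100):
    """Allocate points to k clusters by iterative reassignment."""
    centers = [points[i] for i in range(k)]  # assumption: first k points as seeds
    labels = None
    for _ in range(max_iter):
        # Transfer each unit to the cluster with the nearest centre.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break  # no unit changed cluster, so no further improvement
        labels = new_labels
        # Recompute each centre as the mean of its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return labels, centers


def within_ss(points, labels, centers):
    """Criterion being reduced: within-cluster sum of squares."""
    return sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))


points = [(1.0,), (2.0,), (3.0,), (10.0,), (11.0,), (12.0,)]
labels, centers = kmeans(points, k=2)
wss = within_ss(points, labels, centers)
print(labels, centers, wss)
```

The result depends on the starting centres, and the procedure can stop at a solution that is only locally best, so in practice it is usually run from several starting points.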
Cluster analysis is often used in conjunction with other methods of multivariate analysis to describe the structure of a complex set of data.

Electronics and Electrical Engineering

Techniques for grouping a collection of entities into clusters, so that entities in the same cluster are more similar to each other on some dimension of interest than they are to those in different clusters. Cluster analysis is often used for initial exploration of large datasets in order to determine the most appropriate notion of similarity or closeness for use in further analysis.

Geology and Earth Sciences

In statistics, the classification of observations into subsets based on a criterion of similarity.

Geography

The assignment of a set of objects into groups so that the objects in the same cluster are more similar (in some sense or another) to each other than to those in other clusters. Cluster analysis is used when the researcher does not know the number of groups in advance but wishes to establish groups and then analyse group membership. For example, if the term ‘geomorphological processes’ is entered into a search engine, there will be nearly 400,000 results. Careful study reveals that these could be clustered into (at least): fluvial, aeolian, hillslope, glacial, tectonic, igneous, and biological processes. Kent (2006, PPG 30, 3) reviews recent changing patterns in the use of cluster analysis, and Mohseni Saravi et al. (2010, PPG 34, 2) use cluster analysis to delimit homogeneous hydrological regions. For similarity analysis/minimum variance, see Ward (1963) Am. Stat. Ass. J. 5; for unweighted pair group average, see Williams et al. (1966) J. Ecol. 54. For ordination techniques/detrended correspondence analysis, see M. O. Hill (1979); for non-metric multidimensional scaling, see T. Cox and M. Cox (2000). The CANOCO software package carries out most multivariate techniques.

Economics

The general name for a number of different methods for grouping objects that have similar characteristics into sets or ‘clusters’. Cluster analysis is used to explore data by sorting different objects into sets so that the degree of association between two objects is maximal if they belong to the same set. It can be used to discover structures in data but provides no explanation for the structure.
