Co-Clustering : Models, Algorithms and Applications.
Material type: Text
Series: Computer Engineering Series
Publisher: Somerset : John Wiley & Sons, Incorporated, 2013
Copyright date: ©2013
Edition: 1st ed.
Description: 1 online resource (252 pages)
Content type: text
Media type: computer
Carrier type: online resource
ISBN: 9781118649497
Subject(s): Cluster analysis
Genre/Form: Electronic books.
Additional physical formats: Print version: Co-Clustering : Models, Algorithms and Applications
DDC classification: 519.53
LOC classification: QA278.G683 2014eb
Contents: Cover -- Title page -- Table of Contents -- Acknowledgment -- Introduction -- I.1. Types and representation of data -- I.1.1. Binary data -- I.1.2. Categorical data -- I.1.3. Continuous data -- I.1.4. Contingency table -- I.1.5. Data representations -- I.2. Simultaneous analysis -- I.2.1. Data analysis -- I.2.2. Co-clustering -- I.2.3. Applications -- I.3. Notation -- I.4. Different approaches -- I.4.1. Two-mode partitioning -- I.4.2. Two-mode hierarchical clustering -- I.4.3. Direct or block clustering -- I.4.4. Biclustering -- I.4.5. Other structures and other aims -- I.5. Model-based co-clustering -- I.6. Outline -- Chapter 1. Cluster Analysis -- 1.1. Introduction -- 1.2. Miscellaneous clustering methods -- 1.2.1. Hierarchical approach -- 1.2.2. The k-means algorithm -- 1.2.3. Other approaches -- 1.3. Model-based clustering and the mixture model -- 1.4. EM algorithm -- 1.4.1. Complete data and complete-data likelihood -- 1.4.2. Principle -- 1.4.3. Application to mixture models -- 1.4.4. Properties -- 1.4.5. EM: an alternating optimization algorithm -- 1.5. Clustering and the mixture model -- 1.5.1. The two approaches -- 1.5.2. Classification likelihood -- 1.5.3. The CEM algorithm -- 1.5.4. Comparison of the two approaches -- 1.5.5. Fuzzy clustering -- 1.6. Gaussian mixture model -- 1.6.1. The model -- 1.6.2. CEM algorithm -- 1.6.3. Spherical form, identical proportions and volumes -- 1.6.4. Spherical form, identical proportions but differing volumes -- 1.6.5. Identical covariance matrices and proportions -- 1.7. Binary data -- 1.7.1. Binary mixture model -- 1.7.2. Parsimonious model -- 1.7.3. Examples of application -- 1.8. Categorical variables -- 1.8.1. Multinomial mixture model -- 1.8.2. Parsimonious model -- 1.9. Contingency tables -- 1.9.1. MNDKI2 algorithm -- 1.9.2. Model-based approach -- 1.9.3. Illustration -- 1.10. Implementation.
1.10.1. Choice of model and of the number of classes -- 1.10.2. Strategies for use -- 1.10.3. Extension to particular situations -- 1.11. Conclusion -- Chapter 2. Model-Based Co-Clustering -- 2.1. Metric approach -- 2.2. Probabilistic models -- 2.3. Latent block model -- 2.3.1. Definition -- 2.3.2. Link with the mixture model -- 2.3.3. Log-likelihoods -- 2.3.4. A complex model -- 2.4. Maximum likelihood estimation and algorithms -- 2.4.1. Variational EM approach -- 2.4.2. Classification EM approach -- 2.4.3. Stochastic EM-Gibbs approach -- 2.5. Bayesian approach -- 2.6. Conclusion and miscellaneous developments -- Chapter 3. Co-Clustering of Binary and Categorical Data -- 3.1. Example and notation -- 3.2. Metric approach -- 3.3. Bernoulli latent block model and algorithms -- 3.3.1. The model -- 3.3.2. Model identifiability -- 3.3.3. Binary LBVEM and LBCEM algorithms -- 3.4. Parsimonious Bernoulli LBMs -- 3.5. Categorical data -- 3.6. Bayesian inference -- 3.7. Model selection -- 3.7.1. The integrated completed log-likelihood (ICL) -- 3.7.2. Penalized information criteria -- 3.8. Illustrative experiments -- 3.8.1. Townships -- 3.8.2. Mero -- 3.9. Conclusion -- Chapter 4. Co-Clustering of Contingency Tables -- 4.1. Measures of association -- 4.1.1. Phi-squared coefficient -- 4.1.2. Mutual information -- 4.2. Contingency table associated with a couple of partitions -- 4.2.1. Associated distributions -- 4.2.2. Associated measures of association -- 4.3. Co-clustering of contingency table -- 4.3.1. Two equivalent approaches -- 4.3.2. Parameter modification of criteria -- 4.3.3. Co-clustering with the phi-squared coefficient -- 4.3.4. Co-clustering with the mutual information -- 4.4. Model-based co-clustering -- 4.4.1. Block model for contingency tables -- 4.4.2. Poisson latent block model -- 4.4.3. Poisson LBVEM and LBCEM algorithms.
4.5. Comparison of all algorithms -- 4.5.1. CROKI2 versus CROINFO -- 4.5.2. CROINFO versus Poisson LBCEM -- 4.5.3. Poisson LBVEM versus Poisson LBCEM -- 4.5.4. Behavior of CROKI2, CROINFO, LBCEM and LBVEM -- 4.6. Conclusion -- Chapter 5. Co-Clustering of Continuous Data -- 5.1. Metric approach -- 5.1.1. Measure of information -- 5.1.2. Summarized data associated with partitions -- 5.1.3. Objective function -- 5.1.4. CROEUC algorithm -- 5.2. Gaussian latent block model -- 5.2.1. The model -- 5.2.2. Gaussian LBVEM and LBCEM algorithms -- 5.2.3. Parsimonious Gaussian latent block models -- 5.3. Illustrative example -- 5.4. Gaussian block mixture model -- 5.4.1. The model -- 5.4.2. GBEM algorithm -- 5.5. Numerical experiments -- 5.5.1. GBEM versus CROEUC and EM -- 5.5.2. Effect of the size of data -- 5.6. Conclusion -- Bibliography -- Index.
Cluster and co-cluster analyses are important tools in a variety of scientific areas. The introduction of this book presents a state of the art of well-established as well as more recent co-clustering methods. The authors mainly deal with two-mode partitioning under different approaches, but pay particular attention to the probabilistic approach.

Chapter 1 concerns clustering in general and model-based clustering in particular. The authors briefly review the classical clustering methods and focus on the mixture model. They present and discuss the use of different mixtures adapted to different types of data. The algorithms used are described, and connections with different classical methods are presented and commented upon. This chapter is useful for tackling the problem of co-clustering under the mixture approach. Chapter 2 is devoted to the latent block model proposed in the mixture approach context. The authors discuss this model in detail and present its interest with regard to co-clustering. Various algorithms are presented in a general context. Chapter 3 focuses on binary and categorical data. It presents, in detail, the appropriate latent block mixture models. Variants of these models and algorithms are presented and illustrated using examples. Chapter 4 focuses on contingency data. Mutual information, phi-squared and model-based co-clustering are studied. Models, algorithms and connections among different approaches are described and illustrated. Chapter 5 presents the case of continuous data. In the same way, the different approaches used in the previous chapters are extended to this situation.

Contents: 1. Cluster Analysis. 2. Model-Based Co-Clustering. 3. Co-Clustering of Binary and Categorical Data. 4. Co-Clustering of Contingency Tables. 5. Co-Clustering of Continuous Data.

About the authors: Gérard Govaert is Professor at the University of Technology of Compiègne, France. He is also a member of the CNRS laboratory Heudiasyc (Heuristics and Diagnosis of Complex Systems). His research interests include latent structure modeling, model selection, model-based cluster analysis, block clustering and statistical pattern recognition. He is one of the authors of the MIXMOD (MIXture MODelling) software. Mohamed Nadif is Professor at the University of Paris Descartes, France, where he is a member of LIPADE (Paris Descartes computer science laboratory) in the Mathematics and Computer Science department. His research interests include machine learning, data mining, model-based cluster analysis, co-clustering, factorization and data analysis.

Cluster analysis is an important tool in a variety of scientific areas. Chapter 1 briefly presents a state of the art of well-established as well as more recent methods. The hierarchical, partitioning and fuzzy approaches are discussed, among others. The authors review the difficulties these classical methods face with high dimensionality, sparsity and scalability. Chapter 2 discusses the interest of co-clustering, presenting different approaches and defining a co-cluster. The authors focus on co-clustering as simultaneous clustering and discuss the cases of binary, continuous and co-occurrence data. The criteria and algorithms are described and illustrated on simulated and real data. Chapter 3 considers model-based co-clustering. A latent block model is defined for different kinds of data. The estimation of parameters and co-clustering is tackled under two approaches: maximum likelihood and classification maximum likelihood. Hard and soft algorithms are described and applied to simulated and real data. Chapter 4 considers co-clustering as matrix approximation. The tri-factorization approach is considered and algorithms based on update rules are described. Links with numerical and probabilistic ...
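For readers who want a concrete feel for the latent block models summarized above, here is a minimal Python sketch of a hard, CEM-style alternation over row and column partitions for binary data. It is only an illustrative reading of the Bernoulli latent block model idea, not the authors' reference implementation: the function name bernoulli_lbcem, the random initialization, the fixed iteration count and the toy data at the end are choices made for this example.

import numpy as np

def bernoulli_lbcem(X, g, m, n_iter=50, eps=1e-10, rng=None):
    """Hard (CEM-style) co-clustering sketch for a binary matrix X (n x d),
    with g row clusters and m column clusters (illustrative, not the book's code)."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    z = rng.integers(g, size=n)          # row cluster labels
    w = rng.integers(m, size=d)          # column cluster labels

    for _ in range(n_iter):
        Z = np.eye(g)[z]                 # n x g row-indicator matrix
        W = np.eye(m)[w]                 # d x m column-indicator matrix

        # M-step: block Bernoulli parameters and log mixing proportions
        block_sizes = np.outer(Z.sum(axis=0), W.sum(axis=0))        # g x m
        alpha = (Z.T @ X @ W + eps) / (block_sizes + 2 * eps)
        log_pi, log_rho = np.log(Z.mean(0) + eps), np.log(W.mean(0) + eps)
        la, l1a = np.log(alpha), np.log(1.0 - alpha)

        # C-step for rows: each row picks its best cluster given the column partition
        z = (X @ W @ la.T + (1 - X) @ W @ l1a.T + log_pi).argmax(axis=1)

        # C-step for columns, given the new row partition
        # (a fuller version would re-estimate alpha in between)
        Z = np.eye(g)[z]
        w = (X.T @ Z @ la + (1 - X).T @ Z @ l1a + log_rho).argmax(axis=1)

    return z, w, alpha

# Toy usage: a 40 x 30 binary matrix with a planted 2 x 2 block structure.
rng = np.random.default_rng(0)
p = np.array([[0.9, 0.1], [0.2, 0.8]])
X = rng.binomial(1, p[np.repeat([0, 1], 20)][:, np.repeat([0, 1], 15)])
z, w, alpha = bernoulli_lbcem(X, g=2, m=2, rng=0)

In practice one would also monitor the classification log-likelihood, restart from several initializations and compare hard assignments with soft, variational ones; that hard/soft distinction appears in the contents above as LBCEM versus LBVEM.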
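The second description also mentions co-clustering viewed as matrix approximation through a tri-factorization with update rules. As a hedged illustration of that general idea, the sketch below uses the standard multiplicative updates for a nonnegative factorization X ≈ F S Gᵀ, which may differ from the exact variant covered in the book; the function name nmtf and its parameters are again choices made for this example.

import numpy as np

def nmtf(X, g, m, n_iter=200, eps=1e-9, rng=None):
    """Nonnegative matrix tri-factorization sketch: X (n x d, nonnegative)
    is approximated by F @ S @ G.T with F (n x g), S (g x m), G (d x m) >= 0."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    F, S, G = rng.random((n, g)), rng.random((g, m)), rng.random((d, m))

    for _ in range(n_iter):
        # Multiplicative updates for the squared Frobenius error;
        # they keep every factor entrywise nonnegative.
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)

    # A common heuristic reads hard co-clusters off the dominant factor
    # per row (argmax over columns of F) and per column (argmax over columns of G).
    return F.argmax(axis=1), G.argmax(axis=1), F, S, G

The latent block models of the chapters above obtain row and column assignments from a probabilistic model rather than from a least-squares factorization, but both views lead to alternating algorithms of a similar flavor.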
Description based on publisher supplied metadata and other sources.
Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2018. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.