In many real-world applications that analyze correlations between two groups of diverse entities, each group of entities can be characterized by multiple attributes. As such, there is a need to co-cluster multiple attributes' values into pairs of highly correlated clusters. We denote this co-clustering problem as the multi-attribute co-clustering problem. In this paper, we introduce a generalization of the mutual information between two attributes into mutual information between two attribute sets. The generalized formula enables us to use correlation information to discover multi-attribute co-clusters (MACs). We develop a novel algorithm, MACminer, to mine MACs with high correlation information from datasets. We demonstrate the mining efficiency of MACminer on datasets with multiple attributes, and show that MACs with high correlation information have higher classification and predictive power than MACs generated by alternative high-dimensional data clustering and pattern mining techniques.
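One way to read the step from attribute pairs to attribute sets is to treat the combined values of each set as a single joint variable. The Python sketch below assumes that reading. It is the textbook mutual information between two groups of variables, which may differ from the exact formula used in the paper. The function name `set_mutual_information` and the row format (a list of dicts with categorical values) are hypothetical choices made for illustration.

```python
from collections import Counter
from math import log2


def set_mutual_information(rows, left, right):
    """Mutual information (in bits) between two attribute sets.

    rows  : list of dicts mapping attribute name -> categorical value
    left  : attribute names forming the first set
    right : attribute names forming the second set

    Assumption: each attribute set is treated as one joint variable whose
    value is the tuple of its attributes' values in a row.
    """
    n = len(rows)
    joint = Counter()         # counts of (left tuple, right tuple) pairs
    left_counts = Counter()   # counts of left tuples
    right_counts = Counter()  # counts of right tuples
    for row in rows:
        x = tuple(row[a] for a in left)
        y = tuple(row[a] for a in right)
        joint[(x, y)] += 1
        left_counts[x] += 1
        right_counts[y] += 1

    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) / (p(x) * p(y)) simplifies to c * n / (count_x * count_y)
        mi += (c / n) * log2(c * n / (left_counts[x] * right_counts[y]))
    return mi


# Toy data: attributes c and d copy a and b, so {a, b} fully determines {c, d}.
rows = [
    {"a": 0, "b": 0, "c": 0, "d": 0},
    {"a": 0, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 0, "c": 1, "d": 0},
    {"a": 1, "b": 1, "c": 1, "d": 1},
]
dependent = set_mutual_information(rows, ["a", "b"], ["c", "d"])  # 2.0 bits
independent = set_mutual_information(rows, ["a"], ["d"])          # 0.0 bits
```

In this toy data the two-attribute sets share two bits of information, while the single attributes `a` and `d` share none. A set-level measure can therefore score co-clusters by correlation that spans several attributes at once.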