Recommended Readings


Association Rules

·  R. Agrawal, T. Imielinski, and A. Swami. Mining association rules between sets of items in large databases. SIGMOD, 207-216, 1993.

·  R. Agrawal and R. Srikant. Fast algorithms for mining association rules. VLDB, 487-499, 1994.

·  S. Brin, R. Motwani, J. D. Ullman, and S. Tsur. Dynamic itemset counting and implication rules for market basket analysis. SIGMOD, 255-264, 1997.

·  J.S. Park, M.S. Chen, and P.S. Yu. An effective hash-based algorithm for mining association rules. SIGMOD, 175-186, 1995.

·  A. Savasere, E. Omiecinski, and S. Navathe. An efficient algorithm for mining association rules in large databases. VLDB, 432-444, 1995.

·  H. Toivonen. Sampling large databases for association rules. VLDB, 134-145, 1996.

·  M. J. Zaki, S. Parthasarathy, M. Ogihara, and W. Li. Parallel algorithms for discovery of association rules. Data Mining and Knowledge Discovery, 1:343-374, 1997.

·  R. Agarwal, C. Aggarwal, and V. V. V. Prasad. A tree projection algorithm for generation of frequent itemsets. Journal of Parallel and Distributed Computing, 61(3):350-371, 2001.

·  J. Han, J. Pei, and Y. Yin. Mining frequent patterns without candidate generation. SIGMOD, 1-12, 2000.

·  R. J. Bayardo. Efficiently mining long patterns from databases. SIGMOD, 85-93, 1998.

·  N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal. Discovering frequent closed itemsets for association rules. ICDT, 398-416, 1999.

·  J. Pei, J. Han, and R. Mao. CLOSET: An Efficient Algorithm for Mining Frequent Closed Itemsets. DMKD, 11-20, 2000.

Sequential Pattern Mining

·  R. Srikant and R. Agrawal. Mining sequential patterns: Generalizations and performance improvements. EDBT, 3-17, 1996.

·  J. Pei, J. Han, H. Pinto, Q. Chen, U. Dayal, and M.-C. Hsu. PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth. ICDE, 215-224, 2001.

·  J. Yang, W. Wang, P. S. Yu, and J. Han. Mining long sequential patterns in a noisy environment. SIGMOD, 406-417, 2002.

·  J. Yang, W. Wang, and P. Yu. InfoMiner: mining surprising periodical patterns. KDD, 2001.

·  J. Yang, W. Wang, and P. Yu. Mining asynchronous periodical patterns in time series data. KDD, 2000.

Mining Graphs

·  X. Yan and J. Han. gSpan: Graph-Based Substructure Pattern Mining. ICDM, 2002.

·  J. Huan, W. Wang, and J. Prins. Efficient mining of frequent subgraphs in the presence of isomorphism. ICDM, 2003.

 

Clustering Part I

·  P. Berkhin. Survey of clustering data mining techniques, 2002.

·  R. Ng and J. Han. Efficient and effective clustering method for spatial data mining. VLDB, 144-155, 1994.

Clustering Part II

·  T. Zhang, R. Ramakrishnan, and M. Livny. BIRCH: an efficient data clustering method for very large databases. SIGMOD, 103-114, 1996.

·  M. Ester, H.-P. Kriegel, J. Sander, and X. Xu. A density-based algorithm for discovering clusters in large spatial databases. KDD, 226-231, 1996.

·  S. Guha, R. Rastogi, and K. Shim. CURE: an efficient clustering algorithm for large databases. SIGMOD, 73-84, 1998.

Clustering Part III

·  W. Wang, J. Yang, and R. Muntz. STING: a statistical information grid approach to spatial data mining. VLDB, 186-195, 1997.

·  G. Sheikholeslami, S. Chatterjee, and A. Zhang. WaveCluster: a multi-resolution clustering approach for very large spatial databases. VLDB, 428-439, 1998.

·  R. Agrawal, J. Gehrke, D. Gunopulos, and P. Raghavan. Automatic subspace clustering of high dimensional data for data mining. SIGMOD, 94-105, 1998.

Clustering Part IV

·  J. Yang, W. Wang, H. Wang, and P. Yu. Delta-cluster: capturing subspace correlation in a large data set. ICDE, 517-528, 2002.

·  H. Wang, W. Wang, J. Yang, and P. Yu. Clustering by pattern similarity in large data sets. SIGMOD, 394-405, 2002.

 

 

Presentation Papers:

·         I. Davidson and S. S. Ravi. Clustering under Constraints: Feasibility Issues and the K-Means Algorithm. 5th SIAM Data Mining Conference, 2005.

·         I. Davidson, K. Wagstaff, and S. Basu. Measuring Constraint-Set Utility for Partitional Clustering Algorithms. To appear in the Proceedings of ECML/PKDD 2006.

·         Jinze Liu, Qi Zhang, Wei Wang, Leonard McMillan, and Jan Prins. Clustering pair-wise dissimilarity data into partially ordered sets. Proceedings of KDD, 637-642, 2006.

·         Christian Böhm, Christos Faloutsos, Jia-Yu Pan, and Claudia Plant. Robust Information-theoretic Clustering. Proceedings of KDD, 2006.

·         Elke Achtert, Christian Böhm, Hans-Peter Kriegel, Peer Kröger, and Arthur Zimek. Deriving Quantitative Models for Correlation Clusters. Proceedings of KDD, 2006.

·         Chris Ding, Tao Li, Wei Peng, and Haesun Park. Orthogonal Nonnegative Matrix Tri-factorizations for Clustering. Proceedings of KDD, 2006.

 

 

Classification

·         J. Gehrke, R. Ramakrishnan, and V. Ganti. RainForest: A framework for fast decision tree construction of large datasets. Data Mining and Knowledge Discovery, 2000.

·         B. Liu, W. Hsu, and Y. Ma. Integrating classification and association rule mining. KDD, 1998.

 

Network Analysis

·         S. Brin and L. Page. The anatomy of a large-scale hypertextual web search engine, World Wide Web Conference, 1998.

·         S. White and P. Smyth. Algorithms for estimating relative importance in networks, KDD, 2003.

·         X. Yin, J. Han, J. Yang, and P. Yu. CrossMine: Efficient Multi-relational Classification, ICDE, 2004.

 

Presentation Papers:

·         Eitan Hirsh and Roded Sharan. Identification of conserved protein complexes based on a model of protein network evolution. Bioinformatics Journal, http://bioinformatics.oxfordjournals.org/cgi/content/full/23/2/e170

·         Roded Sharan, Silpa Suthram, Ryan M. Kelley, Tanja Kuhn, Scott McCuine, Peter Uetz, Taylor Sittler, Richard M. Karp, and Trey Ideker. Conserved patterns of protein interaction in multiple species. Proceedings of the National Academy of Sciences (PNAS). http://www.pnas.org/cgi/content/abstract/102/6/1974

·         Koyuturk et al. Pairwise Alignment of Protein Interaction Networks. http://www.liebertonline.com/doi/abs/10.1089/cmb.2006.13.182

·         J. Liu, S. Paulsen, X. Sun, W. Wang, A. Nobel, J. Prins. Mining approximate frequent itemsets in the presence of noise: algorithm and analysis, SDM 2006.