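Speaker bio: Yuehua Wu (吴月华) is a professor of statistics at York University, Canada. She received her PhD in statistics from the University of Pittsburgh, USA, in 1989, under the supervision of the world-renowned statistician C. R. Rao. Her research areas include financial statistics, spatial statistics, high-dimensional data analysis, change-point detection, and applications in interdisciplinary fields such as environmental science. She is an Elected Member of the International Statistical Institute, leads several major research projects funded by the Canadian government, and has published more than 100 academic papers, including 5 in the top-tier journal Proceedings of the National Academy of Sciences of the United States of America (PNAS).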
Talk title: Association Rule Mining and Market Basket Analysis
Abstract: Current algorithms for mining association rules from transaction data are mostly deterministic and enumerative. Unless the search space is constrained, they can be computationally intractable even for a dataset containing just a few hundred transaction items. In this talk, we first briefly review the Apriori algorithm, and then introduce a Gibbs-sampling-induced stochastic search procedure that randomly samples association rules from the itemset space and performs rule mining on the reduced transaction dataset generated by the sample. We also propose a general rule importance measure to direct the stochastic search. Because the randomly generated association rules constitute an ergodic Markov chain, the overall most important rules in the itemset space can be uncovered from the reduced dataset with probability 1 in the limit. We end the talk by presenting some data examples.
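
For readers unfamiliar with the Apriori algorithm mentioned in the abstract, the following is a minimal sketch of the standard textbook idea: enumerate frequent itemsets level by level, then derive rules that meet a confidence threshold. It is not the speaker's Gibbs-sampling procedure. The function names `apriori` and `rules`, the parameters `min_support` and `min_confidence`, and the toy transactions are all assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset (illustrative sketch)."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in `itemset`.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    level = {frozenset([i]) for i in items}
    level = {s for s in level if support(s) >= min_support}

    frequent = {}
    k = 1
    while level:
        for s in level:
            frequent[s] = support(s)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
        level = {c for c in candidates if support(c) >= min_support}
    return frequent


def rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    out = []
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(itemset, r):
                a = frozenset(antecedent)
                # Subsets of a frequent itemset are frequent, so `a` is present.
                conf = supp / frequent[a]
                if conf >= min_confidence:
                    out.append((a, itemset - a, supp, conf))
    return out


# Hypothetical toy market-basket data.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
freq = apriori(baskets, min_support=0.5)
strong = rules(freq, min_confidence=0.9)
```

The level-wise enumeration here is what makes the approach expensive as the number of items grows: the candidate set can expand combinatorially, which is the intractability the abstract refers to when the search space is left unconstrained.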