Efficient Mining Maximum Frequent Pagesets with Double Dwell Time Constraint

J. Ren, X. Zhang (PRC), and T.B. Hodel-Widmer (Switzerland)

Keywords

Dwell time; Maximum frequent pageset; DTFP-tree

Abstract

Web usage mining is the application of data mining techniques to large web log databases in order to discover frequent pagesets and usage patterns. However, most previous research considers only the whole database, and mining the full set of frequent pagesets and patterns is unrealistic. We therefore introduce a double dwell time constraint that restricts the database according to the decision-maker’s (user’s) mining purpose. Recent work has highlighted the importance of constraint-based Maximum Frequent Pageset (MFP) mining, so we design an efficient algorithm, Maximum Frequent PageSet Mining (MFPSM), for mining MFPs. Building on the FP-tree, we present a data structure called the Dwell Time Frequent Page tree (DTFP-tree) to store the session database. The DTFP-tree is smaller than the original FP-tree and simplifies the setting of time thresholds during mining. Our experiments show that the algorithm significantly reduces mining runtime, provided that the decision-makers (users) supply appropriate dwell time constraints, and that it outperforms other algorithms for mining MFPs.
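
To make the idea of a dwell time constraint concrete, the following is a minimal sketch and not the MFPSM algorithm or the DTFP-tree. It assumes that "double dwell time" means a lower and an upper bound on the time spent on a page, which the abstract does not state explicitly. The names filter_by_dwell, maximal_frequent_pagesets, t_min, t_max and min_support are hypothetical, the session data is invented for illustration, and the maximal pagesets are found by brute-force enumeration, which is practical only for tiny inputs.

```python
from itertools import combinations


def filter_by_dwell(sessions, t_min, t_max):
    """Keep only the pages whose dwell time lies in [t_min, t_max].

    Assumption: the double dwell time constraint is a pair of bounds.
    Each session is a list of (page, dwell_seconds) tuples.
    """
    return [
        {page for page, dwell in session if t_min <= dwell <= t_max}
        for session in sessions
    ]


def maximal_frequent_pagesets(pagesets, min_support):
    """Brute-force search for maximal frequent pagesets.

    A pageset is frequent if at least min_support sessions contain it,
    and maximal if no proper superset of it is also frequent.
    This stands in for a tree-based miner purely for illustration.
    """
    pages = sorted(set().union(*pagesets)) if pagesets else []
    frequent = []
    for k in range(1, len(pages) + 1):
        for cand in combinations(pages, k):
            c = frozenset(cand)
            if sum(1 for s in pagesets if c <= s) >= min_support:
                frequent.append(c)
    # Keep only the sets that have no frequent proper superset.
    return [f for f in frequent if not any(f < g for g in frequent)]


# Invented sessions: (page, dwell time in seconds).
sessions = [
    [("A", 30), ("B", 45), ("C", 2)],
    [("A", 20), ("B", 60), ("D", 500)],
    [("A", 15), ("C", 40)],
]

filtered = filter_by_dwell(sessions, t_min=5, t_max=300)
result = maximal_frequent_pagesets(filtered, min_support=2)
```

With these bounds, the very short visit to C and the very long visit to D are dropped before mining, so the constrained database is smaller than the raw log. This is the effect the abstract attributes to choosing appropriate dwell time constraints.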
