Detecting the core structure of a database is one
of the main objectives of data mining. In pattern set mining, many
methods do so by mining a small set of patterns that together
summarize the dataset efficiently. The better these patterns are,
the more effective the summarization of the database.
Most of these methods are based on the Minimum Description
Length principle. Here, we focus on event sequence databases.
Rather than mining a small set of significant
patterns, we propose a novel method that summarizes an event
sequence dataset by constructing a compact big sequence, namely
BigSeq. BigSeq preserves all characteristics of the original event
sequences. It is constructed efficiently via the longest
common subsequence and a novel definition of the compatible
event set. Experimental results show that the BigSeq method
outperforms state-of-the-art methods such as Gokrimp with
respect to compression ratio, total response time, and number of
detected patterns.