Computational fundamentals of analyzing and mining data streams, March 2007. Tutorial at Workshop on Data Stream Analysis, Caserta, Italy.

Many scenarios, such as network analysis, utility monitoring, and financial applications, generate massive streams of data. These streams consist of millions or billions of simple updates every hour, and must be processed to extract the information that they describe only in tiny pieces. This survey will introduce the problems of data stream monitoring, and some of the techniques that have been developed over recent years to help find the nuggets of gold without drowning in these massive streams of information.

In particular, this talk will introduce the fundamental techniques used to create compact summaries of data streams: sampling, sketching, and other synopsis techniques. It will describe how to extract features from the stream such as frequent items, medians, and association rules. Lastly, we will see methods to detect when and how the process generating the stream is evolving, indicating some important change has occurred.
