Many algorithms have been proposed to approximate holistic aggregates, such as quantiles and heavy hitters, over data streams. However, little work has been done to explore what techniques are required to incorporate these algorithms in a data stream query processor, and to make them useful in practice. In this paper, we study the performance implications of using user-defined aggregate functions (UDAFs) to incorporate selection-based and sketch-based algorithms for holistic aggregates into a data stream management system's query processing architecture. We identify key performance bottlenecks and tradeoffs, and propose novel techniques to make these holistic UDAFs fast and space-efficient for use in high-speed data stream applications. We evaluate performance using generated and actual IP packet data, focusing on approximating quantiles and heavy hitters. The best of our current implementations can process streaming queries at OC48 speeds (2 x 2.4 Gbps).