Uppsala universitet
Information Technology

SCSQ - SuperComputer Stream Query processor

This work was funded by VINNOVA, SSF, and ASTRON.

Instruments such as radio telescopes, colliders, sensor networks, loggers, and simulators generate very high-volume data streams that scientists, engineers, and monitoring systems analyze to detect and understand physical phenomena. Analyzing these streams often requires advanced computations, which in turn demand substantial hardware resources and scalable stream processing.

We address these challenges by developing SCSQ (Supercomputer Stream Query processor, pronounced 'sisque'), a Data Stream Management System (DSMS) that enables queries over high-volume distributed streams. We have developed a SCSQ prototype that runs on a variety of platforms, from Windows to IBM BlueGene massively parallel computers. SCSQ enables high-level specification of distributed stream queries involving advanced computations in such heterogeneous communication and computation environments. SCSQ queries filter, transform, and join data from different kinds of distributed streaming data sources.
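To make the three query operations named above concrete, the following is a minimal sketch in plain Python. It is not SCSQL and does not reflect how SCSQ is implemented. The tuple layout `(timestamp, source, value)`, the function names, and the sample readings are all hypothetical, and the join assumes each stream is ordered by timestamp with at most one reading per timestamp.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical tuple layout: (timestamp, source id, measured value).
Reading = Tuple[int, str, float]


def filter_stream(stream: Iterable[Reading], threshold: float) -> Iterator[Reading]:
    """Filter: keep only readings whose value exceeds the threshold."""
    return (r for r in stream if r[2] > threshold)


def transform_stream(stream: Iterable[Reading], scale: float) -> Iterator[Reading]:
    """Transform: rescale every value, leaving timestamp and source unchanged."""
    return ((ts, src, value * scale) for ts, src, value in stream)


def join_on_timestamp(
    left: Iterable[Reading], right: Iterable[Reading]
) -> Iterator[Tuple[int, float, float]]:
    """Join: pair readings that carry the same timestamp.

    Assumes both inputs are ordered by timestamp and contain at most one
    reading per timestamp, so a single forward pass suffices.
    """
    left_it, right_it = iter(left), iter(right)
    l, r = next(left_it, None), next(right_it, None)
    while l is not None and r is not None:
        if l[0] == r[0]:
            yield (l[0], l[2], r[2])
            l, r = next(left_it, None), next(right_it, None)
        elif l[0] < r[0]:
            l = next(left_it, None)  # left is behind; advance it
        else:
            r = next(right_it, None)  # right is behind; advance it


def run_example() -> List[Tuple[int, float, float]]:
    # Made-up sample data standing in for two streaming sources.
    source_a = [(1, "a", 0.2), (2, "a", 0.9), (3, "a", 1.5), (4, "a", 0.1)]
    source_b = [(1, "b", 1.0), (3, "b", 2.0), (4, "b", 3.0)]
    strong_a = filter_stream(source_a, threshold=0.5)
    scaled_b = transform_stream(source_b, scale=10.0)
    return list(join_on_timestamp(strong_a, scaled_b))


if __name__ == "__main__":
    print(run_example())
```

Generators are used so that each operator consumes its input one tuple at a time, which mirrors the streaming setting; a real DSMS would additionally handle unbounded input, windows, and distribution across nodes.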

An important target application for the SCSQ technology was LOFAR, a large digital radio telescope being developed in the Netherlands. An antenna array distributed over the Netherlands and Germany produces massive amounts of data, which are streamed through heterogeneous cluster computers that include Linux clusters and a 12,000-node IBM BlueGene. SCSQ runs in this massively parallel and heterogeneous computing environment, where it optimizes and executes queries over the data streams from the digital receivers of LOFAR's radio signals.

The SCSQ prototype has been evaluated using the Linear Road Benchmark, which is a simulation of a toll expressway system producing data streams to be processed by a data stream management system. The implementation is called SCSQ-LR.

The SCSQ prototype is being further developed in the iStreams project, where data stream management techniques are applied to searching and analyzing industrial streams.

New: SCSQ-PLR, the massively scalable parallel implementation of Linear Road, is now network bound and achieves orders of magnitude better scalability (L>512) than any previously published results for the Linear Road Benchmark:

E. Zeitler and T. Risch: Massive scale-out of expensive continuous queries, presented at 37th International Conference on Very Large Databases, VLDB 2011, in Proceedings of the VLDB Endowment, Vol. 4, No. 11, 2011.

E. Zeitler and T. Risch: Scalable Splitting of Massive Data Streams, in Proc. 15th Conf. on Database Systems for Advanced Applications, DASFAA 2010, Tokyo, Japan, 1-4 April, 2010 (abstract).

Publications

The first overview of the SCSQ project was given in:

The following paper shows how the flexibility of the SCSQ query language (SCSQL, pronounced 'Siskel') can be used for investigating the performance of a heterogeneous and massively parallel computer environment:

SCSQ was applied to large-scale collective traffic systems in:


There are popular presentations in ASTRON News and European Space Agency in Sweden.


People

Tore Risch is responsible for this project. It is the basis for the PhD work of Erik Zeitler.

© 2007 Uppsala Universitet, Department of Information Technology, Box 337, 751 05 Uppsala, Sweden | This page is maintained by Tore Risch