We study the problem of pre-computing auxillary information to support on-line range queries for the sum and max functions on a datacube. For a d-dimensional datacube with size n in each dimension, we propose a dynamic range max data structure with O( d (s; n)L d ) query time, O( d (s; n)L 2d n d=L
Range queries in dynamic OLAP data cubes
โ Scribed by Weifa Liang; Hui Wang; Maria E. Orlowska
- Publisher
- Elsevier Science
- Year
- 2000
- Tongue
- English
- Weight
- 776 KB
- Volume
- 34
- Category
- Article
- ISSN
- 0169-023X
No coin nor oath required. For personal study only.
โฆ Synopsis
A range query applies an aggregation operation (e.g., SUM) over all selected cells of an OLAP data cube where the selection is speciยฎed by providing ranges of values for numeric dimensions. Range sum queries on data cubes are a powerful analysis tool. Many application domains require that data cubes are updated often and the information provided by analysis tools are current or ``near current''. Existing techniques for range sum queries on data cubes, however, can incur update costs in the order of the size of the data cube. Since the size of a data cube is exponential in the number of its dimensions, rebuilding the entire data cube can be very costly and is not realistic. To cope with this dynamic data cube problem, a new approach has been introduced recently, which achieves constant time per range sum query while constraining each update cost within On da2 , where d is the number of dimensions of the data cube and n is the number of distinct values of the domain at each dimension. In this paper, we provide a new algorithm for the problem which requires On 1a3 time for each range sum query and On da3 time for each update. Our algorithm improves the update time by a factor of On da6 in contrast to the current one for the problem On da2 . Like all existing techniques, our approach to answering range sum queries is also based on some precomputed auxiliary information (preยฎx sums) that is used to answer ad hoc queries at run time. Under both the product model and a new model introduced in this paper, the total cost for updates and range queries of the proposed algorithm is smallest compared with the cost by all known algorithms. Therefore our algorithm reduces the overall time complexity for range sum queries signiยฎcantly.
๐ SIMILAR VOLUMES
## Abstract Parallel processing is a key to high performance in very large data warehouse applications that execute complex analytical queries on huge amounts of data. Although parallel database systems (PDBSs) have been studied extensively in the past decades, the specifics of load balancing in pa
## Abstract Satellite laser ranging (SLR) data collected at sites around the world between 1984 and 1990 by the WEGENER/MEDLAS Group (Working Group of European Geoscientists for the Establishment of Networks for Earthquake Research/MEDiterranean Laser Ranging) have been processed to give six annual