Abstract
Skyline queries are useful for finding interesting tuples from
a large data set according to multiple criteria. The sizes of
data sets are constantly increasing and the architecture of
back-ends are switching from single-node environments to
non-conventional paradigms like MapReduce. Despite the
usefulness of skyline queries, existing works on skyline com-
putation in MapReduce do not take full advantage of paral-
lelism but still run significant parts serially. In this paper,
we propose a novel approach to compute skylines efficiently
in MapReduce. We design a grid partitioning scheme to di-
vide the data space into partitions, and employ a bitstring to
represent the partitions. The bitstring is efficiently obtained
in MapReduce, and it clearly helps prune partitions (and
tuples) that cannot have skyline tuples. Based on the grid
partitioning, we propose two MapReduce algorithms to com-
pute skylines. Both algorithms utilize the bitstring and dis-
tribute the original tuples to multiple mappers and make use
of them to compute local skylines in parallel. In particular,
MapReduce Grid Partitioning based Single-Reducer Skyline
Computation (MR-GPSRS) employs a single reducer to as-
semble the local skylines appropriately to compute the glob-
al skyline. In contrast, MapReduce Grid Partitioning based
Multiple Reducer Skyline Computation (MR-GPMRS) fur-
ther divides local skylines and distributes them to multiple
reducers that compute the global skyline in an independent
and parallel manner. The proposed algorithms are evalu-
ated through extensive experiments, and the results show
that MR-GPMRS significantly outperforms the alternatives
in various settings
a large data set according to multiple criteria. The sizes of
data sets are constantly increasing and the architecture of
back-ends are switching from single-node environments to
non-conventional paradigms like MapReduce. Despite the
usefulness of skyline queries, existing works on skyline com-
putation in MapReduce do not take full advantage of paral-
lelism but still run significant parts serially. In this paper,
we propose a novel approach to compute skylines efficiently
in MapReduce. We design a grid partitioning scheme to di-
vide the data space into partitions, and employ a bitstring to
represent the partitions. The bitstring is efficiently obtained
in MapReduce, and it clearly helps prune partitions (and
tuples) that cannot have skyline tuples. Based on the grid
partitioning, we propose two MapReduce algorithms to com-
pute skylines. Both algorithms utilize the bitstring and dis-
tribute the original tuples to multiple mappers and make use
of them to compute local skylines in parallel. In particular,
MapReduce Grid Partitioning based Single-Reducer Skyline
Computation (MR-GPSRS) employs a single reducer to as-
semble the local skylines appropriately to compute the glob-
al skyline. In contrast, MapReduce Grid Partitioning based
Multiple Reducer Skyline Computation (MR-GPMRS) fur-
ther divides local skylines and distributes them to multiple
reducers that compute the global skyline in an independent
and parallel manner. The proposed algorithms are evalu-
ated through extensive experiments, and the results show
that MR-GPMRS significantly outperforms the alternatives
in various settings
Originalsprog | Engelsk |
---|---|
Titel | Advances in Database Technology — EDBT 2014 : 17th International Conference on Extending Database Technology Athens, Greece, March 24–28, 2014 Proceedings |
Redaktører | Sihem Amer-Yahia, Vassilis Christophides, Anastasios Kementsietsidis, Minos N. Garofalakis, Stratos Idreos, Vincent Leroy |
Antal sider | 12 |
Forlag | OpenProceedings.org |
Publikationsdato | 2014 |
Sider | 37-48 |
ISBN (Elektronisk) | 978-3-89318065-3 |
DOI | |
Status | Udgivet - 2014 |
Udgivet eksternt | Ja |
Begivenhed | 17th International Conference on Extending Database Technology - Athen, Grækenland Varighed: 24 mar. 2014 → 28 mar. 2014 Konferencens nummer: 17 |
Konference
Konference | 17th International Conference on Extending Database Technology |
---|---|
Nummer | 17 |
Land/Område | Grækenland |
By | Athen |
Periode | 24/03/2014 → 28/03/2014 |