Efficient Skyline Computation in MapReduce

Kasper Mullesgaard, Jens Laurits Pedersen, Hua Lu, Yongluan Zhou

Publikation: Bidrag til bog/antologi/rapportKonferencebidrag i proceedingsForskningpeer review

Abstract

Skyline queries are useful for finding interesting tuples from
a large data set according to multiple criteria. The sizes of
data sets are constantly increasing and the architecture of
back-ends are switching from single-node environments to
non-conventional paradigms like MapReduce. Despite the
usefulness of skyline queries, existing works on skyline com-
putation in MapReduce do not take full advantage of paral-
lelism but still run significant parts serially. In this paper,
we propose a novel approach to compute skylines efficiently
in MapReduce. We design a grid partitioning scheme to di-
vide the data space into partitions, and employ a bitstring to
represent the partitions. The bitstring is efficiently obtained
in MapReduce, and it clearly helps prune partitions (and
tuples) that cannot have skyline tuples. Based on the grid
partitioning, we propose two MapReduce algorithms to com-
pute skylines. Both algorithms utilize the bitstring and dis-
tribute the original tuples to multiple mappers and make use
of them to compute local skylines in parallel. In particular,
MapReduce Grid Partitioning based Single-Reducer Skyline
Computation (MR-GPSRS) employs a single reducer to as-
semble the local skylines appropriately to compute the glob-
al skyline. In contrast, MapReduce Grid Partitioning based
Multiple Reducer Skyline Computation (MR-GPMRS) fur-
ther divides local skylines and distributes them to multiple
reducers that compute the global skyline in an independent
and parallel manner. The proposed algorithms are evalu-
ated through extensive experiments, and the results show
that MR-GPMRS significantly outperforms the alternatives
in various settings
OriginalsprogEngelsk
TitelAdvances in Database Technology — EDBT 2014 : 17th International Conference on Extending Database Technology Athens, Greece, March 24–28, 2014 Proceedings
RedaktørerSihem Amer-Yahia, Vassilis Christophides, Anastasios Kementsietsidis, Minos N. Garofalakis, Stratos Idreos, Vincent Leroy
Antal sider12
ForlagOpenProceedings.org
Publikationsdato2014
Sider37-48
ISBN (Elektronisk)978-3-89318065-3
DOI
StatusUdgivet - 2014
Udgivet eksterntJa
Begivenhed17th International Conference on Extending Database Technology - Athen, Grækenland
Varighed: 24 mar. 201428 mar. 2014
Konferencens nummer: 17

Konference

Konference17th International Conference on Extending Database Technology
Nummer17
Land/OmrådeGrækenland
ByAthen
Periode24/03/201428/03/2014

Citer dette