Efficient Skyline Computation in MapReduce

Kasper Mullesgaard, Jens Laurits Pedersen, Hua Lu, Yongluan Zhou

Research output: Chapter in Book/Report/Conference proceedingArticle in proceedingsResearchpeer-review

Abstract

Skyline queries are useful for finding interesting tuples from
a large data set according to multiple criteria. The sizes of
data sets are constantly increasing and the architecture of
back-ends are switching from single-node environments to
non-conventional paradigms like MapReduce. Despite the
usefulness of skyline queries, existing works on skyline com-
putation in MapReduce do not take full advantage of paral-
lelism but still run significant parts serially. In this paper,
we propose a novel approach to compute skylines efficiently
in MapReduce. We design a grid partitioning scheme to di-
vide the data space into partitions, and employ a bitstring to
represent the partitions. The bitstring is efficiently obtained
in MapReduce, and it clearly helps prune partitions (and
tuples) that cannot have skyline tuples. Based on the grid
partitioning, we propose two MapReduce algorithms to com-
pute skylines. Both algorithms utilize the bitstring and dis-
tribute the original tuples to multiple mappers and make use
of them to compute local skylines in parallel. In particular,
MapReduce Grid Partitioning based Single-Reducer Skyline
Computation (MR-GPSRS) employs a single reducer to as-
semble the local skylines appropriately to compute the glob-
al skyline. In contrast, MapReduce Grid Partitioning based
Multiple Reducer Skyline Computation (MR-GPMRS) fur-
ther divides local skylines and distributes them to multiple
reducers that compute the global skyline in an independent
and parallel manner. The proposed algorithms are evalu-
ated through extensive experiments, and the results show
that MR-GPMRS significantly outperforms the alternatives
in various settings
Original languageEnglish
Title of host publicationAdvances in Database Technology — EDBT 2014 : 17th International Conference on Extending Database Technology Athens, Greece, March 24–28, 2014 Proceedings
EditorsSihem Amer-Yahia, Vassilis Christophides, Anastasios Kementsietsidis, Minos N. Garofalakis, Stratos Idreos, Vincent Leroy
Number of pages12
PublisherOpenProceedings.org
Publication date2014
Pages37-48
ISBN (Electronic)978-3-89318065-3
DOIs
Publication statusPublished - 2014
Externally publishedYes
Event17th International Conference on Extending Database Technology - Athen, Greece
Duration: 24 Mar 201428 Mar 2014
Conference number: 17

Conference

Conference17th International Conference on Extending Database Technology
Number17
Country/TerritoryGreece
CityAthen
Period24/03/201428/03/2014

Cite this