author    Thibaut Horel <thibaut.horel@gmail.com>    2014-11-03 10:11:48 -0500
committer Thibaut Horel <thibaut.horel@gmail.com>    2014-11-03 10:11:48 -0500
commit    ee5d2b6954ac2635b65cb75afd3d088692e43b1e (patch)
tree      fa8d196177e4702c59eef24f7aafbbb3d83a5e46 /mapreduce-sub/main.tex
parent    59560fca8caa4d76abe2fe8597d806e0a897874a (diff)
download  cs284-ee5d2b6954ac2635b65cb75afd3d088692e43b1e.tar.gz
Add reading notes on submodular MR
Diffstat (limited to 'mapreduce-sub/main.tex')
-rw-r--r--  mapreduce-sub/main.tex  60
1 file changed, 60 insertions, 0 deletions
diff --git a/mapreduce-sub/main.tex b/mapreduce-sub/main.tex
new file mode 100644
index 0000000..2566ff2
--- /dev/null
+++ b/mapreduce-sub/main.tex
@@ -0,0 +1,60 @@
+\documentclass[10pt]{article}
+\usepackage{amsmath,amsfonts,amsthm, amssymb}
+
+
+\DeclareMathOperator*{\E}{\mathbb{E}}
+\let\Pr\relax
+\DeclareMathOperator*{\Pr}{\mathbb{P}}
+\newcommand{\inprod}[1]{\left\langle #1 \right\rangle}
+\newcommand{\eqdef}{\mathbin{\stackrel{\rm def}{=}}}
+
+\newtheorem{theorem}{Theorem}
+\newtheorem{lemma}{Lemma}
+
+\author{Thibaut Horel}
+\title{CS 284r -- Fast Greedy Algorithms in MapReduce and Streaming -- Reading
+notes}
+
+\begin{document}
+
+\maketitle
+
+This paper proposes several MapReduce algorithms for various submodular
+maximization problems. Specifically, the authors consider the following
+problems: maximizing a monotone submodular function under a cardinality
+constraint, maximizing a modular function under a matroid constraint, and
+maximizing a monotone submodular function under one or several knapsack
+constraints, or under a $p$-system constraint.
+
+At a high level, all their algorithms rely on the \textsc{Sample\&Prune}
+method, a parallelizable way of selecting good solutions for a function $f$.
+In this method, a seed solution is obtained by sampling elements from
+a universe (the sampling parameter is chosen so that the sampled data fits in
+the memory of a single machine); the universe is then pruned by removing all
+the elements which cannot improve the quality of the sampled solution.
+
+For submodular maximization under a cardinality constraint,
+\textsc{Sample\&Prune} is applied to a thresholding algorithm which simulates
+the $\epsilon$-greedy algorithm. The function to which \textsc{Sample\&Prune}
+is applied simply selects the elements whose marginal contribution is larger
+than the current threshold.
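A sequential version of the thresholding idea looks roughly like this (my own sketch, not the paper's pseudocode): the threshold starts at the largest singleton value and decreases geometrically by a factor $(1-\epsilon)$, and at each threshold level every element whose marginal contribution clears the threshold is added.

```python
def threshold_greedy(universe, f, k, eps=0.2):
    """Illustrative sequential thresholding greedy for max f(S), |S| <= k.

    f is a monotone submodular set function taking a list of elements.
    Names and the stopping rule are assumptions for this sketch.
    """
    solution = []
    d = max(f([e]) for e in universe)  # largest singleton value
    tau = d
    # Stop once the threshold is negligibly small or the budget is full.
    while tau > (eps / len(universe)) * d and len(solution) < k:
        for e in universe:
            if len(solution) >= k:
                break
            if e in solution:
                continue
            # Marginal contribution of e w.r.t. the current solution.
            gain = f(solution + [e]) - f(solution)
            if gain >= tau:
                solution.append(e)
        tau *= (1 - eps)
    return solution
```

In the MapReduce version, it is this inner element-selection rule (marginal contribution at least the current threshold) that is fed to Sample\&Prune, one round per threshold level.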
+
+For modular maximization under a matroid constraint, \textsc{Sample\&Prune}
+can be used to simulate the standard greedy algorithm, which is known to
+return the optimal solution. This case is much simpler than the submodular
+maximization case in that the marginal contributions of the elements do not
+vary throughout the algorithm, so the thresholding method is not required.
+Similar ideas are used for knapsack constraints and $p$-system constraints.
+Finally, experiments show how the quality of the solution progresses as the
+number of rounds increases.
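For reference, the standard greedy that Sample\&Prune simulates in the modular/matroid case is simply weight-sorted selection against an independence oracle. The sketch below uses a hypothetical `independent` membership oracle; because the objective is modular, each element's gain is fixed, which is exactly why no thresholding is needed.

```python
def greedy_modular_matroid(elements, weight, independent):
    """Standard greedy for maximizing a modular (additive) function
    over a matroid; `independent` is an independence oracle
    (illustrative interface)."""
    solution = []
    # Sort once by weight: for a modular objective the gains never change.
    for e in sorted(elements, key=weight, reverse=True):
        if weight(e) <= 0:
            break  # negative-weight elements can only hurt the objective
        if independent(solution + [e]):
            solution.append(e)
    return solution
```

For example, with a partition matroid (at most one element per group), the greedy keeps the heaviest element of each group, which is optimal.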
+
+General impression: this is a natural generalization of the Max-Cover paper.
+\textbf{However}, they claim an improvement in the number of rounds required:
+this hides the fact that they implicitly assume access to an oracle computing
+the marginal contributions, hence there is more work to be done in each round
+(the MapReduce framework only requires that the running time of a round be
+polynomial). The experiments do not show a running-time speedup.
+Open question: it would be interesting to see whether the partial enumeration
+method for knapsack constraints can be turned into a MapReduce algorithm.
+
+\end{document}