diff options
| author | Thibaut Horel <thibaut.horel@gmail.com> | 2014-11-03 10:11:48 -0500 |
|---|---|---|
| committer | Thibaut Horel <thibaut.horel@gmail.com> | 2014-11-03 10:11:48 -0500 |
| commit | ee5d2b6954ac2635b65cb75afd3d088692e43b1e (patch) | |
| tree | fa8d196177e4702c59eef24f7aafbbb3d83a5e46 /mapreduce-sub/main.tex | |
| parent | 59560fca8caa4d76abe2fe8597d806e0a897874a (diff) | |
| download | cs284-ee5d2b6954ac2635b65cb75afd3d088692e43b1e.tar.gz | |
Add reading notes on submodular MR
Diffstat (limited to 'mapreduce-sub/main.tex')
| -rw-r--r-- | mapreduce-sub/main.tex | 60 |
1 files changed, 60 insertions, 0 deletions
diff --git a/mapreduce-sub/main.tex b/mapreduce-sub/main.tex new file mode 100644 index 0000000..2566ff2 --- /dev/null +++ b/mapreduce-sub/main.tex @@ -0,0 +1,60 @@ +\documentclass[10pt]{article} +\usepackage{amsmath,amsfonts,amsthm, amssymb} + + +\DeclareMathOperator*{\E}{\mathbb{E}} +\let\Pr\relax +\DeclareMathOperator*{\Pr}{\mathbb{P}} +\newcommand{\inprod}[1]{\left\langle #1 \right\rangle} +\newcommand{\eqdef}{\mathbin{\stackrel{\rm def}{=}}} + +\newtheorem{theorem}{Theorem} +\newtheorem{lemma}{Lemma} + +\author{Thibaut Horel} +\title{CS 284r -- Fast Greedy Algorithms in MapReduce and Streaming -- Reading +notes} + +\begin{document} + +\maketitle + +This paper proposes several MapReduce algorithms for various submodular +maximization problems. Specifically, they consider the following problems: +maximize a monotone submodular function under a cardinality constraint, +maximize a modular function under a matroid constraint, maximize a monotone +submodular function under one or several knapsack constraints, or under +$p$-system constraints. + +At a high-level, all their algorithms rely on the \textsc{Sample\&Prune} +method: this method is a parallelizable way of selecting good solutions for +a function $f$. In this method, a seed solution is obtained through sampling +elements from a universe (the sampling parameter is chosen so that the sampled +data fits in memory), then the universe is pruned by removing all the elements +which cannot improve the quality of the sampled solution. + +For submodular maximization under a cardinality constraint, +\textsc{Sample \& Prune} is used on the thresholding algorithm simulating the +$\epsilon$-greedy algorithm. The function on which \textsc{Sample\&Prune} is +applied is simply the function which selects elements whose marginal +contribution is larger than the current threshold. + +For modular maximization under a matroid constraint, \textsc{Sample \& Prune} +can be used to simulate the standard greedy algorithm which is known to return +the optimal solution. This case is much simpler than the submodular +maximization case in that the marginal contribution of the elements does not +vary throughout the algorithm, and the thresholding method is not required. +Similar ideas are used for knapsack-constraints and $p$-systems constraints. +Finally experiments show how the quality of the solution progresses as the +number of rounds increase. + +General impression: natural generalization of the Max-Cover paper. +\textbf{However}, they claim an improvement in the number of rounds required: +this is hiding the fact that they implicitly assume an access to an oracle that +computes the marginal contributions, hence their is more work to be done in +each round (the MapReduce framework only requires that the running time of +a round is polynomial). The experiments do not show a running time speedup. +Open question: it would be interesting to see whether the partial enumeration +method for knapsack constraints can be turned into a map-reduce algorithm. + +\end{document} |
