authorThibaut Horel <thibaut.horel@gmail.com>2014-11-05 09:42:25 -0500
committerThibaut Horel <thibaut.horel@gmail.com>2014-11-05 09:42:25 -0500
commit00c953741c8a87ed8e970742514ae0b0a5c7a49e (patch)
treec6e31f6caf0527a46d119301a120764cff4a6116
parentee5d2b6954ac2635b65cb75afd3d088692e43b1e (diff)
Add reading notes on distributed submodular maximization
 distributed-submodular/main.tex | 59 +
 1 file changed, 59 insertions(+), 0 deletions(-)
diff --git a/distributed-submodular/main.tex b/distributed-submodular/main.tex
new file mode 100644
index 0000000..3cc2535
--- /dev/null
+++ b/distributed-submodular/main.tex
@@ -0,0 +1,59 @@
+\documentclass[10pt]{article}
+\usepackage{amsmath,amsfonts,amsthm, amssymb}
+
+
+\DeclareMathOperator*{\E}{\mathbb{E}}
+\let\Pr\relax
+\DeclareMathOperator*{\Pr}{\mathbb{P}}
+\newcommand{\inprod}[1]{\left\langle #1 \right\rangle}
+\newcommand{\eqdef}{\mathbin{\stackrel{\rm def}{=}}}
+
+\newtheorem{theorem}{Theorem}
+\newtheorem{lemma}{Lemma}
+
+\author{Thibaut Horel}
+\title{CS 284r -- Distributed Submodular Maximization -- Reading
+notes}
+
+\begin{document}
+
+\maketitle
+
+This article proposes a new algorithm for maximizing a submodular function
+under a cardinality constraint $k$ in a distributed fashion. Specifically, the
+authors propose an algorithm whose approximation ratio is
+$(1-1/e)^2/\min(m, k)$, where $m$ is the number of groups into which the
+ground set is partitioned ($m$ is typically the number of machines used in the
+parallel computation).
+
+The idea of the algorithm is simple: partition the ground set into $m$ groups.
+On each machine (i.e. on each group of the partition), use the greedy
+algorithm to find an approximate solution of size $k$. Then merge the
+solutions and run the greedy algorithm on the merged set (of size at most
+$mk$) to extract a solution of size $k$.
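A minimal sketch of this two-round scheme, assuming a toy set-coverage objective; the function names (`greedy`, `distributed_greedy`) and the example instance are my own, not the paper's notation:

```python
import random

def greedy(f, ground, k):
    """Standard greedy: repeatedly add the element with the largest marginal gain."""
    S = []
    for _ in range(k):
        best = max((e for e in ground if e not in S),
                   key=lambda e: f(S + [e]) - f(S), default=None)
        if best is None:
            break
        S.append(best)
    return S

def distributed_greedy(f, ground, k, m, seed=0):
    """Two-round scheme sketched above: partition the ground set into m groups
    uniformly at random, run greedy on each group, then run greedy again on
    the merged local solutions (a set of size at most m*k)."""
    rng = random.Random(seed)
    groups = [[] for _ in range(m)]
    for e in ground:
        groups[rng.randrange(m)].append(e)
    merged = [e for g in groups for e in greedy(f, g, k)]
    return greedy(f, merged, k)

# Toy monotone submodular objective: coverage of a family of sets.
sets = {0: {1, 2}, 1: {2, 3}, 2: {4}, 3: {1, 4, 5}}
f = lambda S: len(set().union(*(sets[e] for e in S))) if S else 0

print(distributed_greedy(f, list(sets), k=2, m=2))
```

The second greedy pass is what keeps the merged set small: each machine communicates only its $k$ chosen elements, not its whole group.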
+
+For submodular functions whose ground set is embedded in a metric space (as is
+common for submodular functions arising in machine learning; the authors give
+two examples of such functions), one can prove a stronger guarantee: if the
+function satisfies a Lipschitz property (a natural extension of the Lipschitz
+property to set functions of this type) and if the optimal solution is not too
+``isolated'' (i.e. there are enough ground set elements in the neighborhood of
+the optimal solution), then the approximation ratio becomes $(1-1/e)^2$ (up to
+an additive term) when the elements of the ground set are partitioned
+uniformly at random.
+
+An experimental section shows that the presented algorithm performs well
+compared to other submodular maximization approaches.
+
+Overall impression: the results are simple. The approximation ratios are not
+very good without the geometric assumption on the optimal solution, and the
+paper does not give a sense of how common this assumption is. The authors only
+study the example of exemplar-based clustering, even though many other
+examples of submodular functions arise in machine learning. It is a bit
+worrying that they do not study those: should we assume that these functions
+fail to satisfy the geometric structure property?
+
+Open question: on the theoretical side, it seems that better approximation
+ratios could be obtained when bounds on the curvature of the submodular
+function are available. It would be interesting to see whether this is true.
+
+\end{document}