% jmlr-18-385.tex, Thibaut Horel <thibaut.horel@gmail.com>, 2018-08-10
% Commit message: Add JMLR review
\documentclass[12pt]{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage[hmargin=1.2in, vmargin=1.2in]{geometry}

\title{Review of \emph{Approximate Submodular Functions and Performance
Guarantees}}
\date{}
\begin{document}

\maketitle

\vspace{-3em}

\paragraph{Summary.} This paper studies $\delta$-approximate submodular
functions: these are functions $f$ for which there exists a submodular function
whose ``marginal gains'' approximate the marginal gains of $f$ within
a multiplicative factor of $(1\pm\delta)$. The contributions are:
\begin{itemize}
  \item this notion of approximate submodularity is related to the existing
    notion of ``submodularity ratio'': a necessary condition on $\delta$
    (involving the submodularity ratio) is given for a function to be
    $\delta$-approximate submodular;
  \item an approximation bound is derived on the value attained by the greedy
    algorithm when maximizing a $\delta$-approximate submodular function;
  \item two functions are shown to be $\delta$-approximately submodular: the
    trace of the inverse Gramian matrix of the observations (an important
    quantity in A-optimal experiment design) and the minimum eigenvalue of
    the Gramian;
  \item an experiment shows that the approximation bound derived for
    $\delta$-submodular functions compares favorably to the one of Bian et
    al.\ (2017), which uses the notions of submodularity ratio and curvature;
  \item a second experiment shows how to use the notion of
    $\delta$-submodularity to solve a sensor selection problem.
\end{itemize}


\paragraph{General comments.} The general topic of approximate submodularity is
becoming increasingly relevant to explain why the greedy heuristic often works
well when maximizing non-submodular functions.
An important line of work has
focused on the notion of submodularity ratio to measure the divergence from
submodularity. The notion of $\delta$-approximate submodularity introduced in
this paper offers an interesting new perspective on the same question. This
definition is justified by the fact that the authors were able to identify two
functions satisfying it. Furthermore, this notion has the following advantage
over the submodularity ratio: once a submodular function $g$ has been
identified that approximates a given non-submodular function $f$, one can
understand the approximation guarantee of the greedy algorithm simply by
looking at properties of $g$, which is typically easier than computing the
submodularity ratio (which takes exponential time). However, both the
theoretical results and the experimental results, as well as the exposition of
this paper, suffer from inaccuracies and gaps that we now describe.

\paragraph{ROS is not a characterization.} The authors claim that the Region of
Submodularity (ROS) characterizes when a function is $\delta$-submodular.
However, Lemma 1 only gives a \emph{necessary} condition on $\delta$ for
a function to be $\delta$-submodular. A complete characterization would also
require showing that for any $f$ and $\delta$ satisfying the bound of Lemma 1,
there indeed exists a submodular function which $\delta$-approximates $f$; this
fact seems false to me, or at least hard to prove. For this reason, I believe
that the notion of region of submodularity is not very relevant: it is
a complicated way of expressing that if a function is $\delta$-submodular, then
this implies a lower bound on its submodularity ratio. This fact is important
and interesting, but I do not believe that the ROS is the best way to express
it (more on this below).
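To make the gap precise: if I read the definition correctly as asking for
a submodular $g$ whose marginal gains $g_S(t) = g(S\cup\{t\}) - g(S)$ satisfy
the sandwich below (my paraphrase of the paper's definition), then Lemma 1 only
establishes the forward implication
\begin{displaymath}
  \exists\, g \text{ submodular}:\;
  (1-\delta)\, f_S(t) \,\leq\, g_S(t) \,\leq\, (1+\delta)\, f_S(t)
  \ \text{ for all } S, t
  \quad\Longrightarrow\quad
  \gamma_f \geq \frac{1-\delta}{1+\delta},
\end{displaymath}
while calling the ROS a characterization would require the converse implication
as well.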
\paragraph{The bound of Theorem 3 is not tighter than the bound of Theorem 2.}
The authors claim at several points in their paper that the performance
guarantee of Theorem 3 is tighter than the one of Theorem 2; illustrating this
is in fact the purpose of the experiment in Section 5.1. However, I believe
that the bound of Theorem 3 is always worse than the one of Theorem 2, as is
made apparent by comparing the proofs of the two theorems. The proofs are
nearly identical, but Theorem 3 uses $\delta$-submodularity to obtain (1)
a lower bound on the submodularity ratio and (2) an upper bound on the
curvature. Since the performance guarantee of Theorem 2 is increasing in the
submodularity ratio and decreasing in the curvature, the guarantee of Theorem
3 is implied by the one of Theorem 2 (see also my comments below on how to
improve the exposition). As a consequence, I am puzzled by the plot in Section
5.1, since it seems to contradict the theory. Very few details are given on how
the submodularity ratio and the curvature were computed; maybe they were not
computed properly?

\paragraph{Computing the performance guarantee does not seem relevant.} The
authors make the following good observation: it is much easier to compute the
performance guarantee given by $\delta$-submodularity than to compute the
parameters $\alpha$ and $\gamma$. So even if the performance guarantee is not
as tight, once a $\delta$-approximation has been found, it provides
a convenient way to compute a lower bound on the approximation guarantee.
However, being able to compute the approximation bound does not seem relevant
in practice: what truly matters is how well the greedy algorithm performs in
practice, not the (worst-case) lower bound provided by the theory.
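For concreteness, this is a minimal generic sketch of the greedy heuristic
under discussion (illustrative only, not the authors' implementation; the
modular objective \texttt{f} below is a placeholder):

```python
def greedy_maximize(f, ground_set, k):
    """Greedily select k elements of ground_set to maximize the set
    function f (assumed monotone, but not necessarily submodular)."""
    selected = set()
    for _ in range(k):
        # Add the element with the largest marginal gain f_S(t).
        best = max(
            (t for t in ground_set if t not in selected),
            key=lambda t: f(selected | {t}) - f(selected),
        )
        selected.add(best)
    return selected

# Placeholder objective: a modular (hence submodular) function, for
# which greedy selection is exactly optimal.
weights = {1: 0.5, 2: 2.0, 3: 1.0, 4: 0.1}
f = lambda S: sum(weights[t] for t in S)
print(sorted(greedy_maximize(f, list(weights), 2)))  # prints [2, 3]
```

The point above is that the empirical value of \texttt{f} on the selected set
is what matters in applications, while the $\delta$-based bound only certifies
a worst case for this loop.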
\paragraph{Experiment 5.2.} It is not clear how $\delta$-approximate
submodularity was used in Experiment 5.2: reading the text, it seems that the
authors simply performed the task by using a greedy heuristic to maximize the
minimum eigenvalue of the Gramian; this objective function is not submodular.
As such, the experiment does not seem to illustrate a point relevant to this
paper (it merely shows that greedy selection on this non-submodular objective
performs better than random selection).

\paragraph{Suggestions to rewrite Section 3.} Following some of the above
comments, I believe that Section 3 would benefit a lot from being rewritten in
the following way:
\begin{enumerate}
  \item restate Lemma 1 as follows: ``if $f$ is $\delta$-submodular, then
    $\gamma_f\geq \frac{1-\delta}{1+\delta}$''. This is equivalent to, but
    much clearer than, the current statement. Also, as a more minor comment,
    the proof of Lemma 1 does not need to be by contradiction; a direct proof
    would be clearer:
    \begin{displaymath}
      1 \leq \min_{S,T}\frac{\sum_{t\in T\setminus S} g_S(t)}{g_S(T)}
      \leq \frac{1+\delta}{1-\delta}\min_{S,T}\frac{\sum_{t\in T\setminus
      S} f_S(t)}{f_S(T)}
      = \gamma_f\cdot\frac{1+\delta}{1-\delta},
    \end{displaymath}
    where the first inequality is by submodularity and the rest is by
    definition;
  \item extract the second half of the proof of Theorem 3 in the appendix as
    the following Lemma 2: ``let $f$ be monotone and $g$ be monotone
    submodular such that $d(f, g)\leq \delta$; then $\alpha_G(f)\leq
    1 - \frac{1-\delta}{1+\delta}\big(1-\alpha_T(g)\big)$'';
  \item present Theorem 2 with the simplified proof using the improved notion
    of greedy curvature;
  \item present Theorem 3 as a corollary of Theorem 2, by combining Lemma
    1 and Lemma 2 from the first two points: if $d(f, g)\leq \delta$, then we
    have a lower bound on the submodularity ratio and an upper bound on the
    curvature that we can plug into Theorem 2;
  \item remove all mentions of the ROS: the whole benefit of this notion can
    instead be summarized by the following remark, which could go below
    Theorem 3: ``the bound of Theorem 3 shows that if the same function $f$
    can be $\delta$-approximated by multiple submodular functions, then one
    should favor the one with the smallest curvature''.
\end{enumerate}

\paragraph{Minor comments on the writing.} There are many statements of the
form ``$f$ can be represented as/by a $\delta$-approximate submodular
function'', but the definition of a $\delta$-approximate submodular function
already contains the fact that $f$ is represented by a function, so I think it
would be cleaner to say ``$f$ is $\delta$-approximate submodular'' (or to make
the representation explicit when needed and say ``let $g$ be a submodular
function such that $d(f, g) \leq \delta$''). Furthermore, I noticed
a non-trivial number of grammatical mistakes while reading and would suggest
doing more proofreading passes.

\paragraph{Conclusion.} While I believe that the notion of
$\delta$-submodularity is an interesting complement to the existing notion of
submodularity ratio, I think the paper would benefit a lot from a significant
rework: both a rewrite of Section 3 as suggested above, and a revision of the
experimental section to fix the aforementioned problems. I also believe that
the contributions of this paper, while valuable, are a bit too incremental (in
particular compared to Bian et al., 2017) to justify publication in JMLR: the
suggested rewrite of Section 3 makes it apparent that Theorem 3 is a corollary
of Theorem 2 and does not provide a tighter bound.
\end{document}
