\documentclass[12pt]{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage[hmargin=1.2in, vmargin=1.2in]{geometry}
\title{Review of \emph{Approximate Submodular Functions and Performance Guarantees}}
\date{}
\begin{document}
\maketitle
\vspace{-3em}

\paragraph{Summary.} This paper studies $\delta$-approximate submodular functions: these are functions $f$ for which there exists a submodular function whose ``marginal gains'' approximate the marginal gains of $f$ within a multiplicative factor of $(1\pm\delta)$. The contributions are:
\begin{itemize}
\item this notion of approximate submodularity is related to the existing notion of ``submodularity ratio'': there is a necessary condition on $\delta$ (involving the submodularity ratio) for a function to be $\delta$-approximate submodular.
\item an approximation bound on the value attained by the greedy algorithm when maximizing a $\delta$-approximate submodular function is derived.
\item two functions are shown to be $\delta$-approximately submodular: the trace of the inverse Gramian matrix of the observations (an important quantity in A-optimal experiment design) and the minimum eigenvalue of the Gramian.
\item an experiment shows that the approximation bound derived for $\delta$-submodular functions compares favorably to the one of (Bian et al., 2017), which uses the notions of submodularity ratio and curvature.
\item a second experiment shows how to use the notion of $\delta$-submodularity to solve a sensor selection problem.
\end{itemize}

\paragraph{General comments.} The general topic of approximate submodularity is becoming increasingly relevant to explain why the greedy heuristic often works well when maximizing non-submodular functions. An important line of work has focused on the notion of submodularity ratio to measure the divergence from submodularity. The notion of $\delta$-approximate submodularity introduced in this paper offers an interesting new perspective on the same question. This definition is justified by the fact that the authors were able to identify two functions satisfying it. Furthermore, this notion has the following advantage over the submodularity ratio: once a submodular function $g$ has been identified that approximates a given non-submodular function $f$, one can obtain the approximation guarantee of the greedy algorithm simply by looking at properties of $g$, which is typically easier than computing the submodularity ratio (which takes exponential time). However, both the theoretical and the experimental results, as well as the exposition, suffer from inaccuracies and gaps that I now describe.

\paragraph{ROS is not a characterization.} The authors claim that the Region of Submodularity (ROS) characterizes when a function is $\delta$-submodular. However, Lemma 1 only gives a \emph{necessary} condition on $\delta$ for a function to be $\delta$-submodular. A complete characterization would also require showing that for any $f$ and $\delta$ satisfying the bound of Lemma 1, there indeed exists a submodular function which $\delta$-approximates $f$; but this fact seems false to me, or at least hard to prove. For this reason, I believe that the notion of region of submodularity is not very relevant and is a complicated way of expressing that $\delta$-submodularity implies a lower bound on the submodularity ratio. This fact is important and interesting, but I do not believe that the ROS is the best way to express it (more on this below).
\paragraph{The bound of Theorem 3 is not tighter than the bound of Theorem 2.} The authors claim at several points in the paper that the performance guarantee in Theorem 3 is tighter than the one in Theorem 2; illustrating this is in fact the purpose of the experiment in Section 5.1. However, I believe that the bound in Theorem 3 is always worse than the one in Theorem 2; this becomes apparent by looking at the proofs of both theorems. The proofs are nearly identical, but Theorem 3 uses $\delta$-submodularity to obtain (1) a lower bound on the submodularity ratio and (2) an upper bound on the curvature. Since the performance guarantee in Theorem 2 is increasing in the submodularity ratio and decreasing in the curvature, the guarantee in Theorem 3 is implied by the one in Theorem 2 (see also my comments below on how to improve the exposition). As a consequence, I am puzzled by the plot in Section 5.1, since it seems to contradict the theory. Very few details are given on how the submodularity ratio and the curvature were computed; perhaps they were not computed properly?
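To make the comparison concrete, here is the monotonicity argument written out. For illustration only, assume that Theorem 2 takes the same form as the guarantee of (Bian et al., 2017),
\begin{displaymath}
f(S^{\mathrm{greedy}}) \;\geq\; \frac{1}{\alpha}\left(1 - e^{-\alpha\gamma_f}\right) f(S^{\star}),
\end{displaymath}
where $\gamma_f$ is the submodularity ratio and $\alpha$ the curvature (the exact expression in the paper may differ, but only its monotonicity matters here). The right-hand side is non-decreasing in $\gamma_f$ and non-increasing in $\alpha$. Replacing $\gamma_f$ by the lower bound $\frac{1-\delta}{1+\delta}$ implied by $\delta$-submodularity, and $\alpha$ by the corresponding upper bound, can therefore only weaken the guarantee; any bound obtained in this way, in particular the one of Theorem 3, is no tighter than Theorem 2 evaluated at the true $\gamma_f$ and $\alpha$.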
\paragraph{Computing the performance guarantee does not seem relevant.} The authors make the following good observation: it is much easier to compute the performance guarantee given by $\delta$-submodularity than to compute the parameters $\alpha$ and $\gamma$. So even if the performance guarantee is not as tight, once a $\delta$-approximation has been found, it provides a convenient way to compute a lower bound on the approximation guarantee. However, being able to compute the approximation bound does not seem very relevant in practice: what truly matters is how well the greedy algorithm actually performs, not the (worst-case) lower bound provided by the theory.

\paragraph{Experiment 5.2.} It is not clear how $\delta$-approximate submodularity was used in Experiment 5.2: reading the text, it seems that the authors simply performed the task using a greedy heuristic to maximize the minimum eigenvalue of the Gramian, an objective function that is not submodular. As such, the experiment does not seem to illustrate a point relevant to this paper (it simply shows that greedy selection on this non-submodular objective performs better than random selection).

\paragraph{Suggestions to rewrite Section 3.} Following some of the above comments, I believe that Section 3 would benefit greatly from being rewritten in the following way:
\begin{enumerate}
\item restate Lemma 1 as follows: ``If $f$ is $\delta$-submodular, then $\gamma_f\geq \frac{1-\delta}{1+\delta}$''. This is equivalent to, but much clearer than, the current statement. Also, as a more minor comment, the proof of Lemma 1 does not need to be by contradiction; a direct proof would be clearer:
\begin{displaymath}
1 \leq \min_{S,T}\frac{\sum_{t\in T\setminus S} g_S(t)}{g_S(T)} \leq \frac{1+\delta}{1-\delta}\min_{S,T}\frac{\sum_{t\in T\setminus S} f_S(t)}{f_S(T)} = \gamma_f\cdot\frac{1+\delta}{1-\delta},
\end{displaymath}
where the first inequality is by submodularity of $g$ and the rest is by definition.
\item extract the second half of the proof of Theorem 3 in the appendix as the following Lemma 2: ``let $f$ be monotone and $g$ be monotone submodular such that $d(f, g)\leq \delta$; then $\alpha_G(f)\leq 1 - \frac{1-\delta}{1+\delta}\big(1-\alpha_T(g)\big)$''.
\item present Theorem 2 with the simplified proof using the improved notion of greedy curvature.
\item present Theorem 3 as a corollary of Theorem 2 by combining Lemma 1 and Lemma 2 from the first two points: if $d(f, g)\leq \delta$, then we have a lower bound on the submodularity ratio and an upper bound on the curvature that we can plug into Theorem 2.
\item remove all mentions of the ROS: the whole benefit of this notion can instead be summarized by the following remark, which could go below Theorem 3: ``The bound of Theorem 3 shows that if the same function $f$ can be $\delta$-approximated by multiple submodular functions, then one should favor the one with the smallest curvature''.
\end{enumerate}

\paragraph{Minor comments on the writing.} There are many statements of the form ``$f$ can be represented as/by a $\delta$-approximate submodular function'', but the definition of a $\delta$-approximate submodular function already contains the fact that $f$ is represented by another function, so I think it would be cleaner to say ``$f$ is $\delta$-approximate submodular'' (or to make the representation explicit when needed and say ``let $g$ be a submodular function such that $d(f, g) \leq \delta$''). Furthermore, I noticed a non-trivial number of grammatical mistakes as I was reading, and I would suggest doing more proofreading passes.

\paragraph{Conclusion.} While I believe that the notion of $\delta$-submodularity is an interesting complement to the existing notion of submodularity ratio, I think the paper would benefit greatly from a significant rework: both a rewrite of Section 3 as suggested above, and a revision of the experiment section to fix the aforementioned problems. I also believe that the contributions of this paper, while valuable, are a bit too incremental (in particular compared to Bian et al., 2017) to justify publication in JMLR: the suggested rewrite of Section 3 makes it apparent that Theorem 3 is a corollary of Theorem 2 and does not provide a tighter bound.
\end{document}