From 98fb1b781dd89357fa36f7835d96749dcae4da4f Mon Sep 17 00:00:00 2001
From: Thibaut Horel
Date: Tue, 15 Mar 2016 17:05:57 -0400
Subject: Add ICML 2016 review

---
 icml-2016-1111.xml | 130 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 130 insertions(+)
 create mode 100644 icml-2016-1111.xml

diff --git a/icml-2016-1111.xml b/icml-2016-1111.xml
new file mode 100644
index 0000000..879d86b
--- /dev/null
+++ b/icml-2016-1111.xml
@@ -0,0 +1,130 @@

Summary:

This paper studies the problem of optimizing an approximately convex
function, that is, one which is within an additive delta of some convex
function. For a given accuracy epsilon, the goal is to obtain a solution
whose value is within an additive epsilon of the optimal value, in time
polynomial in the dimension d and in 1/epsilon. The paper considers
zero-order optimization, in which the function is only accessed through
value queries (for example, it is not assumed that the gradient can be
computed; it might not even exist, since the approximately convex function
could be discontinuous).

It is intuitively clear that the problem becomes harder as delta grows
larger compared to epsilon. More precisely, the goal is to find a threshold
function T such that when delta = O(T(epsilon)) the problem is solvable in
polynomial time, and when delta = Omega(T(epsilon)) it is not (the problem
is always solvable in exponential time by considering a square grid of width
1/epsilon).

The authors show that this threshold is T(epsilon) = max(epsilon^2/sqrt{d},
epsilon/d); note that the two terms cross at epsilon = 1/sqrt{d}, the first
dominating for larger epsilon and the second for smaller. More precisely:

* They provide an information-theoretic lower bound showing that when delta
= Omega(T(epsilon)) (up to logarithmic factors), no algorithm making
polynomially many evaluations of the function can optimize it to precision
epsilon. The lower bound relies on the careful construction of a family of
functions defined on the unit ball which behave like ||x||^{1+alpha} unless
x lies in a small angle around a randomly chosen direction. In this small
angle the function can take significantly smaller values, but with very high
probability an algorithm never evaluates the function inside it (a toy
numerical sketch of such a function is given right after these bullets).

* They give an algorithm which provably finds an epsilon-approximate
solution in the regime where delta = Omega(epsilon/d) and delta =
O(epsilon^2/sqrt{d}). Together with a previous algorithm from Belloni et al.
covering the regime delta = O(epsilon/d), this completes the algorithmic
upper bound. Their algorithm uses a natural idea from Flaxman et al., where
the gradient of the underlying convex function at some point x is estimated
by sampling points in a ball around x; the algorithm is then a gradient
descent using this estimated gradient (a sketch of this estimator also
follows below). The analysis relies on showing that even with
a delta-approximately convex function, this way of estimating the gradient
still provides a sufficiently good descent direction.
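To make the lower-bound construction concrete, here is a toy numerical
sketch (my own, not the authors') of a function of this type; the exponent
alpha, the cone half-width, and the size of the dip are illustrative
placeholders rather than the paper's actual parameters:

    import numpy as np

    rng = np.random.default_rng(0)
    d = 50                # ambient dimension
    alpha = 0.5           # exponent in ||x||^{1+alpha} (illustrative)
    half_width = 0.05     # angular half-width of the hidden cone (illustrative)

    # Hidden direction: the cone around it has measure exponentially small
    # in d, so polynomially many queries essentially never land inside it.
    w = rng.standard_normal(d)
    w /= np.linalg.norm(w)

    def hard_function(x):
        # ||x||^{1+alpha} everywhere, except inside the narrow cone around
        # w, where the function dips to smaller values (a hidden minimum).
        # The dip used here is a placeholder, not the paper's formula.
        r = np.linalg.norm(x)
        if r == 0.0:
            return 0.0
        if np.dot(x, w) / r > np.cos(half_width):  # inside the hidden cone
            return r ** (1 + alpha) - 0.5 * r
        return r ** (1 + alpha)

Since the inner product of a random unit query with w concentrates around 0
with fluctuations of order 1/sqrt{d}, any algorithm making polynomially many
evaluations misses the cone (and hence the hidden minimum) with overwhelming
probability; this is what drives the lower bound.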
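The gradient-estimation step borrowed from Flaxman et al. can be sketched in
the same spirit. The single-point estimator below rests on the identity
that, for u uniform on the unit sphere, E[(d/r) f(x + r u) u] is the
gradient of a ball-smoothed version of f; the radius, step size, and sample
count are illustrative, and the paper's actual sampling scheme and analysis
differ in the details:

    import numpy as np

    def estimate_gradient(f, x, radius, num_samples, rng):
        # Average (d/radius) * f(x + radius * u) * u over random unit
        # vectors u: in expectation this is the gradient of f smoothed over
        # a ball of the given radius, hence a usable descent direction even
        # when f is only delta-approximately convex.
        d = x.shape[0]
        g = np.zeros(d)
        for _ in range(num_samples):
            u = rng.standard_normal(d)
            u /= np.linalg.norm(u)
            g += (d / radius) * f(x + radius * u) * u
        return g / num_samples

    def zeroth_order_descent(f, x0, radius, step, num_iters, num_samples, rng):
        # Plain gradient descent driven by the estimated gradients.
        x = x0.copy()
        for _ in range(num_iters):
            x = x - step * estimate_gradient(f, x, radius, num_samples, rng)
        return x

A two-point variant, averaging (f(x + radius * u) - f(x)) * u instead,
would reduce the variance of the estimate; the single-point form above is
the original scheme of Flaxman et al.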
Clarity: Above Average
(scale: Excellent (Easy to follow) / Above Average / Below Average / Poor
(Hard to follow))

Comments on clarity:

The paper is very clearly written and the overall structure is easy to
follow. A lot of care was taken in precisely stating the propositions and
theorems, so that the dependence of the bounds on all the parameters can
easily be tracked.

There are two places where the paper could have benefited from more
explanation:

* In Construction 4.1, it is not clear why \tilde{h} should be replaced by
the lower convex envelope of \tilde{h}, especially since \tilde{h} itself is
already convex.

* At the beginning of the proof of Lemma 5.1, the argument for why the
curvature of the boundary of K can be assumed finite w.l.o.g. is not
immediate to me.

Impact: Excellent (substantial, novel contribution)
(scale: Excellent (substantial, novel contribution) / Above Average / Below
Average / Poor (minimal or no contribution))

Comments on impact:

This paper makes a significant contribution to the question of zero-order
optimization of approximately convex functions by essentially closing the
gap between information-theoretic lower bounds and algorithmic upper bounds.

What was previously known:

* a tight epsilon/sqrt{d} threshold in the case where the underlying convex
function is smooth (the upper bound coming from Dyer et al. and the lower
bound coming from Singer et al.);

* an epsilon/d algorithmic upper bound for general (non-smooth) functions
from Belloni et al.

The construction of the information-theoretic lower bound is novel and
non-trivial. While the algorithm is inspired by Flaxman et al., its analysis
for approximately convex functions is novel.

Detailed comments:

As already stated, the paper makes a substantial and novel contribution to
the field of zero-order optimization of approximately convex functions
(which, as the authors point out, had very few theoretical guarantees until
recently).

As far as I was able to verify, the results are correct. I think my comments
about clarity should be addressed for the camera-ready version, but I do not
believe they affect the validity of the results. Overall, I strongly support
this paper.

Typo on line 297: I think it should read \tilde{f}(x) = f(x) otherwise
(instead of \tilde{f}(x) = x).

Overall rating: Strong accept
(scale: Strong accept / Weak accept / Weak reject / Strong reject)

Reviewer expertise: Reviewer is knowledgeable
(scale: Reviewer is an expert / Reviewer is knowledgeable / Reviewer's
evaluation is an educated guess)

--
cgit v1.2.3-70-g09d2