summaryrefslogtreecommitdiffstats
path: root/nips-2016-2230.txt
diff options
context:
space:
mode:
authorThibaut Horel <thibaut.horel@gmail.com>2016-07-18 00:39:35 -0400
committerThibaut Horel <thibaut.horel@gmail.com>2016-07-18 00:39:35 -0400
commit5550e00c788050053a01cae9d5ab40219ff03a67 (patch)
tree70101d13a01c2f70e5b8235d6b9eb7ca6bef8530 /nips-2016-2230.txt
parent072b6f30503223c130d9cd6679d11cf74b3bf003 (diff)
downloadreviews-5550e00c788050053a01cae9d5ab40219ff03a67.tar.gz
NIPS 2016 reviews (only one good paper)
Diffstat (limited to 'nips-2016-2230.txt')
-rw-r--r--nips-2016-2230.txt56
1 files changed, 56 insertions, 0 deletions
diff --git a/nips-2016-2230.txt b/nips-2016-2230.txt
new file mode 100644
index 0000000..ea73164
--- /dev/null
+++ b/nips-2016-2230.txt
@@ -0,0 +1,56 @@
+Summary
+=======
+
+In this paper, the authors study the streaming submodular cover problem: given
+a monotone submodular function f and a target utility Q, the goal is to find
+a set S of minimal size such that f(S) is at least Q. The setting is the one of
+streaming algorithms where the goal is to perform a small (constant) number of
+passes over the data stream, and use sublinear memory (i.e the algorithm cannot
+simply store the entire stream of data).
+
+The first result is an impossibility result: any non-trivial approximation
+guarantee (better than m/2 where m is the size of the stream) requires linear
+memory.
+
+This first result motivates the formulation of a bi-criteria formulation of the
+streaming submodular cover problem: find a set of size at most delta times
+the optimal set which gets a fraction (1-epsilon) of the optimal solution. The
+authors provide an algorithm which provably obtains this, with delta
+= 1/epsilon.
+
+Comments
+========
+
+The paper is well-written and the results are correct as far as I checked (I
+didn't read the proofs in the appendix but checked the ones in the main body of
+the paper).
+
+The experiments are interesting and illustrates well both the scalability of
+the approach and the trade-off introduced by the parameter alpha in the
+algorithm.
+
+My main concern is a conceptual interrogation regarding the computational
+model: I think it is hard to formulate a sound computation model for streaming
+optimization of arbitrary submodular functions. Indeed, since these functions
+can have exponential size representations in the general case, the authors
+assume access to a value oracle (as is standard in the offline setting). It is
+not clear to me what such oracles imply in the streaming setting: for example,
+the algorithm could cheat and use value queries to avoid having to store items.
+The exact interplay between the value queries and the memory limitation
+introduced by the streaming model should be properly discussed.
+
+For specific examples (like submodular set cover which was previously studied
+in the streaming setting, or the ones discussed in the experiments), this
+problem does not exist since the function has an explicit representation which
+can be computed as the strem of elements is received, but I think the general
+case requires a more in-depth discussion of the computational model.
+
+Finally, while the lower-bound is novel and technically interesting, the
+algorithm and its analysis seem to be an incremental variation on a well-known
+technique: it has already been observed in other settings (mapreduce submodular
+optimization for example) that the following simple threshold-based algorithm
+gives constant approximation:
+
+for decreasing values of the threshold (divide by a constant at each iteration)
+include in the solutions all elements whose marginal contribution is above the
+threshold)