From 124e5e7e441956dad1ceb9dc8987808b8f522d62 Mon Sep 17 00:00:00 2001
From: Thibaut Horel
Date: Sat, 16 May 2015 17:32:37 +0200
Subject: First set of reviews
---
 kdd-2015-706.txt | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)
 create mode 100644 kdd-2015-706.txt

kdd-2015-706.txt

1. What is your confidence in rating this paper?

Medium - knowledgeable on this topic

2. How would you rate the novelty of the problem solved in this paper?

A minor variation of some well studied problems

3. How would you rate the technical ideas and development in this paper?

The technical development is incremental without fundamental contributions

4. How would you rate the empirical study conducted in this paper?

Acceptable, but there is room for improvement

5. Repeatability: are the data sets used publicly available (and thus the experiments may be repeated by a third party)? (Required, Visible To Authors Only After Decision Notification)

Yes

6. How would you rate the quality of presentation?

Well-written but has a significant number of typos and/or grammatical errors. Would need significant editing before publication.

7. What is your overall recommendation?

Weak reject. I vote for rejecting it, although would not be upset if it were accepted.

8. Should the Best Paper Committee consider this paper for an award?

No

9. List up to 3 particular strengths of the paper. If none, just say "none".

* The distinction between edge-level feedback and node-level feedback is an important one, and it is good that the authors point it out.
* The application of Zinkevich's result on online optimization, showing that learning the influence probabilities can be done in an online manner.
* The experiment showing that strategic exploration leads to faster learning of the influence probabilities.

10. List up to 3 particular weaknesses of this paper. If none, just say "none".

* The bound in Theorem 2 is stated in terms of the likelihood. In contrast to Theorem 1, it is not clear how one would combine this result with existing regret guarantees to obtain guarantees in the specific case where the maximum likelihood approach is used.
* No formal guarantee is given for the strategic exploration heuristic.
* No guarantees on the regret are given, and it is not obvious how to adapt the results of Chen et al. to obtain them under the node-level feedback assumption.

11. Detailed comments for the authors; justification of your overall rating.

This paper analyzes the problem of influence maximization in the Independent Cascade model when the influence probabilities are unknown. The authors frame the problem using the CMAB paradigm: at each time step the experimenter chooses a set of nodes to seed, observes the spread of influence for this seed choice, and updates her knowledge about the unknown parameters. Two goals are studied: minimizing the regret (the difference from the approximately optimal influence achievable if the parameters were known) and simply learning the unknown parameters as quickly as possible.

The application of the CMAB paradigm to influence maximization was already noted in [8, 9]. The key difference from [8, 9] is that the authors make the more realistic assumption that the experimenter does not observe which edges were active or dead during a cascade. Only the times at which nodes became active are known, which is both more realistic and closer to the standard assumptions in the graph inference literature [10, 16, 18, 27].
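To make the distinction concrete for my own reading, here is a minimal sketch of one independent-cascade simulation and of the two kinds of feedback it produces. This is my own illustration, not code from the paper; the graph representation and the names (out_neighbors, prob, active_at, live_edges) are my own choices.

    import random

    def simulate_ic(out_neighbors, prob, seeds):
        """Run one independent cascade; return node-level and edge-level feedback."""
        active_at = {s: 0 for s in seeds}   # node-level feedback: activation times
        live_edges = set()                  # edge-level feedback: edges that fired
        frontier, t = set(seeds), 0
        while frontier:
            t += 1
            newly_active = set()
            for u in frontier:                        # u became active at time t - 1
                for v in out_neighbors.get(u, ()):
                    if v in active_at:                # v was already active earlier
                        continue
                    if random.random() < prob[(u, v)]:
                        live_edges.add((u, v))        # edge (u, v) fired
                        newly_active.add(v)
            for v in newly_active:
                active_at[v] = t
            frontier = newly_active
        return active_at, live_edges

Edge-level feedback as in [8, 9] would reveal live_edges (and the attempts out of active nodes that failed), whereas node-level feedback reveals only active_at.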
In order to apply the CMAB paradigm under this assumption, the authors suggest two ways to update the estimates of the edge weights using the observed times at which nodes became active. In the first one, an edge is chosen at random among the incoming edges of a node that has just become active and is treated as the "correct" active edge. The authors give a bound on how the estimates derived from this method differ from the estimates that would have been obtained had the experimenter observed the active edges directly. In the second approach, the authors use the maximum likelihood approach of [27] to learn the weights from the observations. They apply the well-known online optimization result of Zinkevich [33] to give a bound on the learning rate when the updates to the estimates are done in an online manner.
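To check my understanding, here is roughly how I read the first update rule. The sketch below is my own rendering, not the authors' algorithm: the success/failure counters, and the restriction of the random choice to parents that were active at the previous time step, are my guesses at the intended bookkeeping.

    import random

    def credit_assignment(in_neighbors, active_at):
        """Attribute each observed activation to one randomly chosen incoming edge.

        active_at maps nodes to activation times (the node-level feedback).
        Returns per-edge success/failure counts for this single cascade;
        successes / (successes + failures) would be the frequentist estimate.
        """
        successes, failures = {}, {}
        for v, t in active_at.items():
            if t == 0:
                continue  # seed nodes were not influenced by anyone
            # parents of v that were active at the previous time step could have
            # caused the activation; one of them is credited uniformly at random
            candidates = [u for u in in_neighbors.get(v, ())
                          if active_at.get(u) == t - 1]
            if not candidates:
                continue
            chosen = random.choice(candidates)
            successes[(chosen, v)] = successes.get((chosen, v), 0) + 1
            for u in candidates:
                if u != chosen:  # my guess: the other candidate edges count as failed attempts
                    failures[(u, v)] = failures.get((u, v), 0) + 1
        return successes, failures

With edge-level feedback one would know which of the candidate edges actually fired; as I understand it, Theorem 1 bounds how far the estimates obtained from this random attribution are from the edge-level ones.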
The "node-level feedback" assumption seems to be the right one, but I have some concerns about the results of Section 3 in relation to Section 5. Since (close to optimal) bounds on the regret are known under the "edge-level feedback" assumption, one goal of this paper should be to obtain clear bounds on the regret under the "node-level feedback" assumption. One way to obtain such bounds would be to show how the estimates obtained from node-level feedback differ from the estimates obtained from edge-level feedback. In this respect, Theorem 1 is satisfying, even though it is not shown how to use it to obtain a bound on the regret. Theorem 2 is a direct application of Zinkevich's result, and while it is interesting in itself, it is not obvious how to use it to obtain bounds on the regret: the quality of the estimate is compared to the true value and not to the estimate obtained from edge-level feedback. Furthermore, the objective function is the log-likelihood, which does not seem to play well with the kind of computations needed when bounding the regret. Also, it would have been nice to give a closed-form formula for the gradient of L, to get a better sense of the computational implications of the MLE approach. Finally, I think the bound G on the gradient of L used in Theorem 2 could be expressed in terms of other parameters (p_min, p_max, etc.); this would make the theorem more interesting than a direct application of the result from Zinkevich's paper.

The section on network exploration is slightly confusing: in this section, the goal is simply to learn the parameters as efficiently as possible. The way the section is framed (randomized vs. strategic exploration) is simply a rephrasing of two classical learning problems: online learning when iid samples are given to the observer, and online active learning when the observer can choose which data points to observe. As such, the result on the "random exploration strategy" is simply a rephrasing of a result on the learning rate with iid samples. In some sense, this result is weaker than results in the graph inference literature, since in this paper the graph is assumed to be known, which is usually not the case in network inference. The strategic exploration heuristic is interesting, but unfortunately no formal guarantees are provided.

The experimental section seemed good to me overall. My only concern is the choice of the influence probabilities for the ground truth: only the weighted cascade model was tested. It would be interesting to know whether or not the results are robust to other models. Also, the fact that, on average, nodes which became active had 1.175 active parents at the previous time step undermines the importance of the node-level feedback assumption (if nodes have only one active parent, then edge-level feedback is equivalent to node-level feedback); experiments with higher probabilities (to force more active parents) would be interesting.

12. List of typos, grammatical errors and/or concrete suggestions to improve presentation.

Section 3, Online MLE Learning: "at each round we pick a seed set at random, ...": this is actually not required for the result to hold (as explained below Theorem 2). I think it would be clearer to defer the part about choosing a random seed set to Section 4.1.

Section 4:

* As stated above, I think it would be clearer to write this section in terms of iid samples versus active learning rather than random vs. strategic exploration.

* Algorithm 1 uses UPDATE, which is only defined on the next page. At this point it is not clear what M is, which I found confusing.