In this paper, the authors consider a variant of the Influence Maximization problem in which: * the network's nodes can be in one of three states: "inactive", "aware" and "adopt" (as opposed to two states in many previous information spread models). Aware nodes can propagate the infection even if they do not end up adopting the product. The transitions from inactive to aware, and from aware to adopt are modeled similarly to the Linear Threshold Model. * the network's nodes have attributes from which their propensity to adopt the product being marketed can be computed (via a linear relationship). The transition thresholds from the aware to the adopt state depend on this propensity. As in the original Influence Maximization Problem, the goal is to select a seed set (of size at most k) such as to maximize the total number of nodes in the "adopt" state when the epidemic stop propagating. The authors propose a heuristic based on an "influence potential" score and discounting to account for possible overlaps between the neighborhoods of the selected nodes. The heuristic is evaluated experimentally. Model ===== The distinction between aware and adopt is interesting: it captures the "five stages of adaption" [Beal and Bohlen 1957] more realistically than simpler models and could also be applied to epidemiology where an individual might carry a disease without being infected. However, the following inaccuracies and mis-specifications undermine the strength of the model: 1. The transition from inactive to aware is controlled by a threshold drawn uniformly at random in the interval [0, 1]. As a consequence, given a seed set A, the number of "adopted" nodes when the cascade stops propagating is a random variable. It is not clear whether or not \sigma(A) in the objective function (equation (1)) is this number (the random variable) or its expectation. Even though it is customary to consider the expectation (see for example [Kempe et al. 2003]), other parts of the paper seem to consider the random variable instead. See comment 8. below. 2. Dividing equation (4) both sides by d_v, we see that awareness is modeled exactly as in the Linear Threshold Model with uniform probabilities. There is no justification to why the edge weights (w(x, v)) are not used at this step, as is done in equation (5). 3. The distinction between target nodes and non-target nodes seem unnecessary: nodes with a cumulative profit equal to 0 will never transition to the "adopt" state as can be seen in equation (5) and thus will never increase the influence spread score. Thus, the same model describes the transitions of both target and non-target nodes. Furthermore, the term "target node" is misguiding since it leads the reader to believe that the campaign will actively target them, whereas they passively follow the propagation dynamics governed by equation (4) and (5). 4. It should be clearly specified that the model is discrete time: at each time step nodes transition from one state to the other simultaneously according to equation (4) and (5). 5. It is not clear why usefulness is a number in the interval [0, 1]. Normalization at this step seems unnecessary since the cumulative profits are then normalized to the range [0, 1] (by dividing by the max value). 6. 3.2.2 (ii) heterogeneous influence probabilities: this paragraph seems out of scope: it is important for the model to have a weighted graph. The fact that (depending on the applications) the weights could be computed from additional information does not appear to be relevant for the model. Furthermore, the experiments at the end do not make use of this. 7. Similarly, the relevance of section 3.2.1 (labels and usefulness) is questionable. The only thing that the model truly requires is threshold to put on the right-hand side of equation (5): the "propensity" to adopt the product. The fact that this propensity could be computed from information available about the nodes is application-dependent and would be better located in works focusing on learning the propensity score from the nodes' attributes. Overall, the model would benefit from being specified more rigorously and reduced to its "core", leaving the questions of modeling and learning its parameters from node attributes to future works. Algorithm ========= The suggested heuristic is intuitive and seems well-suited to the proposed TAM model, as is confirmed by the experiments. However: 8. Algorithm 2, line 25: "v becomes newly adopted" is somewhat ill-defined: as explained in comment 1. above, the set of "adopt" nodes when the cascade stops propagating is random. This line seems to indicate that this is not the case, and that the thresholds for awareness transitions are fixed once and for all. This is inconsistent with the experiments, where it seems that the authors revert back to considering the expectation (computed by sampling and averaging). 9. Complexity analysis: the running time of computing the test on line 25 in Algorithm 2 does not appear in the time complexity. A naive algorithm would take linear time every time line 25 is executed; could it be done more efficiently? 10. Complexity analysis: using a max-heap Fibonacci heap (which seems required since we need to repeatedly call ExtractMax), DecreaseKey takes O(log N) times (*IncreaseKey* would take O(1) time) and ExtractMax takes O(log N) time, yielding k(log N + d^2 log N) for the second term. 11. Complexity analysis: the simplification when d << N only seems valid when d^2 = o(log N). Stating this explicitly would be more accurate than d << N. Experiments =========== It is interesting to see that the suggested heuristic works better than other generic methods, which justifies the effort of designing a method tailored to the TAM model. However, I have the following concerns about the methodology used for the experiments: 12. "optimal value" of alpha. It is not said which criterion is used to define optimality. At the bottom of page 14, it is mentioned that some values of alpha overestimate or underestimate the influence spread. This suggests the existence of a ground truth to calibrate the alpha parameter, but no mention of this ground truth can be found. 13. as mentioned in comments 1. and 8. above, computing the influence spreads by averaging over 1000 iterations would be consistent with using the expected number of "adopt" nodes in the objective function. However, this appears to be inconsistent with Algorithm 2. 14. It would have been interesting to analyze how quickly the average of multiple iterations converges to a stable value. I would expect it to depend on the size of the network. However, a constant value of 1000 is used throughout the enter experimental section. 15. Table 2: comparison of Linear Treshold and TAM models is confusing. Why compare the same heuristic (designed for TAM) on two different models? Is obtaining larger numbers better? If this is the case, choosing a model where nodes become aware and adopt with probability 1 would produce even larger numbers. Style ===== The paper is well-structured, making the high-level progression easy to follow. However, the paper suffers locally from: * many typos, which could be fixed by a serious proofreading pass. * a few sentences whose construction render the intended meaning hard to grasp. The most extreme examples are listed below: - page 2, line 29 to 31 - page 5, line 49 - page 17, line 41 to 43 * the use of parenthesis in equation (4) and (5) to denote a multiplication, which is unconventional and somewhat confusing. The paper appears seems to have been put together very hastily. Even if all the comments I have made above were to be addressed, I wouldn't be convinced that the contributions of the paper (a variant of already existing models and heuristic evaluated experimentally) are significant enough to justify an acceptance in TEAC.