authorThibaut Horel <thibaut.horel@gmail.com>2020-08-31 17:34:19 -0400
committerThibaut Horel <thibaut.horel@gmail.com>2020-08-31 17:34:19 -0400
commit5dd3081df869759d077c7efee92b86388683fa69 (patch)
tree66eb429514fee07258701526d9cac1efc2158fb2
parentf3234dc87d46644234b6114b93fd1e92eddf289f (diff)
downloadreviews-5dd3081df869759d077c7efee92b86388683fa69.tar.gz
Add JMLR 20 review
-rw-r--r--Makefile16
-rw-r--r--jmlr-20-208.bib10
-rw-r--r--jmlr-20-208.tex182
3 files changed, 208 insertions, 0 deletions
diff --git a/Makefile b/Makefile
new file mode 100644
index 0000000..3082316
--- /dev/null
+++ b/Makefile
@@ -0,0 +1,16 @@
+PAPER = jmlr-20-208
+TEX = $(wildcard *.tex)
+BIB = jmlr-20-208.bib
+
+.PHONY: all clean watch FORCE
+
+all: jmlr-20-208.pdf
+
+%.pdf: FORCE
+ latexrun -W no-xcolor $*.tex
+
+clean:
+ latexrun --clean-all
+
+watch:
+ watchman-make -p '**/*.tex' '*.bib' -t all
diff --git a/jmlr-20-208.bib b/jmlr-20-208.bib
new file mode 100644
index 0000000..fbd0dc9
--- /dev/null
+++ b/jmlr-20-208.bib
@@ -0,0 +1,10 @@
+@article{A18,
+ author = {Emmanuel Abbe},
+ title = {Community Detection and Stochastic Block Models: Recent Developments},
+ journal = {Journal of Machine Learning Research},
+ year = {2018},
+ volume = {18},
+ number = {177},
+ pages = {1-86},
+ url = {http://jmlr.org/papers/v18/16-480.html}
+}
diff --git a/jmlr-20-208.tex b/jmlr-20-208.tex
new file mode 100644
index 0000000..b7d09d9
--- /dev/null
+++ b/jmlr-20-208.tex
@@ -0,0 +1,182 @@
+\documentclass[11pt]{article}
+\usepackage[T1]{fontenc}
+\usepackage[utf8]{inputenc}
+\usepackage[hmargin=1.2in, vmargin=1.1in]{geometry}
+
+\title{Review of \emph{When Less is More: Systematic Analysis of Cascade-based Community Detection}}
+\date{}
+\begin{document}
+
+\maketitle
+
+\vspace{-3em}
+
+\paragraph{Summary.} This paper considers the problem of recovering unknown
+communities in a given graph from observations of ``cascades''---that is,
+samples from a diffusion process propagating along the edges of the graph.
+Specifically, it is assumed that only the identities and ``infection'' times of
+vertices are observed, over multiple cascades, and the goal is to infer the
+community memberships of the vertices. Four algorithms are described: one is
+based on a likelihood approach, and the remaining three on a clustering
+approach. The algorithms are then evaluated and compared on
+a Twitter dataset (including both real community assignments and real
+cascades), and on a collection of real networks on which synthetic cascades
+were generated.
+
+\paragraph{General comments.} The problem considered in this paper is natural and an
+interesting twist on two standard problems: the network inference problem
+(recovering edges from observations of cascades), and the community detection
+problem (recovering community assignments from edges). By going directly from
+cascades to community assignments, as is considered in this paper, one could
+hope for better recovery guarantees than by first inferring the edges of the
+graph and then applying a standard community detection algorithm.
+
+The main contribution of the paper seems to be the experimental evaluation,
+since all discussed algorithms seem to be either already present in
+a previous paper by the same authors, or simple adaptations of known
+algorithms. The main takeaway is that among the considered algorithms, the
+\textsc{Clique} algorithm seems to perform consistently well (being either the
+best or among the best).
+
+\paragraph{Major points.} Although the paper claims to be agnostic to specific
+probabilistic models for the graph structure and cascades, I believe it is
+important to understand how the proposed methodology would perform if
+specific models were assumed. In particular, I have the following concerns with
+the models described in Section 3.2 and how the rest of the paper relates to
+them:
+\begin{enumerate}
+ \item With the exception of Section 3.2.3, no assumption is made on how the
+ community assignments affect the observations since only the cascade
+ model is described (there is no discussion of the graph generative
+        model). For all we know, the observations could be independent of the
+        community assignments, making the learning problem considered in this
+ paper ill-defined.
+
+ Even assuming that the graph structure is generated from a model making
+ use of the community assignments (for example the Stochastic Block
+ Model), it is very important to emphasize that Sections 3.2.2 and 3.2.3
+ consider very different learning regimes:
+ \begin{itemize}
+ \item In Section 3.2.3, each cascade is generated directly from the
+ community assignments. In this situation, there could be hope
+ to design an asymptotically consistent estimator of the
+ community assignments when the number of cascades goes to
+ infinity.
+
+            \item In Section 3.2.2, the graph is only generated once (possibly
+ from the community assignments, although this is not specified)
+ and then the cascades are generated solely based on the graph
+ structure. So as the number of cascades goes to infinity, the
+ best one could hope for is to have an asymptotically consistent
+ estimator of the graph edges. However, \emph{this might still
+ not be sufficient to recover the communities} (recall that in
+ the standard community detection problem, consistency in
+ community inference is only achieved as the size of the graph
+ grows to infinity). This would make the learning task
+                considered in the paper unsolvable, even in the limit of
+                infinitely many cascades, a point that requires discussion.
+ \end{itemize}
+
+ Because of this crucial difference, I also believe that the claim made
+        in Section 4.2 that the C-SI-BD model is equivalent to the SI-BD model
+        is wrong (it would be true only if the graph were generated anew for
+        each cascade).
+
+ \item Section 4.2: although the approach is based on maximum likelihood
+ estimation, I find the exposition in this section lacking in several
+ respects. Algorithm
+        1 seems to be inspired by the idea of alternating maximization but stops
+ after a single alternation. At the very least, experimental evidence
+ to justify this early stopping is necessary (are the gains
+ insignificant with additional iterations?).
+
+ More generally, I believe that a likelihood-based approach could be
+ made to fit very naturally in the Expectation-maximization framework
+ (with the graph structure being the latent variable), and was
+ surprised to see no mention of it in Section 4.2 (was there
+ a specific reason why it was not considered?).
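+
+        To sketch what I have in mind (notation mine, not the paper's):
+        writing $\sigma$ for the community assignments, $G$ for the latent
+        graph and $C$ for the observed cascades, an EM scheme would iterate
+        \[
+            q(G) \leftarrow p(G \mid C, \sigma^{(t)}),
+            \qquad
+            \sigma^{(t+1)} \leftarrow \arg\max_{\sigma}\,
+            E_{q(G)}\bigl[\log p(C, G \mid \sigma)\bigr],
+        \]
+        which would make the single alternation of Algorithm 1 a special case
+        and clarify what is lost by stopping early.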
+
+ \item Section 4.3: although the approaches described in this section
+ claim to be agnostic to specific generative models, they clearly
+ seem motivated by probabilistic reasoning. In particular, Algorithm
+ 2 is evidently solving a maximum likelihood problem, and equation
+ (6) is implicitly justified as computing the posterior
+ probability of the edge being present conditioned on the observed
+ cascades. I believe a more principled description of these
+ approaches, and of the generative models under which they can be
+ formally justified is required.
+
+ \item Related to item 1., it is unclear in the experimental section whether
+ the ground truth communities in the studied datasets are in fact
+        correlated with the observations, and if they are, to what extent (see
+        also my comment below about the paper lacking an explanation of what
+ constitutes a community in the Twitter dataset). I think this concern
+ could be addressed both by some exploratory analysis showing that the
+ ground truth communities indeed correlate with edge density and by
+        a purely synthetic experiment in which the graph is
+ generated from the community assignments (for example using the SBM).
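+
+        Concretely (parameters mine, purely for illustration), one could draw
+        \[
+            G \sim \mathrm{SBM}(n,\, p_{\mathrm{in}},\, p_{\mathrm{out}}),
+            \qquad p_{\mathrm{in}} > p_{\mathrm{out}},
+        \]
+        generate synthetic cascades on $G$, and report recovery accuracy as
+        the ratio $p_{\mathrm{in}}/p_{\mathrm{out}}$ varies; this would
+        directly measure how much community signal the cascades carry.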
+
+    \item It is claimed that the reason why some methods behave poorly is
+        that they are based on specific probabilistic modeling assumptions.
+ Could experiments be conducted to show these methods indeed perform
+ better when the assumptions are satisfied?
+\end{enumerate}
+
+\paragraph{Specific points.} Some additional concerns, which are more localized
+manifestations of the concerns described above.
+\begin{enumerate}
+ \item Section 4.2, Step (2): it is not clear what is meant by ``it is
+ sufficient to numerically find only the optimal $\delta$''. For which
+        values of $\alpha_{in}$ and $\alpha_{out}$ is $\delta$ solved for
+ numerically? If the solver is directly optimizing over $\alpha_{in}$
+        and $\delta$, then equation (4) is unnecessary: $\alpha_{out}$ is
+ simply equal to $\alpha_{in}/(\delta+1)$.
+
+ \item Section 4.2, Step (3): since the authors adapted the Louvain
+        algorithm, it would be useful to have pseudocode for the exact
+ adaptation they used, since the given description is too informal to
+ know exactly what was done.
+
+ \item Section 4.2, Step (3): the Louvain algorithm is only a heuristic, so
+        the proposed method does not exactly solve the optimization problem
+ described in Algorithm 1, Step (3). It would be useful to have
+ experimental results showing how far from optimal the algorithm is, and
+        in particular, how the adaptation chosen by the authors differs from the
+ original algorithm.
+
+ \item Section 4.3: all the algorithms in this section use the Louvain
+ clustering algorithm. It is not clear whether the original algorithm or
+ the (loosely specified) adaptation from Section 4.2 was used.
+ A justification for this choice (whether or not to use the adaptation)
+ would also be helpful.
+
+ \item Section 5.1: the description of the Twitter dataset does not explain
+ what the communities are in Twitter terms (as a Twitter user, I was not
+ aware that there are community-like entities on Twitter).
+
+ \item Section 5.3: I find the use of the term ``unbiased'' problematic
+        since the paper has no notion of ground-truth distribution or
+ generative model (see also previous comments). In particular, the
+ authors claim the NMI criterion favors smaller communities, but maybe
+ this is not a problem if the generative model also produces smaller
+ communities?
+
+ \item Section 5.3: Is there any reason why the ``agreement''
+ metric commonly used in community detection was not considered? (see
+ for example Definition 3 in \cite{A18}).
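+
+        For concreteness (my paraphrase, up to normalization conventions): for
+        two assignments $\sigma, \hat{\sigma}$ of $n$ vertices into $k$
+        communities, the agreement is
+        \[
+            A(\sigma, \hat{\sigma})
+            = \max_{\pi \in S_k} \frac{1}{n} \sum_{i=1}^{n}
+              \mathbf{1}\{\sigma_i = \pi(\hat{\sigma}_i)\},
+        \]
+        that is, the maximal fraction of correctly labeled vertices over
+        relabelings $\pi$ of the communities.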
+\end{enumerate}
+
+\paragraph{Conclusion.} Although the problem considered in this paper is
+interesting and worthy of attention, I believe that both the methodology and
+exposition suffer from important gaps. Justifying the proposed methods under
+appropriate probabilistic assumptions and then evaluating them in situations
+where the assumptions hold or do not hold would provide useful insight into the
+applicability of the methods. In its current form, it is very difficult to
+understand why the methods perform the way they do in the experiments and to
+attach a level of confidence to the main takeaway message.
+
+
+\bibliographystyle{alpha}
+\bibliography{jmlr-20-208}
+
+\end{document}