Diffstat (limited to 'finale/sections')
-rw-r--r--  finale/sections/active.tex        9
-rw-r--r--  finale/sections/bayesian.tex      9
-rw-r--r--  finale/sections/experiments.tex   7
-rw-r--r--  finale/sections/intro.tex        11
-rw-r--r--  finale/sections/model.tex        68
-rw-r--r--  finale/sections/related.tex      41
-rw-r--r--  finale/sections/relatex.tex       2
7 files changed, 147 insertions, 0 deletions
diff --git a/finale/sections/active.tex b/finale/sections/active.tex
new file mode 100644
index 0000000..84cf4c6
--- /dev/null
+++ b/finale/sections/active.tex
@@ -0,0 +1,9 @@
+intro (motivation, description)
+
+mutual information approach
+
+two approx:
+
+- first step truncation
+
+- variance in lieu of MI
diff --git a/finale/sections/bayesian.tex b/finale/sections/bayesian.tex
new file mode 100644
index 0000000..04ba7b3
--- /dev/null
+++ b/finale/sections/bayesian.tex
@@ -0,0 +1,9 @@
+advantages, disadvantages
+
+graphical model and description
+
+MCMC
+
+variational inference
+
+Bohning
diff --git a/finale/sections/experiments.tex b/finale/sections/experiments.tex
new file mode 100644
index 0000000..c9cf762
--- /dev/null
+++ b/finale/sections/experiments.tex
@@ -0,0 +1,7 @@
+implementation: PyMC (scalability), blocks
+
+baseline
+
+graphs/datasets
+
+bullshit
diff --git a/finale/sections/intro.tex b/finale/sections/intro.tex
new file mode 100644
index 0000000..56f31b7
--- /dev/null
+++ b/finale/sections/intro.tex
@@ -0,0 +1,11 @@
+\begin{itemize}
+  \item graph inference: what is the problem? what is an observation,
+    contagion model
+  \item prior work: sample complexity with MLE
+  \item here: Bayesian approach
+  \begin{itemize}
+    \item natural framework for active learning with significant
+      speedup over passive
+  \end{itemize}
+\end{itemize}
+\input{sections/related.tex}
diff --git a/finale/sections/model.tex b/finale/sections/model.tex
new file mode 100644
index 0000000..cb8b699
--- /dev/null
+++ b/finale/sections/model.tex
@@ -0,0 +1,68 @@
+The GLC model is defined over a directed graph $G = (V, \Theta)$. Denoting by
+$k=|V|$ the number of nodes in the graph, $\Theta\in\R_{+}^{k\times k}$ is the
+matrix of edge weights. Note that $\Theta$ implicitly defines the edge set $E$ of
+the graph through the following equivalence:
+\begin{displaymath}
+  (u,v)\in E\Leftrightarrow \Theta_{u,v} > 0,\quad
+  (u,v)\in V^2
+\end{displaymath}
+
+Time is discretized and indexed by a variable $t\in\N$. The nodes can be in
+one of three states: \emph{susceptible}, \emph{infected} or \emph{immune}.
+Let us denote by $S_t$ the set of nodes susceptible at the beginning of time
+step $t\in\N$ and by $I_t$ the set of nodes that become infected at this time
+step. The following dynamics:
+\begin{displaymath}
+  S_0 = V,\quad S_{t+1} = S_t \setminus I_t
+\end{displaymath}
+express that the nodes infected at a time step are no longer susceptible
+starting from the next time step (they become part of the immune nodes).
+
+The dynamics of $I_t$ are described by a Markov process. Let us
+denote by $x^t$ the indicator vector of $I_t$ at time step $t\in\N$; $x^0$ is
+drawn from a \emph{source distribution} $p_s:\{0,1\}^k\to[0,1]$. For $t\geq 1$,
+we have:
+\begin{equation}
+  \label{eq:markov}
+  \forall i\in S_t,\quad
+  \P\big(x_{i}^{t} = 1\,|\, x^{t-1}\big) = f(\bt_i\cdot x^{t-1})
+\end{equation}
+where $\bt_i$ is the $i$th column of $\Theta$. The function $f:\R\to[0,1]$ can
+be interpreted as the inverse link function of the model. Finally, the
+transitions in \cref{eq:markov} occur independently for each $i$. A cascade
+continues until no infected node remains.
+
+We refer the reader to \cite{pouget} for a more complete description of the
+model and examples of common contagion models which can be interpreted as
+specific instances of the GLC model.
+
+It follows from Section 2 that a source distribution $p_s$ and
+\cref{eq:markov} together completely specify the distribution of a cascade
+$\bx = (x^t)_{t\geq 0}$, whose likelihood under $\Theta$ is given by:
+\begin{equation}
+  \label{eq:dist}
+  \mathcal{L}_{\Theta}(\bx)
+  = p_s(x^0)\prod_{\substack{t\geq 1 \\ i\in S_t}}
+  f(\bt_i\cdot x^{t-1})^{x^t_i}\big(1-f(\bt_i\cdot x^{t-1})\big)^{1-x^t_i}
+\end{equation}
+
+\paragraph{MLE estimation.}
+It follows from the form of \cref{eq:dist} that the log-likelihood of $\Theta$
+given cascade data can be written as the sum of $k$ terms, each term $i$
+depending only on $\bt_i$. Hence, each $\bt_i$ can be learnt separately by
+solving a node-specific optimization problem. Specifically, for a given node
+$i$, if we concatenate together all the time steps $t$ such that this node was
+still susceptible at step $t+1$ and write $y^t = x^{t+1}_i$, \emph{i.e.}
+whether or not the node became infected at the next time step, the MLE
+estimator for $\bt_i$ is obtained by solving the following optimization
+problem:
+\begin{equation}
+  \label{eq:mle}
+  \hat{\theta}\in \argmax_\theta \sum_{t} y^t\log f(\theta\cdot x^t)
+  + (1-y^t) \log \big(1 - f(\theta\cdot x^t)\big)
+\end{equation}
+It is interesting to note that, at the node level, MLE inference for the GLC
+model amounts exactly to fitting a Generalized Linear Model. When $f$ is
+log-concave, as is the case in most examples of GLC models, the above
+optimization problem is convex and can be solved exactly and efficiently. The
+code to perform MLE estimation can be found in the appendix, file
+\textsf{mle.py}.
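As a quick illustration of the two pieces above (the Markov dynamics of \cref{eq:markov} and the node-wise MLE of \cref{eq:mle}), the following Python sketch simulates GLC cascades and recovers each column $\bt_i$ by fitting the corresponding GLM. It is only a minimal sketch, not the \textsf{mle.py} referenced in the text: the inverse link $f(z) = 1 - e^{-z}$, the product source distribution, the toy cycle graph and the NumPy/SciPy optimisation routine are all assumptions made here for concreteness.

# Hypothetical sketch (not the authors' mle.py): simulate GLC cascades and
# recover Theta column by column via the node-wise MLE. The inverse link
# f(z) = 1 - exp(-z) is assumed purely for illustration.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)


def f(z):
    # inverse link: maps the activation theta_i . x^{t-1} to an infection probability
    return 1.0 - np.exp(-z)


def simulate_cascades(theta, n_cascades=500, p_source=0.2):
    """Draw cascades from the Markov dynamics and collect, for every node i,
    the pairs (x^t, x_i^{t+1}) observed while i was still susceptible."""
    k = theta.shape[0]
    X = [[] for _ in range(k)]   # feature vectors x^t seen by node i
    y = [[] for _ in range(k)]   # did node i get infected at the next step?
    for _ in range(n_cascades):
        x = (rng.random(k) < p_source).astype(float)   # x^0 from a product source distribution
        susceptible = x == 0                           # S_1 = V \ I_0
        while x.any() and susceptible.any():
            p = f(theta.T @ x)                         # P(x_i = 1 | x) = f(theta_i . x)
            new_x = np.zeros(k)
            for i in np.flatnonzero(susceptible):
                X[i].append(x.copy())
                new_x[i] = float(rng.random() < p[i])
                y[i].append(new_x[i])
            susceptible &= new_x == 0                  # infected nodes become immune
            x = new_x
    return [np.array(v) for v in X], [np.array(v) for v in y]


def fit_node(X_i, y_i, eps=1e-9):
    """Node-wise MLE: a GLM fit, concave in theta_i for this choice of f."""
    def neg_log_lik(theta_i):
        p = np.clip(f(X_i @ theta_i), eps, 1.0 - eps)
        return -np.sum(y_i * np.log(p) + (1.0 - y_i) * np.log(1.0 - p))
    k = X_i.shape[1]
    res = minimize(neg_log_lik, x0=np.full(k, 0.1),
                   bounds=[(0.0, None)] * k, method="L-BFGS-B")
    return res.x


# toy example: a directed 5-cycle with weight 1.5 on every edge
k = 5
theta_true = np.zeros((k, k))
for u in range(k):
    theta_true[u, (u + 1) % k] = 1.5
X, y = simulate_cascades(theta_true)
theta_hat = np.column_stack([fit_node(X[i], y[i]) for i in range(k)])
print(np.round(theta_hat, 2))

With enough cascades the recovered columns should concentrate around the true weights; swapping in another inverse link only changes f, and, as noted above, the node-wise problem stays convex whenever the resulting log-likelihood is concave.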
diff --git a/finale/sections/related.tex b/finale/sections/related.tex
new file mode 100644
index 0000000..59fb759
--- /dev/null
+++ b/finale/sections/related.tex
@@ -0,0 +1,41 @@
+\paragraph{Related work.}
+The study of edge prediction in graphs has been an active field of research for
+over a decade~\cite{Nowell08, Leskovec07, AdarA05}. \cite{GomezRodriguez:2010}
+introduced the {\scshape Netinf} algorithm, which approximates the likelihood of
+cascades represented as a continuous process. The algorithm, relying on
+Maximum-Likelihood estimation and convex relaxations, was improved in later
+work~\cite{gomezbalduzzi:2011}, but is not known to have any theoretical
+guarantees beyond empirical validation on synthetic networks.
+
+Let us denote by $m$ the number of nodes in the graph and, for a given node,
+by $s$ its in-degree. In what follows, we call a \emph{recovery
+guarantee} an upper bound on the number of observations required (expressed as
+a function of $s$ and $m$) to learn the incoming edges of a node to a fixed
+arbitrary accuracy level. \cite{Netrapalli:2012} studied the discrete-time
+version of the independent cascade model and obtained the first ${\cal O}(s^2
+\log m)$ recovery guarantee on general networks, by using unregularized MLE and
+making a {\it correlation decay\/} assumption, which limits the number of new
+infections at every step. They also suggest a {\scshape Greedy} algorithm,
+with provably good performance on tree graphs, which maintains a counter of
+each node's antecedents, i.e.\ the set of nodes infected at a time step prior
+to its infection time. The work of~\cite{Abrahao:13} studies the same
+continuous-time framework as \cite{GomezRodriguez:2010} and obtains an ${\cal
+O}(s^9 \log^2 s \log m)$ support recovery algorithm, without the
+\emph{correlation decay} assumption.
+\cite{du2013uncover,Daneshmand:2014}
+propose Lasso regularization of the MLE for recovering the weights of the graph
+under a continuous-time independent cascade model. The work
+of~\cite{du2014influence} is somewhat orthogonal to ours, since they suggest
+learning the \emph{influence} function rather than the parameters of the
+network directly.
+
+More recently, the continuous process studied in previous work has been
+reformulated as a Hawkes process, with recent papers
+\cite{linderman2014discovering, linderman2015scalable, simma2012modeling}
+focusing on Expectation-Maximization and Variational Inference methods to learn
+these processes. Because of the discrete nature of the GLC model, we hope to
+bring a better understanding of the link between the properties of a graph and
+its \emph{learnability}. We wish to further explore the idea of non-product
+priors raised in \cite{linderman2014discovering, linderman2015scalable}, since
+the experimental validation of their work focused on simple graph priors.
+Finally, the \emph{Active Learning} formulation is, to the best of the
+authors' knowledge, novel in this context.
diff --git a/finale/sections/relatex.tex b/finale/sections/relatex.tex
new file mode 100644
index 0000000..366ae04
--- /dev/null
+++ b/finale/sections/relatex.tex
@@ -0,0 +1,2 @@
+pomme
+