Diffstat (limited to 'paper/sections/model.tex')
| -rw-r--r-- | paper/sections/model.tex | 86 |
1 file changed, 47 insertions(+), 39 deletions(-)
diff --git a/paper/sections/model.tex b/paper/sections/model.tex
index 3c0a354..11de7d2 100644
--- a/paper/sections/model.tex
+++ b/paper/sections/model.tex
@@ -1,52 +1,60 @@
-\subsection{Influence Cascades}
+\paragraph{Influence Cascades}
 
-We consider a weighted graph $G$ over the set of vertices $V$. The edge weights
-are given by an (unknown) weight matrix $\Theta:V^2\to\reals^+$. $\Theta_{i,j}
-= 0$ simply expresses that there is no edge from $i$ to $j$. We define $m=|V|$,
-and for $j\in V$, we write $\theta_j$ the $j$-th column of $\Theta$: this is
-the indicator vector of the parents of $j$ in $G$.
+We consider a weighted graph $G$ over the set of vertices $V$ and we define
+$m\defeq |V|$. The edge weights are given by an (unknown) weight matrix
+$\Theta:V^2\to\reals^+$; $\Theta_{i,j} = 0$ simply expresses that there is no
+edge from $i$ to $j$. For $j\in V$, we write $\theta_j$ for the $j$-th column
+of $\Theta$: its support is the set of parents of $j$ in $G$.
 
 An \emph{influence cascade} is a homogeneous discrete-time Markov process over
 $\{0,1\}^m$. More precisely, we represent the set of nodes infected at time
-step $t-1\in\ints^+$ by a vector $x^{t}\in\{0,1\}^{m}$ where $x^{t}_k = 1$ can
-be interpreted as \emph{node $k$ was infected at time step $t-1$}.
+step $t-1\in\ints^+$ by a random vector $X^{t}\in\{0,1\}^{m}$ where $X^{t}_k
+= 1$ can be interpreted as \emph{node $k$ was infected at time step $t-1$}.
 
-We will restrict ourselves to influence processes such that for $j\in V$,
-$x_j^{t+1}$ is a Bernouilli variable whose parameter can be expressed as
-a function $f$ of $\inprod{\theta_j}{x^t}$, where $\inprod{a}{b}$ is the inner
-product of $a$ and $b$ in $\reals^m$:
-\begin{equation}\label{eq:markov}
-  \P\big[x_j^{t+1} = 1\,|\, x^t\big] \defeq f(\inprod{\theta_j}{x^t}).
-\end{equation}
-As we will later see, this type of influence processes encompasses the most
-widely used influence processes: the Voter model (V), the Independent Cascade
-model (IC), and the Linear Threshold model (T).
+\paragraph{Problem} In what follows, we assume that we know how to express the
+transition probabilities as a function of the weight matrix $\Theta$, and the
+goal is to infer this matrix by observing influence cascades.
 
-\subsection{Problem}
-
-Our goal is to infer the weight matrix $\Theta$ by observing influence
-cascades. Given the form of \eqref{eq:markov}, we will focus on a single row
-$\theta_j$ of $\Theta$ and we will write $y^t_j \defeq x_j^{t+1}$. When there
-is no ambiguity, we will simply write $y^t = y^t_j$ and $\theta= \theta_j$.
+We use standard maximum likelihood estimation (MLE). However, we assume that
+the parameter we wish to recover, namely the matrix $\Theta$, is sparse in the
+following sense: each node in the graph $G$ has at most $s$ parents. In other
+words, the columns of $\Theta$ have at most $s$ non-zero entries. To factor
+this additional assumption into the maximum likelihood estimation, we use the
+now classic $\ell_1$ regularization. For an observation $(x^1,\ldots,x^n)$, we
+estimate $\hat{\Theta}$ by:
+\begin{equation}\label{eq:mle}
+  \hat{\Theta} = \argmax_{\Theta}\mathcal{L}(\Theta\,|\, x^1,\ldots, x^n)
+  - \lambda\|\Theta\|_1
+\end{equation}
+where $\mathcal{L}$ is the log-likelihood function and $\lambda$ is the
+regularization factor.
 
-The input of our problem is a set of observations $(x^1, y^1),\ldots,(x^n,
-y^n)$. If we observe more than one cascade, we simply concatenate the
-observations from each cascade to form a single longer cascade.
-Equation~\eqref{eq:markov} still holds for each observation.
+The goal of this paper is to characterize $\hat{\Theta}$ in terms of:
+\begin{itemize}
+  \item \emph{efficiency}: how well does $\hat{\Theta}$ approximate the true
+    matrix $\Theta$? In particular, can we bound its accuracy as a function
+    of the number of observations?
+  \item \emph{model consistency}: assuming that the true matrix $\Theta$ is
+    $s$-sparse as defined above, can we recover the support of $\Theta$
+    from the estimator $\hat{\Theta}$, and if so, how many observations are
+    required?
+\end{itemize}
 
-With these notations, the set of observations follows a Generalized Linear
-Model with Bernoulli distribution and link function $g$ such that $g^{-1}
-= f$. The log-likelihood of $\theta$ can be written:
-\begin{displaymath}
-  \mathcal{L}(\theta\,|\, x, y) = \sum_{t=1}^n y^t \log f(\inprod{\theta}{x^t})
-  + (1-y^t)\log \big(1-f(\inprod{\theta}{x^t})\big)
-\end{displaymath}
+We now turn to two widely used influence cascade models, the Independent
+Cascade Model (IC) and the Linear Threshold Model (LT), and specify the MLE
+problem \eqref{eq:mle} for these models.
 
-\paragraph{Sparsity}
+\subsection{Linear Threshold Model}
 
-\subsection{Voter Model}
+\subsection{Independent Cascade Model}
 
-\subsection{Independent Cascade Model}
+\subsection{Unification}
+
+$X_j^{t+1}$ is a Bernoulli variable whose parameter can be written as
+a function $f$ of $\inprod{\theta_j}{X^t}$, where $\inprod{a}{b}$ is the
+inner product of $a$ and $b$ in $\reals^m$:
+\begin{equation}\label{eq:markov}
+  \P\big[X_j^{t+1} = 1\,|\, X^t\big] \defeq f(\inprod{\theta_j}{X^t}).
+\end{equation}
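Since the log-likelihood sums over nodes and the $\ell_1$ penalty decouples across columns, the estimator in \eqref{eq:mle} can be computed one column $\theta_j$ at a time. A minimal sketch of that per-column problem, assuming a logistic link $f(z) = 1/(1+e^{-z})$ and proximal gradient descent (ISTA) as the solver; neither choice comes from the diff, and all names here are illustrative:

```python
import numpy as np

# Illustrative sketch (not the paper's method): recover one column theta_j of
# the weight matrix by l1-regularized maximum likelihood, using the logistic
# link f(z) = 1/(1 + exp(-z)) as a stand-in for the abstract link function f.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1: shrinks each entry toward zero.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def fit_l1_logistic(X, y, lam=0.02, step=0.5, iters=2000):
    """X: n x m matrix of infection states x^t; y: length-n vector x_j^{t+1}."""
    n, m = X.shape
    theta = np.zeros(m)
    for _ in range(iters):
        # Gradient of the averaged negative log-likelihood.
        grad = X.T @ (sigmoid(X @ theta) - y) / n
        # Gradient step, then soft-thresholding for the l1 penalty (ISTA).
        theta = soft_threshold(theta - step * grad, step * lam)
    return theta

# Synthetic check: node j has parents {0, 2} among m = 5 candidate parents.
rng = np.random.default_rng(0)
n, m = 4000, 5
theta_true = np.zeros(m)
theta_true[[0, 2]] = [1.5, 2.0]
X = rng.integers(0, 2, size=(n, m)).astype(float)
y = (rng.random(n) < sigmoid(X @ theta_true)).astype(float)
theta_hat = fit_l1_logistic(X, y)
```

The soft-thresholding step is what makes support recovery (the "model consistency" question above) meaningful: coordinates whose gradient stays below $\lambda$ are set exactly to zero rather than merely shrunk.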
