\paragraph{Influence Cascades} We consider a weighted graph $G$ over a set of vertices $V$ and define $m\defeq |V|$. The edge weights are given by an (unknown) weight matrix $\Theta:V^2\to\reals^+$; $\Theta_{i,j} = 0$ simply expresses that there is no edge from $i$ to $j$. For $j\in V$, we write $\theta_j$ for the $j$-th column of $\Theta$: its non-zero entries identify the parents of $j$ in $G$. An \emph{influence cascade} is a homogeneous discrete-time Markov process over $\{0,1\}^m$. More precisely, we represent the set of nodes infected at time step $t-1\in\ints^+$ by a random vector $X^{t}\in\{0,1\}^{m}$, where $X^{t}_k = 1$ is interpreted as \emph{node $k$ was infected at time step $t-1$}.

\paragraph{Problem} In what follows, we assume that we know how to express the transition probabilities as a function of the weight matrix $\Theta$; the goal is to infer this matrix by observing influence cascades. We use standard maximum likelihood estimation (MLE). However, we assume that the parameter we wish to recover, namely the matrix $\Theta$, is sparse in the following sense: each node in the graph $G$ has at most $s$ parents. In other words, the columns of $\Theta$ have at most $s$ non-zero entries. To factor this additional assumption into the maximum likelihood estimation, we use the now classic $\ell_1$ regularization. For an observation $(x^1,\ldots,x^n)$, we estimate $\hat{\Theta}$ by:
\begin{equation}\label{eq:mle}
\hat{\Theta} = \argmax_{\Theta}\mathcal{L}(\Theta\,|\, x^1,\ldots, x^n) - \lambda\|\Theta\|_1
\end{equation}
where $\mathcal{L}$ is the log-likelihood function and $\lambda$ is the regularization parameter. The goal of this paper is to characterize $\hat{\Theta}$ in terms of:
\begin{itemize}
\item \emph{efficiency}: how well does $\hat{\Theta}$ approximate the true matrix $\Theta$? In particular, can we bound its accuracy as a function of the number of observations?
\item \emph{model consistency}: assuming that the true matrix $\Theta$ is $s$-sparse as defined above, can we recover the support of $\Theta$ from the estimator $\hat{\Theta}$, and if so, how many observations are required?
\end{itemize}
We now turn to two widely used influence cascade models, the Independent Cascade Model (IC) and the Linear Threshold Model (LT), and specify the MLE problem \eqref{eq:mle} for these models.

\subsection{Independent Cascade Model}
In the independent cascade model, nodes can be uninfected, active, or infected. All nodes start either uninfected or active. At each time step $t$, for each edge $(i,j)$ where $j$ is uninfected and $i$ is active, $i$ attempts to infect $j$ with probability $p_{i,j}$. If $i$ succeeds, $j$ becomes active at time step $t+1$. Regardless of $i$'s success, node $i$ becomes infected at time $t+1$: nodes stay active for only one time step. The cascade continues until no active nodes remain. If we denote by $X^t$ the indicator variable of the set of active nodes at time step $t-1$, then if $j$ is uninfected at time step $t-1$, we have:
\begin{displaymath}
\P\big[X^{t+1}_j = 1\,|\, X^{t}\big] = 1 - \prod_{i = 1}^m (1 - p_{i,j})^{X^t_i} = 1 - e^{-\inprod{\theta_j}{X^t}}
\end{displaymath}
where we defined $\Theta_{i,j} \defeq -\log(1-p_{i,j})$, so that the weights are non-negative as required.

\subsection{Linear Threshold Model}
In the Linear Threshold Model, each node $j\in V$ has a threshold $t_j$ drawn uniformly from the interval $[0,1]$; the thresholds are drawn independently at each time step, which makes the process Markovian. We assume that the weights given by $\Theta$ are such that for each node $j\in V$, $\sum_{i=1}^m \Theta_{i,j} \leq 1$. Nodes can be either infected or uninfected. At each time step, each uninfected node $j$ becomes infected if the sum of the weights $\Theta_{i,j}$ over the infected parents $i$ of $j$ is larger than $j$'s threshold $t_j$.
Denoting by $X^t$ the indicator variable of the set of infected nodes at time step $t-1$, if $j\in V$ is uninfected at time step $t-1$, then:
\begin{displaymath}
\P\big[X^{t+1}_j = 1\,|\, X^{t}\big] = \sum_{i=1}^m \Theta_{i,j}X^t_i = \inprod{\theta_j}{X^t}
\end{displaymath}

\subsection{Unification}
In both models, conditionally on $X^t = x^t$, the variable $X_j^{t+1}$ is a Bernoulli variable whose parameter is a function $f$ of $\inprod{\theta_j}{x^t}$, where $\inprod{a}{b}$ denotes the inner product of $a$ and $b$ in $\reals^m$:
\begin{equation}\label{eq:markov}
\P\big[x_j^{t+1} = 1\,|\, x^t\big] \defeq f(\inprod{\theta_j}{x^t}).
\end{equation}
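To make the unified transition concrete, the following is a minimal Python sketch of one step of the Markov process; the toy weight matrix, graph, and seed are illustrative choices, and both link functions use the non-negative convention $\Theta_{i,j} = -\log(1-p_{i,j})$ for IC.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy weight matrix on m = 4 nodes; column j holds the
# incoming edge weights theta_j of node j (all names are illustrative).
m = 4
Theta = np.zeros((m, m))
Theta[0, 1] = 0.6   # edge 0 -> 1
Theta[0, 2] = 0.3   # edge 0 -> 2
Theta[1, 3] = 0.5   # edge 1 -> 3

# Link functions of the unified model P[X_j^{t+1} = 1 | x^t] = f(<theta_j, x^t>)
def f_ic(z):
    # IC, with the convention Theta[i, j] = -log(1 - p[i, j]) >= 0
    return 1.0 - np.exp(-z)

def f_lt(z):
    # LT, valid because every column of Theta sums to at most 1
    return z

def step(x, f):
    """One transition: x is the 0/1 vector of currently active (IC) or
    infected (LT) nodes; returns the next random state. Bookkeeping of
    nodes that are no longer susceptible is omitted for brevity."""
    probs = f(Theta.T @ x)          # f(<theta_j, x>) for every node j
    return (rng.random(m) < probs).astype(int)

x0 = np.array([1, 0, 0, 0])         # node 0 is initially active
x1 = step(x0, f_ic)                 # random next state under IC
```

With node $0$ active, the IC transition gives node $1$ probability $1 - e^{-0.6}$ and node $2$ probability $1 - e^{-0.3}$ of activating, matching the product form of the IC equation.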
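The penalized estimator \eqref{eq:mle} decomposes over the columns of $\Theta$, so it can be sketched for a single column $\theta_j$ under the LT link $f(z) = z$. The sketch below is a hedged illustration on synthetic data: \texttt{true\_theta}, the sample size, and the value of $\lambda$ are all hypothetical choices, and since the weights are non-negative the $\ell_1$ penalty reduces to a coordinate sum, which keeps the objective smooth.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

# Synthetic transitions for a single node j under the LT link f(z) = z.
# true_theta is a hypothetical 2-sparse column of Theta (m = 5 nodes).
m, n = 5, 2000
true_theta = np.array([0.4, 0.0, 0.3, 0.0, 0.0])
X = rng.integers(0, 2, size=(n, m))                  # observed parent states x^t
y = (rng.random(n) < X @ true_theta).astype(float)   # observed x_j^{t+1}

lam, eps = 0.01, 1e-9

def neg_penalized_loglik(theta):
    z = np.clip(X @ theta, eps, 1 - eps)    # f(<theta_j, x^t>) for LT
    ll = y * np.log(z) + (1 - y) * np.log(1 - z)
    # theta >= 0, so the l1 penalty reduces to the coordinate sum
    return -ll.mean() + lam * theta.sum()

res = minimize(neg_penalized_loglik, x0=np.full(m, 0.1),
               bounds=[(0.0, 1.0)] * m, method="L-BFGS-B")
theta_hat = res.x                           # estimate of the column theta_j
```

Under the model assumptions, $\hat\theta_j$ should place most of its mass on the true support, with $\lambda$ governing how aggressively the spurious coordinates are shrunk toward zero; this is exactly the support-recovery question raised above.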