We consider a graph ${\cal G}= (V, E, \Theta)$, where $\Theta$ is a $|V|\times |V|$ matrix of parameters describing the edge weights of $\mathcal{G}$. We define $m\defeq |V|$. A \emph{cascade model} is a Markov process over the finite state space $\{0, 1, \dots, K\}^V$. An \emph{influence cascade} is a realization of the random process described by a cascade model. In practice, we restrict ourselves to discrete-time homogeneous cascade models. At $t=0$, each node in the graph has a constant probability $p_{\text{init}}$ of being in the ``active'' state. The active nodes at $t=0$ are called the source nodes. Our probabilistic model for the source nodes is more realistic and less restrictive than the ``single source'' assumption made in \cite{Daneshmand:2014} and \cite{Abrahao:13}. \subsection{Generalized Linear Cascade Models} Denoting by $X^t$ the state of the cascade at time step $t$, we interpret $X^t_j = 1$ for node $j\in V$ as ``node $j$ is active'', \emph{i.e.}, ``$j$ exhibits the source nodes' behavior at time step $t$''. We draw inspiration from \emph{generalized linear models} (GLM) to define a generalized linear cascade. \begin{definition} \label{def:glcm} Let us denote by $\{\mathcal{F}_t,\ t\in\ints\}$ the natural filtration induced by $\{X^t,\ t\in\ints\}$. A \emph{generalized linear cascade} is characterized by the following equation: \begin{displaymath} \P[X^{t+1}=x\,|\, \mathcal{F}_t] = \prod_{i=1}^m f(\inprod{\theta_i}{X^{t}})^{x_i} \big(1-f(\inprod{\theta_i}{X^{t}})\big)^{1-x_i} \end{displaymath} where $f:\mathbb{R}\to[0,1]$ and $\theta_i$ is the $i$-th column of $\Theta$. \end{definition} It follows immediately from this definition that a generalized linear cascade satisfies the Markov property: \begin{displaymath} \P[X^{t+1}=x\,|\,\mathcal{F}_t] = \P[X^{t+1}=x\,|\, X^t] \end{displaymath} Note also that $\E[X^{t+1}_i\,|\,X^t] = f(\inprod{\theta_i}{X^t})$. As such, $f$ can be interpreted as the inverse link function of our generalized linear cascade model.
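To make Definition~\ref{def:glcm} concrete, the following Python sketch (with arbitrary illustrative parameters, not part of the model specification) draws one transition of a generalized linear cascade: each node $i$ becomes active independently with probability $f(\inprod{\theta_i}{X^t})$.

```python
import numpy as np

def glc_step(Theta, x, f, rng):
    """One transition of a generalized linear cascade.

    Theta : (m, m) parameter matrix; column theta_i governs node i
    x     : (m,) current 0/1 state X^t
    f     : inverse link function mapping R -> [0, 1]
    """
    # P[X^{t+1}_i = 1 | X^t] = f(<theta_i, X^t>) for each node i
    p = f(Theta.T @ x)  # Theta.T @ x computes <theta_i, x> for every i at once
    return (rng.random(len(x)) < p).astype(int)

# Illustrative run with the identity inverse link f(z) = z (voter-style),
# using a column-normalized random Theta so that f stays in [0, 1].
rng = np.random.default_rng(0)
m = 5
Theta = rng.random((m, m))
Theta /= Theta.sum(axis=0)                 # columns sum to 1
x0 = (rng.random(m) < 0.3).astype(int)     # random source nodes at t = 0
x1 = glc_step(Theta, x0, f=lambda z: z, rng=rng)
```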
\subsection{Examples} \label{subsec:examples} In this section, we show that the well-known Independent Cascade Model and the Voter model are Generalized Linear Cascades. The Linear Threshold model will be discussed in Section~\ref{sec:linear_threshold}. \subsubsection{Independent Cascade Model} In the independent cascade model, nodes can be either susceptible, active or inactive. At $t=0$, all source nodes are ``active'' and all remaining nodes are ``susceptible''. At each time step $t$, for each edge $(i,j)$ where $j$ is susceptible and $i$ is active, $i$ attempts to infect $j$ with probability $p_{i,j}\in[0,1]$; the infection attempts are independent of each other. If $i$ succeeds, $j$ will become active at time step $t+1$. Regardless of $i$'s success, node $i$ will be inactive at time $t+1$. In other words, nodes stay active for only one time step. The cascade process terminates when no active nodes remain. If we denote by $X^t$ the indicator variable of the set of active nodes at time step $t$, then if $j$ is susceptible at time step $t+1$, we have: \begin{displaymath} \P\big[X^{t+1}_j = 1\,|\, X^{t}\big] = 1 - \prod_{i = 1}^m (1 - p_{i,j})^{X^t_i}. \end{displaymath} Defining $\Theta_{i,j} \defeq \log(1-p_{i,j})$, this can be rewritten as: \begin{equation}\label{eq:ic} \tag{IC} \P\big[X^{t+1}_j = 1\,|\, X^{t}\big] = 1 - \prod_{i = 1}^m e^{\Theta_{i,j}X^t_i} = 1 - e^{\inprod{\theta_j}{X^t}} \end{equation} which is a Generalized Linear Cascade model with inverse link function $f(z) = 1 - e^z$. \subsubsection{The Voter Model} In the Voter Model, nodes can be either red or blue, where ``blue'' is the source state. The parameters of the graph are normalized such that $\forall j, \ \sum_i \Theta_{i,j} = 1$. Each round, every node $j$ independently chooses one of its neighbors $i$ with probability $\Theta_{i,j}$ and adopts that neighbor's color. The cascade stops at a fixed horizon time $T$ or when all nodes are of the same color.
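The reparametrization in \eqref{eq:ic} can be checked numerically: with $\Theta_{i,j} = \log(1-p_{i,j})$, the product form $1 - \prod_i (1-p_{i,j})^{X^t_i}$ and the exponential form $1 - e^{\inprod{\theta_j}{X^t}}$ coincide. The sketch below uses arbitrary random edge probabilities purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
m = 6
P = rng.uniform(0.05, 0.6, size=(m, m))  # illustrative edge probabilities p_{i,j}
Theta = np.log(1.0 - P)                  # Theta_{i,j} = log(1 - p_{i,j})
x = (rng.random(m) < 0.5).astype(int)    # current active set X^t

# Direct form: P[X^{t+1}_j = 1 | X^t] = 1 - prod_i (1 - p_{i,j})^{x_i}
direct = 1.0 - np.prod((1.0 - P) ** x[:, None], axis=0)
# Generalized linear cascade form: 1 - exp(<theta_j, x>)
glc = 1.0 - np.exp(Theta.T @ x)

assert np.allclose(direct, glc)  # the two parametrizations agree
```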
If we denote by $X^t$ the indicator variable of the set of blue nodes at time step $t$, then we have: \begin{equation} \mathbb{P}\left[X^{t+1}_j = 1 \,|\, X^t \right] = \sum_{i=1}^m \Theta_{i,j} X_i^t = \inprod{\theta_j}{X^t} \tag{V} \end{equation} which is a Generalized Linear Cascade model with inverse link function $f(z) = z$. \subsection{Maximum Likelihood Estimation} Recovering the model parameter $\Theta$ from observed influence cascades is the central question of the present work. Recovering the edges in $E$ from observed influence cascades is a well-identified problem known as the \emph{Graph Inference} problem. However, recovering the influence parameters is no less important and has seemingly been overlooked so far. In this work we focus on recovering $\Theta$, noting that the set of edges $E$ can then be recovered through the following equivalence: \begin{displaymath} (i,j)\in E\Leftrightarrow \Theta_{i,j} \neq 0 \end{displaymath} Given observations $(x^1,\ldots,x^n)$ of a cascade model, we can recover $\Theta$ via Maximum Likelihood Estimation (MLE). Denoting by $\mathcal{L}$ the log-likelihood function, we consider the following $\ell_1$-regularized MLE problem: \begin{displaymath} \hat{\Theta} \in \argmax_{\Theta} \mathcal{L}(\Theta\,|\,x^1,\ldots,x^n) - \lambda\|\Theta\|_1 \end{displaymath} where $\lambda$ is the regularization factor, which helps prevent overfitting and controls the sparsity of the solution. The generalized linear cascade model is decomposable in the following sense: given Definition~\ref{def:glcm}, the log-likelihood can be written as the sum of $m$ terms, each term $i\in\{1,\ldots,m\}$ only depending on $\theta_i$.
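The decomposability claim can be illustrated directly: summing the per-node likelihood terms over all $m$ columns recovers the full cascade log-likelihood. The sketch below (a simulation with arbitrary parameters, shown only to exhibit the structure) evaluates the per-column contribution for a voter-model cascade; probabilities are clipped for numerical safety.

```python
import numpy as np

def node_loglik(theta_i, X, f, i):
    """Log-likelihood contribution of node i over one cascade X of shape (T+1, m)."""
    ll = 0.0
    for t in range(X.shape[0] - 1):
        # P[X^{t+1}_i = 1 | X^t] = f(<theta_i, X^t>), clipped to avoid log(0)
        p = np.clip(f(X[t] @ theta_i), 1e-12, 1 - 1e-12)
        ll += X[t + 1, i] * np.log(p) + (1 - X[t + 1, i]) * np.log(1 - p)
    return ll

# Simulate a short voter-model cascade: f(z) = z, columns of Theta sum to 1.
rng = np.random.default_rng(2)
m, T = 4, 10
Theta = rng.random((m, m))
Theta /= Theta.sum(axis=0)
X = np.zeros((T + 1, m), dtype=int)
X[0] = (rng.random(m) < 0.5).astype(int)   # random source nodes
for t in range(T):
    X[t + 1] = (rng.random(m) < Theta.T @ X[t]).astype(int)

# Decomposability: the full log-likelihood is the sum of m per-column terms.
total = sum(node_loglik(Theta[:, i], X, lambda z: z, i) for i in range(m))
```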
Since this is equally true for $\|\Theta\|_1$, each column $\theta_i$ of $\Theta$ can be estimated by a separate optimization program: \begin{equation}\label{eq:pre-mle} \hat{\theta}_i \in \argmax_{\theta_i}\frac{1}{n_i}\mathcal{L}_i(\theta_i\,|\,x^1,\ldots,x^n) - \lambda\|\theta_i\|_1 \end{equation} where we denote by $n_i$ the first step at which node $i$ becomes active and where: \begin{multline} \mathcal{L}_i(\theta_i\,|\,x^1,\ldots,x^n) = \sum_{t=1}^{n_i-1}\log\big(1-f(\inprod{\theta_i}{x^t})\big)\\ +\log f(\inprod{\theta_i}{x^{n_i}}) \end{multline} This program is convex whenever $z\mapsto\log f(z)$ and $z\mapsto\log\big(1-f(z)\big)$ are both concave, which holds in particular for the independent cascade model ($f(z) = 1-e^z$) and the voter model ($f(z) = z$).
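One standard way to solve a program of the form \eqref{eq:pre-mle} is proximal gradient ascent, where the $\ell_1$ penalty is handled by soft-thresholding. The sketch below is an illustrative implementation only, specialized to the voter model ($f(z)=z$); it departs from \eqref{eq:pre-mle} in that, for simplicity, it uses every observed transition of node $i$ rather than stopping at the node's first activation, and it projects onto $[0,1]$ to keep the probabilities valid.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def fit_column_ista(X, i, lam=0.01, step=0.1, iters=500):
    """Proximal-gradient sketch for one column of the l1-regularized MLE,
    voter model case (f(z) = z). X is a (T+1, m) array of cascade states."""
    m = X.shape[1]
    theta = np.full(m, 1.0 / m)              # feasible starting point
    A, y = X[:-1], X[1:, i]                  # transitions (x^t, x^{t+1}_i)
    for _ in range(iters):
        p = np.clip(A @ theta, 1e-9, 1 - 1e-9)
        # Gradient of the average log-likelihood of node i's transitions
        grad = A.T @ (y / p - (1 - y) / (1 - p)) / len(y)
        theta = soft_threshold(theta + step * grad, step * lam)
        theta = np.clip(theta, 0.0, 1.0)     # keep entries valid probabilities
    return theta

# Illustrative usage on a simulated voter-model cascade.
rng = np.random.default_rng(3)
m, T = 4, 200
Theta = rng.random((m, m))
Theta /= Theta.sum(axis=0)
X = np.zeros((T + 1, m), dtype=int)
X[0] = (rng.random(m) < 0.5).astype(int)
for t in range(T):
    X[t + 1] = (rng.random(m) < Theta.T @ X[t]).astype(int)
theta_hat = fit_column_ista(X, i=0)
```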