author    Thibaut Horel <thibaut.horel@gmail.com>  2015-02-06 16:25:28 -0500
committer Thibaut Horel <thibaut.horel@gmail.com>  2015-02-06 16:25:28 -0500
commit    8a611d4cc0e979d62ee29a75d757bfd2b3ff0cdb (patch)
tree      e6452b9fe9f55ec201545686d91f02c5b4a63787
parent    98a6ae52fbebd18b1f19aab9d41198043441b3d8 (diff)
download  cascades-8a611d4cc0e979d62ee29a75d757bfd2b3ff0cdb.tar.gz
Cosmetic changes
-rw-r--r--  paper/paper.tex                 |  4
-rw-r--r--  paper/sections/experiments.tex  |  6
-rw-r--r--  paper/sections/results.tex      | 13
3 files changed, 12 insertions, 11 deletions
diff --git a/paper/paper.tex b/paper/paper.tex
index aa293f3..96a1c58 100644
--- a/paper/paper.tex
+++ b/paper/paper.tex
@@ -108,7 +108,7 @@
\label{sec:results}
\input{sections/results.tex}
-\section{A lower bound}
+\section{A Lower Bound}
\label{sec:lowerbound}
\input{sections/lowerbound.tex}
@@ -124,7 +124,7 @@
\label{sec:experiments}
\input{sections/experiments.tex}
-\section{Discussion}
+\section{Future Work}
\label{sec:linear_threshold}
\input{sections/discussion.tex}
diff --git a/paper/sections/experiments.tex b/paper/sections/experiments.tex
index d13472e..80f95fa 100644
--- a/paper/sections/experiments.tex
+++ b/paper/sections/experiments.tex
@@ -33,7 +33,8 @@ In this section, we validate empirically the results and assumptions of Section~
\paragraph{Experimental setup}
We evaluate the performance of the algorithms on synthetic graphs, chosen for their similarity to real social networks. We therefore consider a Watts-Strogatz graph ($300$ nodes, $4500$ edges) \cite{watts:1998}, a Barabasi-Albert graph ($300$ nodes, $16200$ edges) \cite{barabasi:2001}, a Holme-Kim power law graph ($200$ nodes, $9772$ edges) \cite{Holme:2002}, and the recently introduced Kronecker graph ($256$ nodes, $10000$ edges) \cite{Leskovec:2010}. Undirected graphs are converted to directed graphs by doubling the edges.
-For every reported data point, we sample edge weights and generate $n$ cascades from the Independent Cascade model for $n \in \{100, 500, 1000, 2000, 5000\}$. We compare for each algorithm the estimated graph $\hat {\cal G}$ with ${\cal G}$. The initial probability of a node being a source is fixed to $0.05$, i.e. an average of $15$ nodes source nodes per cascades for all experiments, except for Figure~\label{fig:four_figs} (f). All edge weights are chosen uniformly in the interval $[0.2, 0.7]$, except when testing for approximately sparse graphs (see paragraph on robustness). Adjusting for the variance of our experiments, all data points are reported with at most a $\pm 1$ error margin. The parameter $\lambda$ is chosen to be of the order ${\cal O}(\sqrt{\log m / (\alpha n)})$.
+For every reported data point, we sample edge weights and generate $n$ cascades
+from the (IC) model for $n \in \{100, 500, 1000, 2000, 5000\}$. For each algorithm, we compare the estimated graph $\hat {\cal G}$ with ${\cal G}$. The initial probability of a node being a source is fixed to $0.05$, i.e., an average of $15$ source nodes per cascade for all experiments, except for Figure~\ref{fig:four_figs}~(f). All edge weights are chosen uniformly in the interval $[0.2, 0.7]$, except when testing for approximately sparse graphs (see the paragraph on robustness). Adjusting for the variance of our experiments, all data points are reported with at most a $\pm 1$ error margin. The parameter $\lambda$ is chosen to be of the order ${\cal O}(\sqrt{\log m / (\alpha n)})$.
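As a concrete illustration of this setup, the following is a minimal Python sketch of the graph construction and discrete-time (IC) cascade generation described above. It is not the authors' experimental code: the networkx generator call, the rewiring probability 0.1, and all function names are assumptions chosen for illustration; the node count, edge weights, source probability, and number of cascades follow the paragraph above.

# Illustrative sketch only (not the paper's code): Watts-Strogatz graph with
# uniform edge weights in [0.2, 0.7], sources drawn with probability 0.05,
# and discrete-time Independent Cascade (IC) propagation.
import random
import networkx as nx

def make_weighted_digraph(n=300, k=30, p=0.1, low=0.2, high=0.7, seed=0):
    """Undirected Watts-Strogatz graph (~4500 edges for n=300, k=30),
    made directed by doubling each edge, with uniform edge weights."""
    rng = random.Random(seed)
    g = nx.watts_strogatz_graph(n, k, p, seed=seed).to_directed()
    for u, v in g.edges():
        g[u][v]["weight"] = rng.uniform(low, high)
    return g

def sample_ic_cascade(g, p_source=0.05, rng=random):
    """One cascade: each node is a source w.p. p_source; returns infection times."""
    times = {v: 0 for v in g if rng.random() < p_source}  # seed set at time 0
    frontier, t = set(times), 0
    while frontier:
        t += 1
        newly_infected = set()
        for u in frontier:
            for v in g.successors(u):
                # each newly infected node gets one chance per out-edge
                if v not in times and rng.random() < g[u][v]["weight"]:
                    newly_infected.add(v)
        for v in newly_infected:
            times[v] = t
        frontier = newly_infected
    return times

g = make_weighted_digraph()
cascades = [sample_ic_cascade(g) for _ in range(1000)]  # e.g. n = 1000 cascades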
\paragraph{Benchmarks}
@@ -44,8 +45,7 @@ $$\hat \theta_i \in \arg \min_{\theta} \sum_{t \in {\cal T}} |f(\theta_i\cdot x^
We did not benchmark against other known algorithms (\textsc{netrate} \cite{gomezbalduzzi:2011} and \textsc{first edge} \cite{Abrahao:13}) due to the discrete time assumption. These algorithms also suppose a single-source model, whereas \textsc{sparse mle}, \textsc{mle}, and \textsc{greedy} do not. Learning the graph in the case of a multi-source cascade model is harder (see Figure~\ref{fig:four_figs} (f)) but more realistic, since we rarely have access to ``patient 0'' in practice.
\paragraph{Graph Estimation}
-In the case of the \textsc{lasso}, \textsc{mle} and \textsc{sparse mle} algorithms, we construct the edges of $\hat {\cal G} : \cup_{j \in V} \{i : \Theta_{ij} > 0.1\}$, \emph{i.e} by thresholding. The true positives are the edges which appear both in ${\cal G}$ and $\hat {\cal G}$ and the true negatives are the edges which appear in neither. Finally, we report the F1-score$=2 \text{precision}\cdot\text{recall}/(\text{precision}+\text{recall})$, which considers the number of true edges recovered by the algorithm over the total number of edges returned by the algorithm (\emph{precision}) with the number of true edges recovered by the algorithm over the total number of edges it should have recovered (\emph{recall}).
-
+In the case of the \textsc{lasso}, \textsc{mle} and \textsc{sparse mle} algorithms, we construct the edge set of $\hat {\cal G}$ as $\cup_{j \in V} \{(i,j) : \Theta_{ij} > 0.1\}$, \emph{i.e.}, by thresholding. Finally, we report the F1-score$\,=2\,\text{precision}\cdot\text{recall}/(\text{precision}+\text{recall})$, which combines the number of true edges recovered by the algorithm over the total number of edges returned by the algorithm (\emph{precision}) with the number of true edges recovered by the algorithm over the total number of edges it should have recovered (\emph{recall}).
Over all experiments, \textsc{sparse mle} achieves higher rates of precision,
recall, and F1-score. Interestingly, both \textsc{mle} and \textsc{sparse mle} perform exceptionally well on the Watts-Strogatz graph.
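For concreteness, the thresholding and F1-score computation described in this paragraph could be implemented along the following lines; this is an illustrative sketch with an invented helper name and array representation, not the code used in the paper.

# Illustrative sketch only: threshold the estimated parameter matrix at 0.1
# and score the recovered edge set against the true one with the F1-score.
import numpy as np

def f1_score_from_estimate(theta_hat, true_edges, threshold=0.1):
    """theta_hat[i, j] estimates edge (i, j); true_edges is a set of (i, j) pairs."""
    rows, cols = np.nonzero(theta_hat > threshold)
    est_edges = {(int(i), int(j)) for i, j in zip(rows, cols)}
    tp = len(est_edges & true_edges)                      # correctly recovered edges
    precision = tp / len(est_edges) if est_edges else 0.0
    recall = tp / len(true_edges) if true_edges else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)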
\begin{comment}
diff --git a/paper/sections/results.tex b/paper/sections/results.tex
index c39f9da..5d63cbd 100644
--- a/paper/sections/results.tex
+++ b/paper/sections/results.tex
@@ -130,9 +130,10 @@ $\hat{\theta}_\lambda$ is the solution of \eqref{eq:pre-mle}:
To prove Theorem~\ref{thm:main}, we apply Lemma~\ref{lem:negahban} with
$\tau_{\mathcal{L}}=0$. Since $\mathcal{L}$ is twice differentiable and convex,
-assumption \eqref{eq:rc} is implied by the restricted eigenvalue condition
-\eqref{eq:re}. The upper bound on the $\ell_{\infty}$ norm of
-$\nabla\mathcal{L}(\theta^*)$ is given by Lemma~\ref{lem:ub}.
+assumption \eqref{eq:rc} with $\kappa_{\mathcal{L}}=\frac{\gamma}{2}$ is
+implied by the (RE) condition \eqref{eq:re}. The upper bound
+on the $\ell_{\infty}$ norm of $\nabla\mathcal{L}(\theta^*)$ is given by
+Lemma~\ref{lem:ub}.
\begin{lemma}
\label{lem:ub}
@@ -335,7 +336,8 @@ We will need the following additional assumptions on the inverse link function $
\left|\frac{f''}{f}\right|\right)
\leq\frac{1}{\alpha}
\end{equation}
-whenever $f(\inprod{\theta^*}{x})\notin\{0,1\}$.
+whenever $f(\inprod{\theta^*}{x})\notin\{0,1\}$. These conditions are once
+again non-restrictive in the (IC) and (V) models.
\begin{proposition}
\label{prop:fi}
@@ -391,8 +393,7 @@ m)$.
\paragraph{(RE) vs Irrepresentability Condition}
\cite{Daneshmand:2014} rely on an `incoherence' condition on the hessian of the
-likelihood function. Their condition is equivalent to the more commonly called
-{\it (S,s)-irrepresentability} condition:
+likelihood function, also known as the {\it (S,s)-irrepresentability} condition:
\begin{comment}
\begin{definition}