1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
|
\documentclass[10pt]{article}
\usepackage[utf8x]{inputenc}
\usepackage{amsmath,amsfonts,amsthm}
\usepackage[english]{babel}
\usepackage[capitalize, noabbrev]{cleveref}
\usepackage{algorithm}
\usepackage{algpseudocode}
\newenvironment{enumerate*}%
{\vspace{-2ex} \begin{enumerate} %
\setlength{\itemsep}{-1ex} \setlength{\parsep}{0pt}}%
{\end{enumerate}}
\newenvironment{itemize*}%
{\vspace{-2ex} \begin{itemize} %
\setlength{\itemsep}{-1ex} \setlength{\parsep}{0pt}}%
{\end{itemize}}
\newenvironment{description*}%
{\vspace{-2ex} \begin{description} %
\setlength{\itemsep}{-1ex} \setlength{\parsep}{0pt}}%
{\end{description}}
\DeclareMathOperator{\E}{\mathbb{E}}
\let\P\relax
\DeclareMathOperator{\P}{\mathbb{P}}
\newcommand{\ex}[1]{\E\left[#1\right]}
\newcommand{\prob}[1]{\P\left[#1\right]}
\newcommand{\inprod}[1]{\left\langle #1 \right\rangle}
\newcommand{\R}{\mathbf{R}}
\newcommand{\N}{\mathbf{N}}
\newcommand{\C}{\mathcal{C}}
\newcommand{\eps}{\varepsilon}
\newcommand{\eqdef}{\mathbin{\stackrel{\rm def}{=}}}
\newcommand{\llbracket}{[\![}
\newtheorem{theorem}{Theorem}
\newtheorem{lemma}{Lemma}
\algrenewcommand\algorithmicensure{\textbf{Output:}}
\algrenewcommand\algorithmicrequire{\textbf{Input:}}
\author{}
\title{}
\date{}
\begin{document}
\section{Probabilistic Model}
During the last decade, a growing body of work has focused on the graph
inference problem: how can an unknown graph be recovered from the observation
of cascades propagating along it? (CITE). While our probabilistic model for
cascades of violence strongly relies on this line of work, we depart from it in
the following ways:
\begin{itemize}
\item in our case, the graph along which the cascades of violence propagate
(the co-offending network) is known, and we are only interested in
recovering (and ultimately predicting) the edges of the cascade, that
is, identifying who influenced whom.
\item the previous models treat multiple cascades as independent
observations which occur sequentially. In our case, cascades can die
and spawn at all time and temporally overlap. This need to be carefully
accounted for in our generative model and the resulting inference
algorithms.
\end{itemize}
\subsection{Pairwise Propagation}
For a pair $(u,v)$ of nodes, we model the propagation of the cascade
from $u$ to $v$ as the sequence of the following two probabilistic events:
\begin{enumerate}
\item \emph{structural propagation:} a Bernouilli variable dictates whether
or not a propagation occurs. The parameter of this Bernouilli variable
is computed from the edge weights of the co-offending network as
follows:
\begin{equation}\label{eq:structural}
p_{u,v}^s \eqdef \prod_{e\in P(u,v)}
\frac{1}{1+e^{-\frac{w_e}{\lambda}}}
\end{equation}
where $P(u,v)$ denotes the shortest path from $u$ to $v$ and $\lambda$
is the \emph{structural parameter}. That is, the weight $w_e$ of each
edge $e$ on the path from $u$ to $v$ is mapped to a probability
$p_e\in [0,1]$ through the logistic function:
\begin{displaymath}
p_e \eqdef \frac{1}{1+e^{-\frac{w_e}{\lambda}}}
\end{displaymath}
Note that the probability of structural propagation both \emph{(a)}
decreases as the distance between $u$ and $v$ in the co-offending
network increases, and \emph{(b)} increases as the weights of the edges
on the path from $u$ to $v$ increase. The structural parameter
$\lambda$ controls how quickly the probability $p_e$ of a given edge
converges to one as the weight $w_e$ increases.
\item \emph{temporal propagation:} if the structural propagation occurs,
then the propagation time $\Delta_{u,v}$ between $u$ and $v$ is drawn
from a continuous distribution with support in $\R_+$. We denote by
$f(t)$ the density function of this probability distribution and by
$F(t)$ its cumulative function. Table TODO gives the expression of $f$
and $F$ for the different temporal distributions considered in this
work.
In our data, the infection times are only observed with a temporal
resolution of one day. As a consequence, for two nodes $u$ and $v$
whose \empb{observed} infection times are respectively $t_u$ and $t_v$
(days), the \emph{real} propagation time could be anything in the
interval $(t_v-t_u-1, t_v-t_u]$. Hence, the probability of observing
a propagation time $\Delta_{u,v} = t_v - t_u$ is given by:
\begin{equation}\label{eq:temporal}
p_{u,v}^t \eqdef \int_{\Delta_{u,v}-1}^{\Delta_{u,v}} f(t)dt
\end{equation}
\paragraph{Successful propagation.} For two nodes $u$ and $v$ with observed
infection times $t_u$ and $t_v$, the probability $p_{u,v}$ that $u$ infected
$v$ is obtained by multiplying the structural and temporal probabilities given
by \eqref{eq:structural} and \eqref{eq:temporal}:
\begin{displaymath}
p_{u,v} \eqdef p_{u,v}^s\cdot p_{u,v}^t
\end{displaymath}
\subsection{Spawning of New Cascades}
\subsection{Cascade Model}
\section{Maximum Likelihood Inference}
\subsection{Finding the Cascades}
\subsection{Finding the Spawning Parameter}
\subsection{Finding the Pairwise Propagation Parameters}
\end{document}
|