-rw-r--r--  notes/cascades.png                                     bin  73864 -> 0 bytes
-rw-r--r--  notes/images/icc.png (renamed from notes/icc.png)      bin  130144 -> 130144 bytes
-rw-r--r--  notes/images/voter.png (renamed from notes/voter.png)  bin  111444 -> 111444 bytes
-rw-r--r--  notes/reportYaron.tex                                  28
-rw-r--r--  src/make_plots.py                                      4
5 files changed, 30 insertions, 2 deletions
diff --git a/notes/cascades.png b/notes/cascades.png
deleted file mode 100644
index 9f14d95..0000000
--- a/notes/cascades.png
+++ /dev/null
Binary files differ
diff --git a/notes/icc.png b/notes/images/icc.png
index 2148eed..2148eed 100644
--- a/notes/icc.png
+++ b/notes/images/icc.png
Binary files differ
diff --git a/notes/voter.png b/notes/images/voter.png
index c1325cb..c1325cb 100644
--- a/notes/voter.png
+++ b/notes/images/voter.png
Binary files differ
diff --git a/notes/reportYaron.tex b/notes/reportYaron.tex
index d39d060..6341f98 100644
--- a/notes/reportYaron.tex
+++ b/notes/reportYaron.tex
@@ -173,6 +173,34 @@ In general, the smaller $\delta_k$ is, the better we can recover $k$-sparse vect
\subsection{Verifying the RIP constants}
+Although it is not yet clear which normalization to choose for the columns of the matrix $M$, nor whether we should include only the first step of every cascade or the cascade in its entirety, we ran simple experiments showing that the measurement matrix $M$ has `well-behaved' RIP constants.
+
+\begin{table}
+\centering
+\begin{tabular}{c | c | c | c }
+c = 100 & c = 1000 & c = 100 & c = 1000 \\
+$p_\text{init}$ = .1 & $p_\text{init}$ = .1 & $p_\text{init}$ = .05 & $p_\text{init}$ = .05 \\
+\hline
+$\delta_4 = .54$ & $\delta_4 = .37$ & $\delta_4 = .43$ & $\delta_4 = .23$ \\
+\end{tabular}
+\caption{Results on a dataset of 22 nodes extracted from the Karate club network. $p_\text{init}$ is the probability that a node is part of the seed set, $c$ is the number of cascades, and $\delta_4$ is the RIP constant for $4$-sparse vectors. Cascades are generated using the independent cascade model.}
+\end{table}
+
+Finding a non-trivial RIP constant is strongly NP-hard, and as such we were unable to test on networks with more than 25 nodes. To compute the $k$-th RIP constant ($\delta_k$ in our notation), we must extract every subset of $k$ columns of $M$ and solve the following programs on each resulting matrix $M_k$ (there are $\binom{n}{k}$ of them!):
+\begin{align}
+\label{eq:lower_bound}
+\min \qquad & \| M_k \vec x \|_2 \\
+\text{s.t.} \qquad & \| \vec x\|_2 = 1 \nonumber
+\end{align}
+\begin{align}
+\label{eq:upper_bound}
+\max \qquad & \| M_k \vec x \|_2 \\
+\text{s.t.} \qquad & \| \vec x\|_2 = 1 \nonumber
+\end{align}
+
+Note that taking the minimum of Eq.~\ref{eq:lower_bound} over all extracted matrices yields the lowest possible value $1 - \delta^-_k$, and taking the maximum of Eq.~\ref{eq:upper_bound} over all extracted matrices yields the highest possible value $1 + \delta^+_k$. The smallest admissible $k$-RIP constant is therefore $\max(\delta^+_k, \delta^-_k)$. Equations \ref{eq:lower_bound} and \ref{eq:upper_bound} can be solved efficiently for each $M_k$: their optimal values are the smallest and largest singular values of $M_k$, i.e.\ the square roots of the extreme eigenvalues of $M_k^T M_k$.
+
+In \cite{candes},
\subsection{Testing our algorithm}
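The brute-force procedure described in the added LaTeX passage (enumerate all $\binom{n}{k}$ column subsets, take extreme singular values) can be sketched in Python. This is a minimal sketch, not code from `src/`: `rip_constant` is a hypothetical helper, and it assumes the report's un-squared norm convention, where the optimal values of the two programs are read off directly as $1 - \delta^-_k$ and $1 + \delta^+_k$.

```python
import itertools

import numpy as np


def rip_constant(M, k):
    """Brute-force the k-th RIP constant of M (hypothetical helper).

    For each of the C(n, k) subsets of k columns, the extreme values of
    ||M_k x||_2 over unit vectors x are the smallest and largest singular
    values of M_k.  delta_minus / delta_plus track how far those values
    fall below / rise above 1; the constant is the larger of the two.
    """
    n = M.shape[1]
    delta_minus, delta_plus = 0.0, 0.0
    for cols in itertools.combinations(range(n), k):
        s = np.linalg.svd(M[:, cols], compute_uv=False)
        delta_minus = max(delta_minus, 1.0 - s[-1])  # lowest  1 - delta_k^-
        delta_plus = max(delta_plus, s[0] - 1.0)     # highest 1 + delta_k^+
    return max(delta_minus, delta_plus)
```

The $\binom{n}{k}$ loop is what limits the experiments to networks of at most 25 nodes; the per-subset SVD itself is cheap.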
diff --git a/src/make_plots.py b/src/make_plots.py
index 9b1fdf4..5a61c10 100644
--- a/src/make_plots.py
+++ b/src/make_plots.py
@@ -39,8 +39,8 @@ def compare_greedy_and_lagrange_cs284r():
for the CS284r project
"""
G = cascade_creation.InfluenceGraph(max_proba = .8)
- G.import_from_file("subset_facebook_SNAP.txt")
- A = cascade_creation.generate_cascades(G, p_init=.05, n_cascades=100)
+ G.import_from_file("../datasets/subset_facebook_SNAPnormalize.txt")
+ A = cascade_creation.generate_cascades(G, p_init=.05, n_cascades=1000)
#Greedy
G_hat = algorithms.greedy_prediction(G, A)
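The `p_init` and `n_cascades` parameters touched in this hunk feed the independent cascade model named in the table caption. As a reference point, the IC model can be sketched as below; this is a hedged standalone sketch, not the repo's `cascade_creation.generate_cascades`, and `independent_cascade` with its `adj` dictionary format is a hypothetical illustration.

```python
import random


def independent_cascade(adj, p_init, rng=random.Random(0)):
    """One cascade under the independent cascade (IC) model (sketch).

    adj maps each node to {neighbor: activation probability}.  Each node
    joins the seed set independently with probability p_init; every newly
    active node then gets a single chance to activate each still-inactive
    neighbor v with probability adj[u][v].  Returns the list of node sets
    activated at each step (step 0 is the seed set).
    """
    active = {u for u in adj if rng.random() < p_init}
    frontier = set(active)
    steps = [set(active)]
    while frontier:
        new = set()
        for u in frontier:
            for v, p in adj[u].items():
                if v not in active and rng.random() < p:
                    new.add(v)
        active |= new
        frontier = new
        if new:
            steps.append(set(new))
    return steps
```

Returning per-step sets keeps open the question raised in the report of whether only the first step of each cascade, or the whole cascade, should enter the measurement matrix $M$.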