diff options
Diffstat (limited to 'finale')
| -rw-r--r-- | finale/project_proposal.tex | 164 | ||||
| -rw-r--r-- | finale/sparse.bib | 507 |
2 files changed, 602 insertions, 69 deletions
diff --git a/finale/project_proposal.tex b/finale/project_proposal.tex index 5e0d21c..c8607e6 100644 --- a/finale/project_proposal.tex +++ b/finale/project_proposal.tex @@ -1,96 +1,122 @@ -\documentclass[8pt]{article} +\documentclass[10pt]{article} +\usepackage[utf8x]{inputenc} \usepackage{fullpage, amsmath, amssymb, amsthm} +\usepackage{bbm} +\DeclareMathOperator*{\argmax}{argmax} -\title{Regression Analysis with Network data} -\author{Jean Pouget-Abadie, Thibaut Horel} -\date{} +\title{Regression Analysis with Network Data} +\author{Thibaut Horel \and Jean Pouget-Abadie} \begin{document} \maketitle -\subsection*{The Network Inference problem} +\section{Background: Cascade Models and Network Inference} -The network inference problem concerns itself with learning the edges and the -edge weights of an unknown network. Each edge weight $\theta_e, e\in E$ is a -parameter to be estimated. The information at our disposal is the result of a -cascade process on the network. Here, we will focus on the Generalized Linear -Cascade (GLC) model introduced in~\cite{paper} presented below. +The Network Inference Problem concerns itself with learning the edges and the +edge weights of an unknown network on the set of vertices $V$. Formally, the +set of parameters to learn is a matrix $\theta\in\mathbb{R}^{V\times V}$ where +$\theta_{u,v}$ is the weight corresponding to the pair of vertices $(u, v)$. +The observations at our disposal for this learning task are samples drawn for +a probabilistic cascade model. In this project, we will focus on the +Generalized Linear Cascade (GLC) model introduced in~\cite{pouget} that we +briefly present below. -\paragraph{The GLC model} - -Consider a graph $\mathcal{G} = (N, E)$, where nodes $N$ can be one of three -possible states: susceptible, infected, and immune. Let $x^t$ be the indicator -variable of ``infected nodes'' at time step $t$. A \emph{generalized linear -cascade model} is a cascade model such that for each ``susceptible'' node $i$ at -time step $t$, the probability of $i$ becoming ``infected'' at time step $t+1$ -conditioned on $x^t$ is a Bernoulli variable of parameter $f(\theta_i -\cdot X^t)$: +\paragraph{GLC model.} The nodes in $V$ can be in one of three possible states: +susceptible, infected, and immune. Let $x^t$ be the indicator variable of +``infected nodes'' at time step $t$ and $S_t$ be the set of susceptible nodes +at time step $t$. A \emph{generalized linear cascade model} is a discrete-time +random process in which the conditional law of $x^{t+1}$ given $x^t$ takes the +following form: for each ``susceptible'' node $i$ at time step $t$, the +probability of $i$ becoming ``infected'' at time step $t+1$ conditioned on +$x^t$ is a Bernoulli variable of parameter $f(\theta_i \cdot x^t)$: \begin{equation} - \mathbb{P}(x^{t+1}_i = 1|x^t) = f(\theta_i \cdot x^t) + \label{eq:model} + \mathbb{P}(x^{t+1}_i = 1\,|\,x^t) = f(\theta_i \cdot x^t), + \quad i\in S_t \end{equation} -where $f: \mathbb{R} \rightarrow [0,1]$. There is no a priori rule for the -transition from and to the immune and susceptible state, but the most commonly -studied GLC model, known as the Independent Cascade model, assumes that all +for some function $f: \mathbb{R} \rightarrow [0,1]$ and where $\theta_i += (\theta_{u, i})_{u\in V}$. There is no a priori rule for the transition from +and to the immune and susceptible state, but the most commonly studied GLC +model, known as the Independent Cascade model \cite{Kempe:03}, assumes that all infected nodes at time step $t$ are immune for all time steps $t' \geq t+1$. Under this assumption, a cascade ends when no ``infected'' nodes remain. The designer is also free to decide the initial state of graph $\mathcal{G}$ at the -beginning of every cascade process. For simplicity, we will suppose that, at the -start of each cascade, one node is chosen uniformly at random to be +beginning of every cascade process. For simplicity, we will suppose that, at +the start of each cascade, one node is chosen uniformly at random to be ``infected'', and that all other nodes are in the ``susceptible'' state. -\bigskip -The network inference problem can therefore be formalized as finding the support -of the edge weight vector $\theta$ from the observation of ${(x^c_t)}_{c\in -\mathcal{C}, t\in \mathcal{T}_C}$, where $\mathcal{C}$ indexes the cascade -number, and $\mathcal{T}_C$ the time step in that cascade. Prior work has -observed that the network inference problem is decomposable node-by-node -i.e.~solving for $\beta_{i} = \{\theta_{i,j} | j \in N\}$ for node $i$. This can -easily be formulated as the following maximum likelihood program: -$$ \theta_{i} \in \arg \max_\theta \log -\mathcal{L}_i(\theta_i\,|\,x^1,\ldots,x^n) = \frac{1}{|{\cal T}_i|} \sum_{t\in -{\cal T}_i } x_i^{t+1}\log f(\theta_i\cdot x^{t}) + (1 - -x_i^{t+1})\log\big(1-f(\theta_i \cdot x^t)\big)$$ -where $\mathcal{T}_i = \{\text{node i is ``susceptible'' at time step } t\}$. -The reader will note the above problem boils down to fitting a generalized -linear model. In particular, if $f$ is the sigmoid function, we are performing +\paragraph{Inference.} A cascade is the collection of vectors $x^t$ for all +time steps until the cascade ends. Given a sample of independent cascades, the +parameters $\theta$ can be estimated using maximum likelihood estimation. Prior +work has observed that the MLE problem can be decomposed node-by-node +\emph{i.e}, $\theta_i$ can be learned independently for each node $i$ by +solving the following optimization program: +\begin{equation} + \label{eq:mle} + \hat{\theta}_{i} \in \argmax_\theta + \log \mathcal{L}_i(\theta_i) = + \sum_{t:i\in S_t} x_i^{t+1}\log f(\theta_i\cdot x^{t}) + + (1 - x_i^{t+1})\log\big(1-f(\theta_i \cdot x^t)\big) +\end{equation} +The reader will observe that the above problem reduces to fitting a generalized +linear model. It is important to note that even though the cascades are +independent, the vectors $x^t$ observed within a cascade are not independent +since they follow the Markov process described in \eqref{eq:model}. In the +particular case where $f$ is the sigmoid function, we are performing logistic regression: -$$ +\begin{displaymath} \begin{cases} -y_i^* = \theta_i \cdot x_i + \epsilon \text{~where~} \epsilon\sim -Logistic(0, 1) \text{~and~} x = {(x^t)}_{t \in \mathcal{T}_i}\\ -y_i = [y_i^* > 0] \text{~where~} y_i^t = x_i^{t+1} + y^t = \theta_i \cdot x^t + \varepsilon^t + &\text{where } \varepsilon^t\sim \text{Logistic}(0, 1)\\ + \tilde{y}^t = \mathbf{1}\{y^t > 0\} &\text{where } x_i^{t+1} = \tilde{y}^t \end{cases} -$$ +\end{displaymath} -\subsection*{Objectives} +\section{Problem Statement} +A growing series of work (see for example \cite{Netrapalli:2012, +Daneshmand:2014, pouget}) has focused on solving the Network Inference Problem +under different cascade models and obtaining estimators with close-to-optimal +convergence rates. However, these works often rely on hard-to-interpret +assumptions which often hide subtle properties of the probabilistic model +\eqref{eq:model}. The overarching goal of the current project is to analyze the +fundamental statistical properties of observations coming from model +\eqref{eq:model} and of the MLE estimator \eqref{eq:mle}. In particular we are +interested in the following questions: \begin{itemize} -\item Try a Bayesian approach to estimate these parameters. Use the posterior -predictive distribution to obtain confidence intervals for the edge parameters. -Validate this with bootstrapping. How does this perform in different networks? -Can you intuitively link certain node-level/graph-level properties with the -resulting variance on the estimated parameter? -\item Do the previous observations correspond with the theoretical result, given -by the Fisher information matrix: $$\hat \beta \sim \mathcal{N}(\beta, -I{(\theta)}^{-1})$$ where $I(\theta) = - {\left(\frac{\partial^2\log -\mathcal{L}}{\partial \theta^2} \right)}^{-1}$ +\item Is the model \eqref{eq:model} identifiable and is the MLE estimator + \eqref{eq:mle} consistent? Are there networks which are fundamentally + ambiguous and for which the Network Inference Problem is unsolvable? +\item Use a Bayesian approach to estimate the parameters. Use the posterior + predictive distribution to obtain confidence intervals for the edge + parameters. Validate this with bootstrapping. +\item Are there networks which are harder to learn than others? Can the + variance of the MLE estimator be related to certain graph theoretic + properties of the network? +\item Do the previous observations match with the theoretical result, given +by the Fisher information matrix: +\begin{displaymath} + \hat{\theta}_i \sim \mathcal{N}(\theta_i, I{(\theta_i)}^{-1}) + \quad \text{where}\; + I(\theta) = - \left(\frac{\partial^2\log \mathcal{L}_i}{\partial \theta^2} + \right)^{-1} +\end{displaymath} \item Are there networks in which the Fisher information matrix is singular? -What happens to the estimation of $\beta$ in this case? -\item What if the generative process is generated with a different link -function? Is there a regularization scheme which can mitigate any bias/exploding -variance in the estimated parameters? +What happens to the estimation of $\theta_i$ in this case? +\item Is the estimator \eqref{eq:mle} robust to the choice of $f$? What if the + observations are generated with $f_1$, but \eqref{eq:mle} is solved with + $f_2\neq f_1$? Is there a regularization scheme which can mitigate any bias + or lack of convergence in the estimated parameters? \end{itemize} -\subsection*{Program plan} - -The project will be a series of simulations to answer each of the above -questions? When possible, we will try to explain the results found in the -simulation with a simplified analysis on toy-networks. Thibaut and I have worked -together in the past, and have kept our contributions balanced. +\paragraph{Roadmap.} We plan to answer the above question by a mixture of +a theoretical-based and a simulation-based approaches. We expect some of these +questions to be hard from the theoretical standpoint. In those cases, the +hardness can be mitigated by using simulations or focusing on specific cases: +toy-networks (star graph, cycle, complete graph, etc.) or simpler cascades +models (the Voter Model for example). -\begin{thebibliography}{1} -\bibitem{paper} Pouget-Abadie, J. and Horel, T. \emph{Inferring Graphs from -Cascades: A Sparse Recovery Framework}, ICML 2015 -\end{thebibliography} +\bibliography{sparse} +\bibliographystyle{abbrv} \end{document} diff --git a/finale/sparse.bib b/finale/sparse.bib new file mode 100644 index 0000000..f50a0d2 --- /dev/null +++ b/finale/sparse.bib @@ -0,0 +1,507 @@ +@article {CandesRomberTao:2006, +author = {Candès, Emmanuel J. and Romberg, Justin K. and Tao, Terence}, +title = {Stable signal recovery from incomplete and inaccurate measurements}, +journal = {Communications on Pure and Applied Mathematics}, +volume = {59}, +number = {8}, +publisher = {Wiley Subscription Services, Inc., A Wiley Company}, +issn = {1097-0312}, +pages = {1207--1223}, +year = {2006}, +} + + +@inproceedings{GomezRodriguez:2010, + author = {Gomez Rodriguez, Manuel and Leskovec, Jure and Krause, Andreas}, + title = {Inferring Networks of Diffusion and Influence}, + booktitle = {Proceedings of the 16th ACM SIGKDD International Conference on + Knowledge Discovery and Data Mining}, + series = {KDD '10}, + year = {2010}, + isbn = {978-1-4503-0055-1}, + location = {Washington, DC, USA}, + pages = {1019--1028}, + numpages = {10}, + publisher = {ACM}, + address = {New York, NY, USA}, +} + +@inproceedings{pouget, + title={Inferring graphs from cascades: A Sparse Recovery Framework}, + author={Pouget-Abadie, Jean and Horel, Thibaut}, + booktitle = {ICML}, + year={2015} +} + +@inproceedings{du2013uncover, + title={Uncover topic-sensitive information diffusion networks}, + author={Du, Nan and Song, Le and Woo, Hyenkyun and Zha, Hongyuan}, + booktitle={Proceedings of the Sixteenth International Conference on + Artificial Intelligence and Statistics}, + pages={229--237}, + year={2013} +} + +@inproceedings{du2014influence, + title={Influence function learning in information diffusion networks}, + author={Du, Nan and Liang, Yingyu and Balcan, Maria and Song, Le}, + booktitle={Proceedings of the 31st International Conference on Machine + Learning (ICML-14)}, + pages={2016--2024}, + year={2014} +} + + +@article{Netrapalli:2012, + author = {Netrapalli, Praneeth and Sanghavi, Sujay}, + title = {Learning the Graph of Epidemic Cascades}, + journal = {SIGMETRICS Perform. Eval. Rev.}, + volume = {40}, + number = {1}, + month = {June}, + year = {2012}, + issn = {0163-5999}, + numpages = {12}, + publisher = {ACM}, + address = {New York, NY, USA}, + keywords = {cascades, epidemics, graph structure learning}, +} + +@article{Negahban:2009, + author = {Negahban, Sahand N. and Ravikumar, Pradeep and Wrainwright, Martin J. and Yu, Bin}, + title = {A Unified Framework for High-Dimensional Analysis of M-Estimators with Decomposable Regularizers}, + Journal = {Statistical Science}, + year = {2012}, + month = {December}, + volume = {27}, + number = {4}, + pages = {538--557}, +} + +@article{Zhao:2006, + author = {Zhao, Peng and Yu, Bin}, + title = {On Model Selection Consistency of Lasso}, + journal = {J. Mach. Learn. Res.}, + issue_date = {12/1/2006}, + volume = {7}, + month = dec, + year = {2006}, + issn = {1532-4435}, + pages = {2541--2563}, + numpages = {23}, + acmid = {1248637}, + publisher = {JMLR.org}, +} + +@inproceedings{Daneshmand:2014, + author = {Hadi Daneshmand and + Manuel Gomez{-}Rodriguez and + Le Song and + Bernhard Sch{\"{o}}lkopf}, + title = {Estimating Diffusion Network Structures: Recovery Conditions, Sample + Complexity {\&} Soft-thresholding Algorithm}, + booktitle = {ICML}, + year = {2014}, + timestamp = {Fri, 07 Nov 2014 20:42:30 +0100}, + bibsource = {dblp computer science bibliography, http://dblp.org} +} + +@inproceedings{Kempe:03, + author = {David Kempe and + Jon M. Kleinberg and + {\'{E}}va Tardos}, + title = {Maximizing the spread of influence through a social network}, + booktitle = {KDD}, + year = {2003}, + timestamp = {Mon, 13 Feb 2006 15:34:20 +0100}, + bibsource = {dblp computer science bibliography, http://dblp.org} +} + +@inproceedings{Abrahao:13, + author = {Bruno D. Abrahao and + Flavio Chierichetti and + Robert Kleinberg and + Alessandro Panconesi}, + title = {Trace complexity of network inference}, + booktitle = {The 19th {ACM} {SIGKDD} International Conference on Knowledge Discovery + and Data Mining, {KDD} 2013, Chicago, IL, USA, August 11-14, 2013}, + pages = {491--499}, + year = {2013}, + timestamp = {Tue, 10 Sep 2013 10:11:57 +0200}, + bibsource = {dblp computer science bibliography, http://dblp.org} +} + + +@article{vandegeer:2009, +author = "van de Geer, Sara A. and B{\"u}hlmann, Peter", +fjournal = "Electronic Journal of Statistics", +journal = "Electron. J. Statist.", +pages = "1360--1392", +publisher = "The Institute of Mathematical Statistics and the Bernoulli Society", +title = "On the conditions used to prove oracle results for the Lasso", +volume = "3", +year = "2009" +} + +@article{vandegeer:2011, +author = "van de Geer, Sara and Bühlmann, Peter and Zhou, Shuheng", +fjournal = "Electronic Journal of Statistics", +journal = "Electron. J. Statist.", +pages = "688--749", +publisher = "The Institute of Mathematical Statistics and the Bernoulli Society", +title = "The adaptive and the thresholded Lasso for potentially misspecified + models (and a lower bound for the Lasso)", +volume = "5", +year = "2011" +} + +@article{Zou:2006, +author = {Zou, Hui}, +title = {The Adaptive Lasso and Its Oracle Properties}, +journal = {Journal of the American Statistical Association}, +volume = {101}, +number = {476}, +pages = {1418-1429}, +year = {2006}, +} + +@article{Jacques:2013, + author = {Laurent Jacques and + Jason N. Laska and + Petros T. Boufounos and + Richard G. Baraniuk}, + title = {Robust 1-Bit Compressive Sensing via Binary Stable Embeddings of + Sparse Vectors}, + journal = {{IEEE} Transactions on Information Theory}, + volume = {59}, + number = {4}, + pages = {2082--2102}, + year = {2013}, + timestamp = {Tue, 09 Apr 2013 19:57:48 +0200}, + bibsource = {dblp computer science bibliography, http://dblp.org} +} + +@inproceedings{Boufounos:2008, + author = {Petros Boufounos and + Richard G. Baraniuk}, + title = {1-Bit compressive sensing}, + booktitle = {42nd Annual Conference on Information Sciences and Systems, {CISS} + 2008, Princeton, NJ, USA, 19-21 March 2008}, + pages = {16--21}, + year = {2008}, + timestamp = {Wed, 15 Oct 2014 17:04:27 +0200}, + bibsource = {dblp computer science bibliography, http://dblp.org} +} + +@inproceedings{Gupta:2010, + author = {Ankit Gupta and + Robert Nowak and + Benjamin Recht}, + title = {Sample complexity for 1-bit compressed sensing and sparse + classification}, + booktitle = {{IEEE} International Symposium on Information Theory, {ISIT} 2010, + June 13-18, 2010, Austin, Texas, USA, Proceedings}, + pages = {1553--1557}, + year = {2010}, + timestamp = {Thu, 15 Jan 2015 17:11:50 +0100}, + bibsource = {dblp computer science bibliography, http://dblp.org} +} + +@article{Plan:2014, + author = {Yaniv Plan and + Roman Vershynin}, + title = {Dimension Reduction by Random Hyperplane Tessellations}, + journal = {Discrete {\&} Computational Geometry}, + volume = {51}, + number = {2}, + pages = {438--461}, + year = {2014}, + timestamp = {Tue, 11 Feb 2014 13:48:56 +0100}, + bibsource = {dblp computer science bibliography, http://dblp.org} +} + +@article{bickel:2009, +author = "Bickel, Peter J. and Ritov, Ya’acov and Tsybakov, Alexandre B.", +fjournal = "The Annals of Statistics", +journal = "Ann. Statist.", +month = "08", +number = "4", +pages = "1705--1732", +publisher = "The Institute of Mathematical Statistics", +title = "Simultaneous analysis of Lasso and Dantzig selector", +volume = "37", +year = "2009" +} + +@article{raskutti:10, + author = {Garvesh Raskutti and + Martin J. Wainwright and + Bin Yu}, + title = {Restricted Eigenvalue Properties for Correlated Gaussian Designs}, + journal = {Journal of Machine Learning Research}, + volume = {11}, + pages = {2241--2259}, + year = {2010}, + timestamp = {Wed, 15 Oct 2014 17:04:32 +0200}, + bibsource = {dblp computer science bibliography, http://dblp.org} +} + +@article{rudelson:13, + author = {Mark Rudelson and + Shuheng Zhou}, + title = {Reconstruction From Anisotropic Random Measurements}, + journal = {{IEEE} Transactions on Information Theory}, + volume = {59}, + number = {6}, + pages = {3434--3447}, + year = {2013}, + timestamp = {Tue, 21 May 2013 14:15:50 +0200}, + bibsource = {dblp computer science bibliography, http://dblp.org} +} + +@article{bipw11, + author = {Khanh Do Ba and + Piotr Indyk and + Eric Price and + David P. Woodruff}, + title = {Lower Bounds for Sparse Recovery}, + journal = {CoRR}, + volume = {abs/1106.0365}, + year = {2011}, + timestamp = {Mon, 05 Dec 2011 18:04:39 +0100}, + bibsource = {dblp computer science bibliography, http://dblp.org} +} + +@inproceedings{pw11, + author = {Eric Price and + David P. Woodruff}, + title = {{(1} + eps)-Approximate Sparse Recovery}, + booktitle = {{IEEE} 52nd Annual Symposium on Foundations of Computer Science, + {FOCS} 2011, Palm Springs, CA, USA, October 22-25, 2011}, + pages = {295--304}, + year = {2011}, + crossref = {DBLP:conf/focs/2011}, + timestamp = {Tue, 16 Dec 2014 09:57:24 +0100}, + bibsource = {dblp computer science bibliography, http://dblp.org} +} + +@proceedings{DBLP:conf/focs/2011, + editor = {Rafail Ostrovsky}, + title = {{IEEE} 52nd Annual Symposium on Foundations of Computer Science, {FOCS} + 2011, Palm Springs, CA, USA, October 22-25, 2011}, + publisher = {{IEEE} Computer Society}, + year = {2011}, + isbn = {978-1-4577-1843-4}, + timestamp = {Mon, 15 Dec 2014 18:48:45 +0100}, + bibsource = {dblp computer science bibliography, http://dblp.org} +} + +@inproceedings{pw12, + author = {Eric Price and + David P. Woodruff}, + title = {Applications of the Shannon-Hartley theorem to data streams and + sparse recovery}, + booktitle = {Proceedings of the 2012 {IEEE} International Symposium on Information + Theory, {ISIT} 2012, Cambridge, MA, USA, July 1-6, 2012}, + pages = {2446--2450}, + year = {2012}, + crossref = {DBLP:conf/isit/2012}, + timestamp = {Mon, 01 Oct 2012 17:34:07 +0200}, + bibsource = {dblp computer science bibliography, http://dblp.org} +} + +@proceedings{DBLP:conf/isit/2012, + title = {Proceedings of the 2012 {IEEE} International Symposium on Information + Theory, {ISIT} 2012, Cambridge, MA, USA, July 1-6, 2012}, + publisher = {{IEEE}}, + year = {2012}, + isbn = {978-1-4673-2580-6}, + timestamp = {Mon, 01 Oct 2012 17:33:45 +0200}, + bibsource = {dblp computer science bibliography, http://dblp.org} +} + +@article{Leskovec:2010, + author = {Jure Leskovec and + Deepayan Chakrabarti and + Jon M. Kleinberg and + Christos Faloutsos and + Zoubin Ghahramani}, + title = {Kronecker Graphs: An Approach to Modeling Networks}, + journal = {Journal of Machine Learning Research}, + volume = {11}, + pages = {985--1042}, + year = {2010}, + timestamp = {Thu, 22 Apr 2010 13:26:26 +0200}, + bibsource = {dblp computer science bibliography, http://dblp.org} +} + +@article{Holme:2002, + author= {Petter Holme and Beom Jun Kim}, + title = {Growing scale-free networks with tunable clustering}, + journal = {Physical review E}, + volume = {65}, + issue = {2}, + pages = {026--107}, + year = {2002} +} + + +@article{watts:1998, + Annote = {10.1038/30918}, + Author = {Watts, Duncan J. and Strogatz, Steven H.}, + Date = {1998/06/04/print}, + Isbn = {0028-0836}, + Journal = {Nature}, + Number = {6684}, + Pages = {440--442}, + Read = {0}, + Title = {Collective dynamics of `small-world' networks}, + Volume = {393}, + Year = {1998}, +} + +@article{barabasi:2001, + author = {R{\'{e}}ka Albert and + Albert{-}L{\'{a}}szl{\'{o}} Barab{\'{a}}si}, + title = {Statistical mechanics of complex networks}, + journal = {CoRR}, + volume = {cond-mat/0106096}, + year = {2001}, + timestamp = {Mon, 05 Dec 2011 18:05:15 +0100}, + bibsource = {dblp computer science bibliography, http://dblp.org} +} + + +@article{gomezbalduzzi:2011, + author = {Manuel Gomez{-}Rodriguez and + David Balduzzi and + Bernhard Sch{\"{o}}lkopf}, + title = {Uncovering the Temporal Dynamics of Diffusion Networks}, + journal = {CoRR}, + volume = {abs/1105.0697}, + year = {2011}, + timestamp = {Mon, 05 Dec 2011 18:05:23 +0100}, + bibsource = {dblp computer science bibliography, http://dblp.org} +} + +@article{Nowell08, + author = {Liben-Nowell, David and Kleinberg, Jon}, + eprint = {http://www.pnas.org/content/105/12/4633.full.pdf+html}, + journal = {Proceedings of the National Academy of Sciences}, + keywords = {SNA graph networks}, + number = 12, + pages = {4633-4638}, + timestamp = {2008-10-09T10:32:56.000+0200}, + title = {{Tracing information flow on a global scale using Internet chain-letter data}}, + volume = 105, + year = 2008 +} + +@inproceedings{Leskovec07, + author = {Jure Leskovec and + Mary McGlohon and + Christos Faloutsos and + Natalie S. Glance and + Matthew Hurst}, + title = {Patterns of Cascading Behavior in Large Blog Graphs}, + booktitle = {Proceedings of the Seventh {SIAM} International Conference on Data + Mining, April 26-28, 2007, Minneapolis, Minnesota, {USA}}, + pages = {551--556}, + year = {2007}, + timestamp = {Wed, 12 Feb 2014 17:08:15 +0100}, + bibsource = {dblp computer science bibliography, http://dblp.org} +} + + +@inproceedings{AdarA05, + author = {Eytan Adar and + Lada A. Adamic}, + title = {Tracking Information Epidemics in Blogspace}, + booktitle = {2005 {IEEE} / {WIC} / {ACM} International Conference on Web Intelligence + {(WI} 2005), 19-22 September 2005, Compiegne, France}, + pages = {207--214}, + year = {2005}, + timestamp = {Tue, 12 Aug 2014 16:59:16 +0200}, + bibsource = {dblp computer science bibliography, http://dblp.org} +} + +@inproceedings{Kleinberg:00, + author = {Jon M. Kleinberg}, + title = {The small-world phenomenon: an algorithm perspective}, + booktitle = {Proceedings of the Thirty-Second Annual {ACM} Symposium on Theory + of Computing, May 21-23, 2000, Portland, OR, {USA}}, + pages = {163--170}, + year = {2000}, + timestamp = {Thu, 16 Feb 2012 12:06:08 +0100}, + bibsource = {dblp computer science bibliography, http://dblp.org} +} + +@article{zhang2014, + title={Confidence intervals for low dimensional parameters in high dimensional + linear models}, + author={Zhang, Cun-Hui and Zhang, Stephanie S}, + journal={Journal of the Royal Statistical Society: Series B (Statistical + Methodology)}, + volume={76}, + number={1}, + pages={217--242}, + year={2014}, + publisher={Wiley Online Library} +} + +@article{javanmard2014, + title={Confidence intervals and hypothesis testing for high-dimensional + regression}, + author={Javanmard, Adel and Montanari, Andrea}, + journal={The Journal of Machine Learning Research}, + volume={15}, + number={1}, + pages={2869--2909}, + year={2014}, + publisher={JMLR. org} +} + +@article{donoho2006compressed, + title={Compressed sensing}, + author={Donoho, David L}, + journal={Information Theory, IEEE Transactions on}, + volume={52}, + number={4}, + pages={1289--1306}, + year={2006}, + publisher={IEEE} +} + +@article{candes2006near, + title={Near-optimal signal recovery from random projections: Universal + encoding strategies?}, + author={Candes, Emmanuel J and Tao, Terence}, + journal={Information Theory, IEEE Transactions on}, + volume={52}, + number={12}, + pages={5406--5425}, + year={2006}, + publisher={IEEE} +} + +@article{bickel2009simultaneous, + title={Simultaneous analysis of Lasso and Dantzig selector}, + author={Bickel, Peter J and Ritov, Ya'acov and Tsybakov, Alexandre B}, + journal={The Annals of Statistics}, + pages={1705--1732}, + year={2009}, + publisher={JSTOR} +} + +@article{bagnoli2005log, + title={Log-concave probability and its applications}, + author={Bagnoli, Mark and Bergstrom, Ted}, + journal={Economic theory}, + volume={26}, + number={2}, + pages={445--469}, + year={2005}, + publisher={Springer} +} + + |
